From wk at gnupg.org Thu Dec 1 09:46:30 2016
From: wk at gnupg.org (Werner Koch)
Date: Thu, 01 Dec 2016 09:46:30 +0100
Subject: Howto implement chacha20-poly1305?
In-Reply-To: (Jussi Kivilinna's message of "Wed, 30 Nov 2016 21:53:57 +0200")
References: <87mvgh56re.fsf@wheatstone.g10code.de>
Message-ID: <87mvgg2g0p.fsf@wheatstone.g10code.de>

On Wed, 30 Nov 2016 20:53, jussi.kivilinna at iki.fi said:

> I was thinking of same too. I can do it. Draft mode selection would
> happen with new gcry_cipher_open flag, maybe GCRY_CIPHER_POLY1305_DRAFT
> or GCRY_CIPHER_POLY1305_OPENSSH.

Both make sense - maybe Openssh is the more descriptive one. I don't
really care.

Stef: Can you help Jussi with testing?

Shalom-Salam,

   Werner

-- 
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.

From stefbon at gmail.com Thu Dec 1 13:09:59 2016
From: stefbon at gmail.com (Stef Bon)
Date: Thu, 1 Dec 2016 13:09:59 +0100
Subject: Howto implement chacha20-poly1305?
In-Reply-To: <87mvgg2g0p.fsf@wheatstone.g10code.de>
References: <87mvgh56re.fsf@wheatstone.g10code.de> <87mvgg2g0p.fsf@wheatstone.g10code.de>
Message-ID: 

2016-12-01 9:46 GMT+01:00 Werner Koch :
>
> Both make sense - maybe Openssh is the more descriptive one. I don't
> really care.
>
> Stef: Can you help Jussi with testing?
>

Sure. I have to get the latest version via git on my system (archlinux) for that. I will first look into how to do this without breaking dependencies. Just let me know when things are ready to test.
Thanks again,

Stef

From smueller at chronox.de Thu Dec 1 17:11:42 2016
From: smueller at chronox.de (Stephan Mueller)
Date: Thu, 01 Dec 2016 17:11:42 +0100
Subject: [PATCH 2/3] API for reading the counter in CTR mode
In-Reply-To: <1717062.C9GP0V9ulP@positron.chronox.de>
References: <1717062.C9GP0V9ulP@positron.chronox.de>
Message-ID: <3362765.E979uZx5Qz@positron.chronox.de>

The API call allows reading the current counter of the CTR mode. The API remains internal to libgcrypt and is not exported to external callers.

Signed-off-by: Stephan Mueller
---
 cipher/cipher.c  | 10 ++++++++++
 src/gcrypt-int.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/cipher/cipher.c b/cipher/cipher.c
index ff3340f..55853da 100644
--- a/cipher/cipher.c
+++ b/cipher/cipher.c
@@ -1117,6 +1117,16 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen)
   return 0;
 }
 
+gpg_err_code_t
+_gcry_cipher_getctr (gcry_cipher_hd_t hd, void *ctr, size_t ctrlen)
+{
+  if (ctr && ctrlen == hd->spec->blocksize)
+    memcpy (ctr, hd->u_ctr.ctr, hd->spec->blocksize);
+  else
+    return GPG_ERR_INV_ARG;
+
+  return 0;
+}
 
 gcry_err_code_t
 _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf,
diff --git a/src/gcrypt-int.h b/src/gcrypt-int.h
index 729f54a..ef5337b 100644
--- a/src/gcrypt-int.h
+++ b/src/gcrypt-int.h
@@ -77,6 +77,8 @@ gpg_err_code_t _gcry_cipher_checktag (gcry_cipher_hd_t hd,
                                       const void *intag, size_t taglen);
 gpg_err_code_t _gcry_cipher_setctr (gcry_cipher_hd_t hd,
                                     const void *ctr, size_t ctrlen);
+gpg_err_code_t _gcry_cipher_getctr (gcry_cipher_hd_t hd,
+                                    void *ctr, size_t ctrlen);
 size_t _gcry_cipher_get_algo_keylen (int algo);
 size_t _gcry_cipher_get_algo_blklen (int algo);
 
-- 
2.9.3

From smueller at chronox.de Thu Dec 1 17:10:43 2016
From: smueller at chronox.de (Stephan Mueller)
Date: Thu, 01 Dec 2016 17:10:43 +0100
Subject: [PATCH 0/3] DRBG: performance improvements
Message-ID: <1717062.C9GP0V9ulP@positron.chronox.de>

Hi,

The attached patches increase the
performance of the DRBG significantly. In addition, the DRBG implementation is now on par with the Linux kernel crypto API DRBG.

The changes are fully CAVS tested.

Stephan Mueller (3):
  DRBG: remove stale comment
  API for reading the counter in CTR mode
  DRBG: add performance improvements

 cipher/cipher.c      |  10 ++
 random/random-drbg.c | 337 ++++++++++++++++++++++++++++++++++-----------------
 src/gcrypt-int.h     |   2 +
 3 files changed, 239 insertions(+), 110 deletions(-)

-- 
2.9.3

From smueller at chronox.de Thu Dec 1 17:11:11 2016
From: smueller at chronox.de (Stephan Mueller)
Date: Thu, 01 Dec 2016 17:11:11 +0100
Subject: [PATCH 1/3] DRBG: remove stale comment
In-Reply-To: <1717062.C9GP0V9ulP@positron.chronox.de>
References: <1717062.C9GP0V9ulP@positron.chronox.de>
Message-ID: <1772212.2X0tj64x1m@positron.chronox.de>

>From 843aa9f9667c1bcc800599325ebdea8a5e6dcbea Mon Sep 17 00:00:00 2001
From: Stephan Mueller
Date: Sun, 27 Nov 2016 10:14:21 +0100
Subject: Remove comment that is not applicable any more.
Signed-off-by: Stephan Mueller
---
 random/random-drbg.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/random/random-drbg.c b/random/random-drbg.c
index f9d11a3..9676f0e 100644
--- a/random/random-drbg.c
+++ b/random/random-drbg.c
@@ -899,8 +899,6 @@ drbg_ctr_update (drbg_state_t drbg, drbg_string_t *addtl, int reseed)
     memset (df_data, 0, drbg_statelen (drbg));
 
   /* 10.2.1.3.2 step 2 and 10.2.1.4.2 step 2 */
-  /* TODO use reseed variable to avoid re-doing DF operation */
-  (void) reseed;
   if (addtl && 0 < addtl->len)
     {
       ret =
-- 
2.9.3

From smueller at chronox.de Thu Dec 1 17:15:10 2016
From: smueller at chronox.de (Stephan Mueller)
Date: Thu, 01 Dec 2016 17:15:10 +0100
Subject: [PATCH 3/3] DRBG: add performance improvements
In-Reply-To: <1717062.C9GP0V9ulP@positron.chronox.de>
References: <1717062.C9GP0V9ulP@positron.chronox.de>
Message-ID: <10972849.tkDtSJjrMa@positron.chronox.de>

The performance improvements can be categorized as follows:

* Initialize the cipher handle of the backend ciphers once and re-use them for subsequent cipher invocations.

* Limit the invocation of setkey to the cases when the key is newly created.

* Use the AES CTR mode and rip out the counter maintenance in the DRBG code. This allows the use of accelerated CTR AES implementations. To use the CTR AES mode, a NULL buffer is created that is used as the "plaintext" to the CTR mode, because the DRBG CTR AES operation is the result of the encryption of the CTR (i.e. the NULL buffer makes the final XOR of the CTR AES mode a noop).

The following timing measurements were made. The measurements do not use a precise timing operation and should rather serve as a general hint to the performance improvements.
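
To illustrate the NULL buffer point above, here is a minimal sketch. It is not libgcrypt code: toy_block_encrypt is a made-up stand-in for the block cipher (not AES and not secure), and toy_ctr_crypt and ctr_zero_input_yields_keystream are hypothetical helper names. The sketch only shows that running CTR mode over an all-zero input yields exactly the encrypted counter blocks, because the final XOR with zero changes nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16  /* block size in bytes, the same as AES */

/* Hypothetical stand-in for a block cipher: NOT AES and NOT secure.
   It only has to be a deterministic function of key and input. */
static void
toy_block_encrypt (const uint8_t key[BLK], const uint8_t in[BLK],
                   uint8_t out[BLK])
{
  for (size_t i = 0; i < BLK; i++)
    out[i] = (uint8_t) ((in[i] ^ key[i]) * 167u + key[(i + 1) % BLK] + i);
}

/* Increment a big-endian counter block by one, with carry. */
static void
ctr_increment (uint8_t ctr[BLK])
{
  for (int i = BLK - 1; i >= 0; i--)
    if (++ctr[i] != 0)
      break;
}

/* Generic CTR mode: out = in XOR E(key, ctr), E(key, ctr + 1), ...
   The counter is advanced after each block is encrypted. */
static void
toy_ctr_crypt (const uint8_t key[BLK], uint8_t ctr[BLK],
               const uint8_t *in, uint8_t *out, size_t len)
{
  uint8_t ks[BLK];

  assert (len % BLK == 0);  /* the sketch handles whole blocks only */
  for (size_t off = 0; off < len; off += BLK)
    {
      toy_block_encrypt (key, ctr, ks);
      for (size_t i = 0; i < BLK; i++)
        out[off + i] = in[off + i] ^ ks[i];
      ctr_increment (ctr);
    }
}

/* Returns 1 if CTR mode over an all-zero input produces the same bytes
   as encrypting the successive counter blocks directly. */
int
ctr_zero_input_yields_keystream (void)
{
  uint8_t key[BLK], ctr_a[BLK], ctr_b[BLK];
  uint8_t zeros[4 * BLK] = { 0 };
  uint8_t via_ctr[4 * BLK], direct[4 * BLK];

  for (size_t i = 0; i < BLK; i++)
    {
      key[i] = (uint8_t) (i * 7 + 1);
      ctr_a[i] = 0xff;          /* forces a carry on the first increment */
    }
  memcpy (ctr_b, ctr_a, BLK);

  toy_ctr_crypt (key, ctr_a, zeros, via_ctr, sizeof zeros);

  for (size_t off = 0; off < sizeof direct; off += BLK)
    {
      toy_block_encrypt (key, ctr_b, direct + off);
      ctr_increment (ctr_b);
    }

  return memcmp (via_ctr, direct, sizeof direct) == 0
         && memcmp (ctr_a, ctr_b, BLK) == 0;
}
```

Calling ctr_zero_input_yields_keystream () from a test program should return 1. With a real cipher the same property is what lets an accelerated CTR implementation produce the DRBG output blocks directly.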
On a Broadwell i7 CPU:

block size          4096     1024      128       32       16
aes256 old        28MB/s   27MB/s   19MB/s   11MB/s    6MB/s
aes128 old        29MB/s   32MB/s   23MB/s   15MB/s    9MB/s
sha256 old        48MB/s   48MB/s   33MB/s   16MB/s    8MB/s
hmac sha256 old   15MB/s   15MB/s   10MB/s    5MB/s    2MB/s

aes256 new       180MB/s  169MB/s   93MB/s   37MB/s   20MB/s
aes128 new       240MB/s  221MB/s  125MB/s   51MB/s   27MB/s
sha256 new        75MB/s   69MB/s   48MB/s   23MB/s   11MB/s
hmac sha256 new   37MB/s   34MB/s   21MB/s    8MB/s    4MB/s

Signed-off-by: Stephan Mueller
---
 random/random-drbg.c | 335 ++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 227 insertions(+), 108 deletions(-)

diff --git a/random/random-drbg.c b/random/random-drbg.c
index 9676f0e..dc8e8f3 100644
--- a/random/random-drbg.c
+++ b/random/random-drbg.c
@@ -289,6 +289,8 @@ struct drbg_state_ops_s
   gpg_err_code_t (*generate) (drbg_state_t drbg,
                               unsigned char *buf, unsigned int buflen,
                               drbg_string_t *addtl);
+  gpg_err_code_t (*crypto_init) (drbg_state_t drbg);
+  void (*crypto_fini) (drbg_state_t drbg);
 };
 
 struct drbg_test_data_s
@@ -309,6 +311,10 @@ struct drbg_state_s
                                  * 10.1.1.1 1c) */
   unsigned char *scratchpad;    /* some memory the DRBG can use for its
                                  * operation -- allocated during init */
+  void *priv_data;              /* Cipher handle */
+  gcry_cipher_hd_t ctr_handle;  /* CTR mode cipher handle */
+#define DRBG_CTR_NULL_LEN 128
+  unsigned char *ctr_null;      /* CTR mode zero buffer */
   int seeded:1;                 /* DRBG fully seeded? */
   int pr:1;                     /* Prediction resistance enabled?
*/ /* Taken from libgcrypt ANSI X9.31 DRNG: We need to keep track of the @@ -363,14 +369,23 @@ static const struct drbg_core_s drbg_cores[] = { {DRBG_CTRAES | DRBG_SYM256, 48, 16, GCRY_CIPHER_AES256} }; -static gpg_err_code_t drbg_sym (drbg_state_t drbg, - const unsigned char *key, - unsigned char *outval, - const drbg_string_t *buf); -static gpg_err_code_t drbg_hmac (drbg_state_t drbg, - const unsigned char *key, +static gpg_err_code_t drbg_hash_init (drbg_state_t drbg); +static gpg_err_code_t drbg_hmac_init (drbg_state_t drbg); +static gpg_err_code_t drbg_hmac_setkey (drbg_state_t drbg, + const unsigned char *key); +static void drbg_hash_fini (drbg_state_t drbg); +static gpg_err_code_t drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf); +static gpg_err_code_t drbg_sym_init (drbg_state_t drbg); +static void drbg_sym_fini (drbg_state_t drbg); +static gpg_err_code_t drbg_sym_setkey (drbg_state_t drbg, + const unsigned char *key); +static gpg_err_code_t drbg_sym (drbg_state_t drbg, unsigned char *outval, + const drbg_string_t *buf); +static gpg_err_code_t drbg_sym_ctr (drbg_state_t drbg, + const unsigned char *inbuf, unsigned int inbuflen, + unsigned char *outbuf, unsigned int outbuflen); /****************************************************************** ****************************************************************** @@ -666,6 +681,10 @@ drbg_ctr_bcc (drbg_state_t drbg, /* 10.4.3 step 1 */ memset (out, 0, drbg_blocklen (drbg)); + ret = drbg_sym_setkey(drbg, key); + if (ret) + return ret; + /* 10.4.3 step 2 / 4 */ while (inpos) { @@ -698,7 +717,7 @@ drbg_ctr_bcc (drbg_state_t drbg, } } /* 10.4.3 step 4.2 */ - ret = drbg_sym (drbg, key, out, &data); + ret = drbg_sym (drbg, out, &data); if (ret) return ret; /* 10.4.3 step 2 */ @@ -839,6 +858,9 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* 10.4.2 step 12: overwriting of outval */ /* 10.4.2 step 13 */ + ret = drbg_sym_setkey(drbg, temp); + if (ret) + goto out; while 
(generated_len < bytes_to_return) { short blocklen = 0; @@ -846,11 +868,10 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* the truncation of the key length is implicit as the key * is only drbg_blocklen in size -- check for the implementation * of the cipher function callback */ - ret = drbg_sym (drbg, temp, X, &cipherin); + ret = drbg_sym (drbg, X, &cipherin); if (ret) goto out; - blocklen = (drbg_blocklen (drbg) < - (bytes_to_return - generated_len)) ? + blocklen = (drbg_blocklen (drbg) < (bytes_to_return - generated_len)) ? drbg_blocklen (drbg) : (bytes_to_return - generated_len); /* 10.4.2 step 13.2 and 14 */ memcpy (df_data + generated_len, X, blocklen); @@ -889,54 +910,51 @@ drbg_ctr_update (drbg_state_t drbg, drbg_string_t *addtl, int reseed) unsigned char *temp = drbg->scratchpad; unsigned char *df_data = drbg->scratchpad + drbg_statelen (drbg) + drbg_blocklen (drbg); - unsigned char *temp_p, *df_data_p; /* pointer to iterate over buffers */ - unsigned int len = 0; - drbg_string_t cipherin; unsigned char prefix = DRBG_PREFIX1; memset (temp, 0, drbg_statelen (drbg) + drbg_blocklen (drbg)); if (3 > reseed) memset (df_data, 0, drbg_statelen (drbg)); - /* 10.2.1.3.2 step 2 and 10.2.1.4.2 step 2 */ - if (addtl && 0 < addtl->len) + if (!reseed) { - ret = - drbg_ctr_df (drbg, df_data, drbg_statelen (drbg), addtl); + /* + * The DRBG uses the CTR mode of the underlying AES cipher. The + * CTR mode increments the counter value after the AES operation + * but SP800-90A requires that the counter is incremented before + * the AES operation. Hence, we increment it at the time we set + * it by one. 
+ */ + drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); + + ret = _gcry_cipher_setkey (drbg->ctr_handle, drbg->C, drbg_keylen (drbg)); if (ret) - goto out; + goto out; } - drbg_string_fill (&cipherin, drbg->V, drbg_blocklen (drbg)); - /* 10.2.1.3.2 step 2 and 3 -- are already covered as we memset(0) - * all memory during initialization */ - while (len < (drbg_statelen (drbg))) + /* 10.2.1.3.2 step 2 and 10.2.1.4.2 step 2 */ + if (addtl && 0 < addtl->len) { - /* 10.2.1.2 step 2.1 */ - drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); - /* 10.2.1.2 step 2.2 */ - /* using target of temp + len: 10.2.1.2 step 2.3 and 3 */ - ret = drbg_sym (drbg, drbg->C, temp + len, &cipherin); + ret = + drbg_ctr_df (drbg, df_data, drbg_statelen (drbg), addtl); if (ret) goto out; - /* 10.2.1.2 step 2.3 and 3 */ - len += drbg_blocklen (drbg); } - /* 10.2.1.2 step 4 */ - temp_p = temp; - df_data_p = df_data; - for (len = 0; len < drbg_statelen (drbg); len++) - { - *temp_p ^= *df_data_p; - df_data_p++; - temp_p++; - } + ret = drbg_sym_ctr (drbg, df_data, drbg_statelen(drbg), + temp, drbg_statelen(drbg)); + if (ret) + goto out; /* 10.2.1.2 step 5 */ - memcpy (drbg->C, temp, drbg_keylen (drbg)); + ret = _gcry_cipher_setkey (drbg->ctr_handle, temp, drbg_keylen (drbg)); + if (ret) + goto out; + /* 10.2.1.2 step 6 */ memcpy (drbg->V, temp + drbg_keylen (drbg), drbg_blocklen (drbg)); + /* See above: increment counter by one to compensate timing of CTR op */ + drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); ret = 0; out: @@ -957,9 +975,6 @@ drbg_ctr_generate (drbg_state_t drbg, drbg_string_t *addtl) { gpg_err_code_t ret = 0; - unsigned int len = 0; - drbg_string_t data; - unsigned char prefix = DRBG_PREFIX1; memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); @@ -973,24 +988,9 @@ drbg_ctr_generate (drbg_state_t drbg, } /* 10.2.1.5.2 step 4.1 */ - drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); - drbg_string_fill (&data, drbg->V, drbg_blocklen (drbg)); - while 
(len < buflen) - { - unsigned int outlen = 0; - /* 10.2.1.5.2 step 4.2 */ - ret = drbg_sym (drbg, drbg->C, drbg->scratchpad, &data); - if (ret) - goto out; - outlen = (drbg_blocklen (drbg) < (buflen - len)) ? - drbg_blocklen (drbg) : (buflen - len); - /* 10.2.1.5.2 step 4.3 */ - memcpy (buf + len, drbg->scratchpad, outlen); - len += outlen; - /* 10.2.1.5.2 step 6 */ - if (len < buflen) - drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); - } + ret = drbg_sym_ctr (drbg, drbg->ctr_null, DRBG_CTR_NULL_LEN, buf, buflen); + if (ret) + goto out; /* 10.2.1.5.2 step 6 */ if (addtl) @@ -998,13 +998,14 @@ drbg_ctr_generate (drbg_state_t drbg, ret = drbg_ctr_update (drbg, addtl, 3); out: - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); return ret; } static struct drbg_state_ops_s drbg_ctr_ops = { drbg_ctr_update, - drbg_ctr_generate + drbg_ctr_generate, + drbg_sym_init, + drbg_sym_fini, }; /****************************************************************** @@ -1023,6 +1024,9 @@ drbg_hmac_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) /* 10.1.2.3 step 2 already implicitly covered with * the initial memset(0) of drbg->C */ memset (drbg->V, 1, drbg_statelen (drbg)); + ret = drbg_hmac_setkey (drbg, drbg->C); + if (ret) + return ret; } /* build linked list which implements the concatenation and fill @@ -1044,12 +1048,16 @@ drbg_hmac_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) prefix = DRBG_PREFIX1; /* 10.1.2.2 step 1 and 4 -- concatenation and HMAC for key */ seed2.buf = &prefix; - ret = drbg_hmac (drbg, drbg->C, drbg->C, &seed1); + ret = drbg_hash (drbg, drbg->C, &seed1); + if (ret) + return ret; + + ret = drbg_hmac_setkey (drbg, drbg->C); if (ret) return ret; /* 10.1.2.2 step 2 and 5 -- HMAC for V */ - ret = drbg_hmac (drbg, drbg->C, drbg->V, &cipherin); + ret = drbg_hash (drbg, drbg->V, &cipherin); if (ret) return ret; @@ -1083,7 +1091,7 @@ drbg_hmac_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, { unsigned int 
outlen = 0; /* 10.1.2.5 step 4.1 */ - ret = drbg_hmac (drbg, drbg->C, drbg->V, &data); + ret = drbg_hash (drbg, drbg->V, &data); if (ret) return ret; outlen = (drbg_blocklen (drbg) < (buflen - len)) ? @@ -1104,7 +1112,9 @@ drbg_hmac_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, static struct drbg_state_ops_s drbg_hmac_ops = { drbg_hmac_update, - drbg_hmac_generate + drbg_hmac_generate, + drbg_hmac_init, + drbg_hash_fini, }; /****************************************************************** @@ -1148,7 +1158,7 @@ drbg_hash_df (drbg_state_t drbg, { short blocklen = 0; /* 10.4.1 step 4.1 */ - ret = drbg_hmac (drbg, NULL, tmp, &data1); + ret = drbg_hash (drbg, tmp, &data1); if (ret) goto out; /* 10.4.1 step 4.2 */ @@ -1237,13 +1247,13 @@ drbg_hash_process_addtl (drbg_state_t drbg, drbg_string_t *addtl) data2.next = data3; data3->next = NULL; /* 10.1.1.4 step 2a -- cipher invocation */ - ret = drbg_hmac (drbg, NULL, drbg->scratchpad, &data1); + ret = drbg_hash (drbg, drbg->scratchpad, &data1); if (ret) goto out; /* 10.1.1.4 step 2b */ drbg_add_buf (drbg->V, drbg_statelen (drbg), - drbg->scratchpad, drbg_blocklen (drbg)); + drbg->scratchpad, drbg_blocklen (drbg)); out: memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); @@ -1276,7 +1286,7 @@ drbg_hash_hashgen (drbg_state_t drbg, { unsigned int outlen = 0; /* 10.1.1.4 step hashgen 4.1 */ - ret = drbg_hmac (drbg, NULL, dst, &data); + ret = drbg_hash (drbg, dst, &data); if (ret) goto out; outlen = (drbg_blocklen (drbg) < (buflen - len)) ? 
@@ -1330,7 +1340,7 @@ drbg_hash_generate (drbg_state_t drbg, drbg_string_fill (&data1, &prefix, 1); drbg_string_fill (&data2, drbg->V, drbg_statelen (drbg)); data1.next = &data2; - ret = drbg_hmac (drbg, NULL, drbg->scratchpad, &data1); + ret = drbg_hash (drbg, drbg->scratchpad, &data1); if (ret) goto out; @@ -1354,7 +1364,9 @@ drbg_hash_generate (drbg_state_t drbg, */ static struct drbg_state_ops_s drbg_hash_ops = { drbg_hash_update, - drbg_hash_generate + drbg_hash_generate, + drbg_hash_init, + drbg_hash_fini, }; /****************************************************************** @@ -1599,6 +1611,7 @@ drbg_uninstantiate (drbg_state_t drbg) { if (!drbg) return GPG_ERR_INV_ARG; + drbg->d_ops->crypto_fini(drbg); xfree (drbg->V); drbg->V = NULL; xfree (drbg->C); @@ -1666,13 +1679,16 @@ drbg_instantiate (drbg_state_t drbg, /* 9.1 step 4 is implicit in drbg_sec_strength */ - /* no allocation of drbg as this is done by the kernel crypto API */ + ret = drbg->d_ops->crypto_init(drbg); + if (ret) + goto err; + drbg->V = xcalloc_secure (1, drbg_statelen (drbg)); if (!drbg->V) - goto err; + goto fini; drbg->C = xcalloc_secure (1, drbg_statelen (drbg)); if (!drbg->C) - goto err; + goto fini; /* scratchpad is only generated for CTR and Hash */ if (drbg->core->flags & DRBG_HMAC) sb_size = 0; @@ -1689,19 +1705,21 @@ drbg_instantiate (drbg_state_t drbg, { drbg->scratchpad = xcalloc_secure (1, sb_size); if (!drbg->scratchpad) - goto err; + goto fini; } dbg (("DRBG: state allocated with scratchpad size %u bytes\n", sb_size)); /* 9.1 step 6 through 11 */ ret = drbg_seed (drbg, pers, 0); if (ret) - goto err; + goto fini; dbg (("DRBG: core %d %s prediction resistance successfully initialized\n", coreref, pr ? 
"with" : "without")); return 0; + fini: + drbg->d_ops->crypto_fini(drbg); err: drbg_uninstantiate (drbg); return ret; @@ -2563,59 +2581,160 @@ _gcry_rngdrbg_selftest (selftest_report_func_t report) ***************************************************************/ static gpg_err_code_t -drbg_hmac (drbg_state_t drbg, const unsigned char *key, - unsigned char *outval, const drbg_string_t *buf) +drbg_hash_init (drbg_state_t drbg) { + gcry_md_hd_t hd; gpg_error_t err; + + err = _gcry_md_open (&hd, drbg->core->backend_cipher, 0); + if (err) + return err; + + drbg->priv_data = hd; + + return 0; +} + +static gpg_err_code_t +drbg_hmac_init (drbg_state_t drbg) +{ gcry_md_hd_t hd; + gpg_error_t err; - if (key) - { - err = - _gcry_md_open (&hd, drbg->core->backend_cipher, GCRY_MD_FLAG_HMAC); - if (err) - return err; - err = _gcry_md_setkey (hd, key, drbg_statelen (drbg)); - if (err) - return err; - } - else - { - err = _gcry_md_open (&hd, drbg->core->backend_cipher, 0); - if (err) - return err; - } + err = _gcry_md_open (&hd, drbg->core->backend_cipher, GCRY_MD_FLAG_HMAC); + if (err) + return err; + + drbg->priv_data = hd; + + return 0; +} + +static gpg_err_code_t +drbg_hmac_setkey (drbg_state_t drbg, const unsigned char *key) +{ + gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; + + return _gcry_md_setkey (hd, key, drbg_statelen (drbg)); +} + +static void +drbg_hash_fini (drbg_state_t drbg) +{ + gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; + + _gcry_md_close (hd); +} + +static gpg_err_code_t +drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) +{ + gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; + + _gcry_md_reset(hd); for (; NULL != buf; buf = buf->next) _gcry_md_write (hd, buf->buf, buf->len); _gcry_md_final (hd); memcpy (outval, _gcry_md_read (hd, drbg->core->backend_cipher), drbg_blocklen (drbg)); - _gcry_md_close (hd); return 0; } +static void +drbg_sym_fini (drbg_state_t drbg) +{ + gcry_cipher_hd_t hd = (gcry_cipher_hd_t)drbg->priv_data; + 
+ if (hd) + _gcry_cipher_close (hd); + if (drbg->ctr_handle) + _gcry_cipher_close (drbg->ctr_handle); + if (drbg->ctr_null) + free(drbg->ctr_null); +} + static gpg_err_code_t -drbg_sym (drbg_state_t drbg, const unsigned char *key, - unsigned char *outval, const drbg_string_t *buf) +drbg_sym_init (drbg_state_t drbg) { - gpg_error_t err; gcry_cipher_hd_t hd; + gpg_error_t err; + + drbg->ctr_null = calloc(1, DRBG_CTR_NULL_LEN); + if (!drbg->ctr_null) + return GPG_ERR_ENOMEM; err = _gcry_cipher_open (&hd, drbg->core->backend_cipher, - GCRY_CIPHER_MODE_ECB, 0); + GCRY_CIPHER_MODE_ECB, 0); if (err) - return err; + { + drbg_sym_fini (drbg); + return err; + } + drbg->priv_data = hd; + + err = _gcry_cipher_open (&drbg->ctr_handle, drbg->core->backend_cipher, + GCRY_CIPHER_MODE_CTR, 0); + if (err) + { + drbg_sym_fini (drbg); + return err; + } + + if (drbg_blocklen (drbg) != _gcry_cipher_get_algo_blklen (drbg->core->backend_cipher)) - return -GPG_ERR_NO_ERROR; + { + drbg_sym_fini (drbg); + return -GPG_ERR_NO_ERROR; + } + + return 0; +} + +static gpg_err_code_t +drbg_sym_setkey (drbg_state_t drbg, const unsigned char *key) +{ + gcry_cipher_hd_t hd = (gcry_cipher_hd_t)drbg->priv_data; + + return _gcry_cipher_setkey (hd, key, drbg_keylen (drbg)); +} + +static gpg_err_code_t +drbg_sym (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) +{ + gcry_cipher_hd_t hd = (gcry_cipher_hd_t)drbg->priv_data; + + _gcry_cipher_reset(hd); if (drbg_blocklen (drbg) < buf->len) return -GPG_ERR_NO_ERROR; - err = _gcry_cipher_setkey (hd, key, drbg_keylen (drbg)); + /* in is only component */ + return _gcry_cipher_encrypt (hd, outval, drbg_blocklen (drbg), buf->buf, + buf->len); +} + +static gpg_err_code_t +drbg_sym_ctr (drbg_state_t drbg, + const unsigned char *inbuf, unsigned int inbuflen, + unsigned char *outbuf, unsigned int outbuflen) +{ + gpg_error_t err; + + _gcry_cipher_reset(drbg->ctr_handle); + err = _gcry_cipher_setctr(drbg->ctr_handle, drbg->V, drbg_blocklen (drbg)); if 
(err) return err; - /* in is only component */ - _gcry_cipher_encrypt (hd, outval, drbg_blocklen (drbg), buf->buf, - buf->len); - _gcry_cipher_close (hd); - return 0; + + while (outbuflen) + { + unsigned int cryptlen = (inbuflen > outbuflen) ? outbuflen : inbuflen; + + err = _gcry_cipher_encrypt (drbg->ctr_handle, outbuf, cryptlen, inbuf, + cryptlen); + if (err) + return err; + + outbuflen -= cryptlen; + outbuf += cryptlen; + } + return _gcry_cipher_getctr(drbg->ctr_handle, drbg->V, drbg_blocklen (drbg)); } -- 2.9.3

From smueller at chronox.de Sat Dec 3 19:18:01 2016
From: smueller at chronox.de (Stephan Mueller)
Date: Sat, 03 Dec 2016 19:18:01 +0100
Subject: [PATCH] DRBG: eliminate unneeded memcpy invocations
Message-ID: <1876399.Mazi4n7g1N@positron.chronox.de>

Hi,

This patch goes on top of the patch set I sent 2 days ago.

---8<---

gcry_md_read returns a pointer to the hash, which can be used directly instead of copying it into a scratch buffer. This eliminates a number of memcpy invocations for HMAC and Hash DRBG and reduces the memory footprint of the Hash DRBG by the block size of the used hash.

The performance increase is between 1 and 3 MB/s depending on the output buffer size.
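
As a rough illustration of the idea (a sketch only, not the libgcrypt implementation): struct toy_md and the toy_md_* functions below are hypothetical stand-ins for a message digest handle, using FNV-1a as a placeholder hash. The old style copies the digest into a caller buffer, the new style hands back the pointer into the context. The assumption to keep in mind is that such a pointer stays valid only until the context is reset, written to again, or closed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DIGEST_LEN 8

/* Hypothetical digest context standing in for a message digest handle. */
struct toy_md
{
  uint8_t digest[DIGEST_LEN];   /* the result lives inside the context */
  uint64_t state;
};

static void
toy_md_reset (struct toy_md *md)
{
  md->state = 0xcbf29ce484222325ULL;    /* FNV-1a offset basis */
  memset (md->digest, 0, DIGEST_LEN);
}

static void
toy_md_write (struct toy_md *md, const void *buf, size_t len)
{
  const uint8_t *p = buf;

  for (size_t i = 0; i < len; i++)
    md->state = (md->state ^ p[i]) * 0x100000001b3ULL;  /* FNV-1a step */
}

/* Finalize and return a pointer to the digest held in the context. */
static const uint8_t *
toy_md_read (struct toy_md *md)
{
  for (int i = 0; i < DIGEST_LEN; i++)
    md->digest[i] = (uint8_t) (md->state >> (8 * i));
  return md->digest;
}

/* Old style: the digest is copied into a caller-supplied buffer. */
static void
hash_copy_out (struct toy_md *md, uint8_t *outval,
               const void *buf, size_t len)
{
  toy_md_reset (md);
  toy_md_write (md, buf, len);
  memcpy (outval, toy_md_read (md), DIGEST_LEN);
}

/* New style: return the pointer into the context, no extra copy.
   The result is only valid until the next reset or write on md. */
static const uint8_t *
hash_return_ptr (struct toy_md *md, const void *buf, size_t len)
{
  assert (md && buf);
  toy_md_reset (md);
  toy_md_write (md, buf, len);
  return toy_md_read (md);
}

/* Returns 1 if both calling styles yield the same digest bytes. */
int
both_styles_agree (void)
{
  struct toy_md md;
  uint8_t scratch[DIGEST_LEN];
  const uint8_t *direct;

  hash_copy_out (&md, scratch, "abc", 3);
  direct = hash_return_ptr (&md, "abc", 3);
  return memcmp (scratch, direct, DIGEST_LEN) == 0;
}
```

The pointer-returning variant saves one memcpy and one scratch buffer per call, at the cost that the caller has to consume the result before touching the context again.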
Signed-off-by: Stephan Mueller --- random/random-drbg.c | 114 +++++++++++++++------------------------------------ 1 file changed, 34 insertions(+), 80 deletions(-) diff --git a/random/random-drbg.c b/random/random-drbg.c index dc8e8f3..e2fe861 100644 --- a/random/random-drbg.c +++ b/random/random-drbg.c @@ -374,9 +374,7 @@ static gpg_err_code_t drbg_hmac_init (drbg_state_t drbg); static gpg_err_code_t drbg_hmac_setkey (drbg_state_t drbg, const unsigned char *key); static void drbg_hash_fini (drbg_state_t drbg); -static gpg_err_code_t drbg_hash (drbg_state_t drbg, - unsigned char *outval, - const drbg_string_t *buf); +static byte *drbg_hash (drbg_state_t drbg, const drbg_string_t *buf); static gpg_err_code_t drbg_sym_init (drbg_state_t drbg); static void drbg_sym_fini (drbg_state_t drbg); static gpg_err_code_t drbg_sym_setkey (drbg_state_t drbg, @@ -1042,24 +1040,21 @@ drbg_hmac_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) /* we execute two rounds of V/K massaging */ for (i = 2; 0 < i; i--) { + byte *retval; /* first round uses 0x0, second 0x1 */ unsigned char prefix = DRBG_PREFIX0; if (1 == i) prefix = DRBG_PREFIX1; /* 10.1.2.2 step 1 and 4 -- concatenation and HMAC for key */ seed2.buf = &prefix; - ret = drbg_hash (drbg, drbg->C, &seed1); - if (ret) - return ret; - - ret = drbg_hmac_setkey (drbg, drbg->C); + retval = drbg_hash (drbg, &seed1); + ret = drbg_hmac_setkey (drbg, retval); if (ret) return ret; /* 10.1.2.2 step 2 and 5 -- HMAC for V */ - ret = drbg_hash (drbg, drbg->V, &cipherin); - if (ret) - return ret; + retval = drbg_hash (drbg, &cipherin); + memcpy(drbg->V, retval, drbg_blocklen (drbg)); /* 10.1.2.2 step 3 */ if (!seed || 0 == seed->len) @@ -1091,9 +1086,8 @@ drbg_hmac_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, { unsigned int outlen = 0; /* 10.1.2.5 step 4.1 */ - ret = drbg_hash (drbg, drbg->V, &data); - if (ret) - return ret; + byte *retval = drbg_hash (drbg, &data); + memcpy(drbg->V, retval, drbg_blocklen 
(drbg)); outlen = (drbg_blocklen (drbg) < (buflen - len)) ? drbg_blocklen (drbg) : (buflen - len); @@ -1137,14 +1131,10 @@ drbg_hash_df (drbg_state_t drbg, unsigned char *outval, size_t outlen, drbg_string_t *entropy) { - gpg_err_code_t ret = 0; size_t len = 0; unsigned char input[5]; - unsigned char *tmp = drbg->scratchpad + drbg_statelen (drbg); drbg_string_t data1; - memset (tmp, 0, drbg_blocklen (drbg)); - /* 10.4.1 step 3 */ input[0] = 1; drbg_cpu_to_be32 ((outlen * 8), &input[1]); @@ -1158,20 +1148,16 @@ drbg_hash_df (drbg_state_t drbg, { short blocklen = 0; /* 10.4.1 step 4.1 */ - ret = drbg_hash (drbg, tmp, &data1); - if (ret) - goto out; + byte *retval = drbg_hash (drbg, &data1); /* 10.4.1 step 4.2 */ input[0]++; blocklen = (drbg_blocklen (drbg) < (outlen - len)) ? drbg_blocklen (drbg) : (outlen - len); - memcpy (outval + len, tmp, blocklen); + memcpy (outval + len, retval, blocklen); len += blocklen; } - out: - memset (tmp, 0, drbg_blocklen (drbg)); - return ret; + return 0; } /* update function for Hash DRBG as defined in 10.1.1.2 / 10.1.1.3 */ @@ -1227,13 +1213,10 @@ drbg_hash_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) static gpg_err_code_t drbg_hash_process_addtl (drbg_state_t drbg, drbg_string_t *addtl) { - gpg_err_code_t ret = 0; drbg_string_t data1, data2; drbg_string_t *data3; unsigned char prefix = DRBG_PREFIX2; - - /* this is value w as per documentation */ - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); + byte *retval; /* 10.1.1.4 step 2 */ if (!addtl || 0 == addtl->len) @@ -1247,37 +1230,25 @@ drbg_hash_process_addtl (drbg_state_t drbg, drbg_string_t *addtl) data2.next = data3; data3->next = NULL; /* 10.1.1.4 step 2a -- cipher invocation */ - ret = drbg_hash (drbg, drbg->scratchpad, &data1); - if (ret) - goto out; + retval = drbg_hash (drbg, &data1); /* 10.1.1.4 step 2b */ - drbg_add_buf (drbg->V, drbg_statelen (drbg), - drbg->scratchpad, drbg_blocklen (drbg)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), retval, 
drbg_blocklen (drbg)); - out: - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); - return ret; + return 0; } /* * Hashgen defined in 10.1.1.4 */ static gpg_err_code_t -drbg_hash_hashgen (drbg_state_t drbg, - unsigned char *buf, unsigned int buflen) +drbg_hash_hashgen (drbg_state_t drbg, unsigned char *buf, unsigned int buflen) { - gpg_err_code_t ret = 0; unsigned int len = 0; unsigned char *src = drbg->scratchpad; - unsigned char *dst = drbg->scratchpad + drbg_statelen (drbg); drbg_string_t data; unsigned char prefix = DRBG_PREFIX1; - /* use the scratchpad as a lookaside buffer */ - memset (src, 0, drbg_statelen (drbg)); - memset (dst, 0, drbg_blocklen (drbg)); - /* 10.1.1.4 step hashgen 2 */ memcpy (src, drbg->V, drbg_statelen (drbg)); @@ -1286,44 +1257,36 @@ drbg_hash_hashgen (drbg_state_t drbg, { unsigned int outlen = 0; /* 10.1.1.4 step hashgen 4.1 */ - ret = drbg_hash (drbg, dst, &data); - if (ret) - goto out; + byte *retval = drbg_hash (drbg, &data); outlen = (drbg_blocklen (drbg) < (buflen - len)) ? 
drbg_blocklen (drbg) : (buflen - len); /* 10.1.1.4 step hashgen 4.2 */ - memcpy (buf + len, dst, outlen); + memcpy (buf + len, retval, outlen); len += outlen; /* 10.1.1.4 hashgen step 4.3 */ if (len < buflen) drbg_add_buf (src, drbg_statelen (drbg), &prefix, 1); } - out: - memset (drbg->scratchpad, 0, - (drbg_statelen (drbg) + drbg_blocklen (drbg))); - return ret; + memset (drbg->scratchpad, 0, drbg_statelen (drbg)); + return 0; } /* Generate function for Hash DRBG as defined in 10.1.1.4 */ static gpg_err_code_t -drbg_hash_generate (drbg_state_t drbg, - unsigned char *buf, unsigned int buflen, - drbg_string_t *addtl) +drbg_hash_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, + drbg_string_t *addtl) { - gpg_err_code_t ret = 0; + gpg_err_code_t ret; unsigned char prefix = DRBG_PREFIX3; drbg_string_t data1, data2; + byte *retval; union { unsigned char req[8]; u64 req_int; } u; - /* - * scratchpad usage: drbg_hash_process_addtl uses the scratchpad, but - * fully completes before returning. 
Thus, we can reuse the scratchpad - */ /* 10.1.1.4 step 2 */ ret = drbg_hash_process_addtl (drbg, addtl); if (ret) @@ -1334,27 +1297,20 @@ drbg_hash_generate (drbg_state_t drbg, if (ret) return ret; - /* this is the value H as documented in 10.1.1.4 */ - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); /* 10.1.1.4 step 4 */ drbg_string_fill (&data1, &prefix, 1); drbg_string_fill (&data2, drbg->V, drbg_statelen (drbg)); data1.next = &data2; - ret = drbg_hash (drbg, drbg->scratchpad, &data1); - if (ret) - goto out; + + /* this is the value H as documented in 10.1.1.4 */ + retval = drbg_hash (drbg, &data1); /* 10.1.1.4 step 5 */ - drbg_add_buf (drbg->V, drbg_statelen (drbg), - drbg->scratchpad, drbg_blocklen (drbg)); - drbg_add_buf (drbg->V, drbg_statelen (drbg), drbg->C, - drbg_statelen (drbg)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), retval, drbg_blocklen (drbg)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), drbg->C, drbg_statelen (drbg)); u.req_int = be_bswap64 (drbg->reseed_ctr); - drbg_add_buf (drbg->V, drbg_statelen (drbg), u.req, - sizeof (u.req)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), u.req, sizeof (u.req)); - out: - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); return ret; } @@ -1699,7 +1655,7 @@ drbg_instantiate (drbg_state_t drbg, drbg_blocklen (drbg) + /* iv */ drbg_statelen (drbg) + drbg_blocklen (drbg); /* temp */ else - sb_size = drbg_statelen (drbg) + drbg_blocklen (drbg); + sb_size = drbg_statelen (drbg); if (0 < sb_size) { @@ -2626,8 +2582,8 @@ drbg_hash_fini (drbg_state_t drbg) _gcry_md_close (hd); } -static gpg_err_code_t -drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) +static byte * +drbg_hash (drbg_state_t drbg, const drbg_string_t *buf) { gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; @@ -2635,9 +2591,7 @@ drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) for (; NULL != buf; buf = buf->next) _gcry_md_write (hd, buf->buf, buf->len); _gcry_md_final (hd); - 
memcpy (outval, _gcry_md_read (hd, drbg->core->backend_cipher), - drbg_blocklen (drbg)); - return 0; + return _gcry_md_read (hd, drbg->core->backend_cipher); } static void -- 2.9.3

From jussi.kivilinna at iki.fi Sun Dec 4 11:03:28 2016
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Sun, 4 Dec 2016 12:03:28 +0200
Subject: Howto implement chacha20-poly1305?
In-Reply-To: <87mvgg2g0p.fsf@wheatstone.g10code.de>
References: <87mvgh56re.fsf@wheatstone.g10code.de> <87mvgg2g0p.fsf@wheatstone.g10code.de>
Message-ID: <4d2f55cc-910e-bdd4-0505-c4a5f7c3ed3d@iki.fi>

On 01.12.2016 10:46, Werner Koch wrote:
> On Wed, 30 Nov 2016 20:53, jussi.kivilinna at iki.fi said:
>
>> I was thinking of same too. I can do it. Draft mode selection would
>> happen with new gcry_cipher_open flag, maybe GCRY_CIPHER_POLY1305_DRAFT
>> or GCRY_CIPHER_POLY1305_OPENSSH.
>
> Both make sense - maybe Openssh is the more descriptive one. I don't
> really care.

This ended up being more complicated than I first thought. I looked into the implementation of chacha20-poly1305 at openssh.com in OpenSSH [1] and it clearly was not the 'draft' AEAD after all. Then I reread the spec [2], which says:
 'The construction used is based on that proposed for TLS by Adam Langley in ...,
  but differs in the layout of data passed to the MAC and in the addition of
  encyption of the packet lengths.'

So it differs in a somewhat complicated way: its 'encrypt AAD' step cannot easily be done with the libgcrypt AEAD API. One way could be to handle AAD encryption with a separate chacha20 cipher handle. But then one needs multiple handles to combine the AEAD and encrypt-AAD parts, and might as well do the whole construction with two chacha20 handles and one poly1305 handle. Also, I could not find test vectors for this mode.

>
> Stef: Can you help Jussi with testing?
>

I modified OpenSSH-7.3p1 to use libgcrypt (1.7) for 'chacha20-poly1305 at openssh.com' to give you an example implementation.
Commit for this change can be found here:
 https://github.com/jkivilin/openssh-portable/commit/dd4d06bb47cbbbe3607b9be30f17f1495adbeb12

Does this help you?

-Jussi

[1] http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/cipher-chachapoly.c?rev=1.8&content-type=text/x-cvsweb-markup
[2] http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/PROTOCOL.chacha20poly1305?rev=1.3&content-type=text/x-cvsweb-markup

From stefbon at gmail.com Sun Dec 4 13:29:22 2016
From: stefbon at gmail.com (Stef Bon)
Date: Sun, 4 Dec 2016 13:29:22 +0100
Subject: Howto implement chacha20-poly1305?
In-Reply-To: <4d2f55cc-910e-bdd4-0505-c4a5f7c3ed3d@iki.fi>
References: <87mvgh56re.fsf@wheatstone.g10code.de>
 <87mvgg2g0p.fsf@wheatstone.g10code.de>
 <4d2f55cc-910e-bdd4-0505-c4a5f7c3ed3d@iki.fi>
Message-ID:

2016-12-04 11:03 GMT+01:00 Jussi Kivilinna :
>
> This ended up being more complicated than I first thought. I looked into the implementation of chacha20-poly1305@openssh.com in OpenSSH [1] and it clearly was not the 'draft' AEAD after all. Then I reread the spec [2] which says:
>  'The construction used is based on that proposed for TLS by Adam Langley in ...,
>   but differs in the layout of data passed to the MAC and in the addition of
>   encryption of the packet lengths.'
>
> So, it's different in a somewhat complicated way with its 'encrypt AAD', which cannot be easily done with the libgcrypt AEAD API. One way could be to handle AAD encryption with a separate chacha20 cipher handle. But then one needs to use multiple handles to combine the AEAD and encrypt-AAD parts, and might as well do the whole construction with two chacha20 handles and one poly1305 handle.
>
> Also, I could not find test-vectors for this mode.
>
> I modified OpenSSH-7.3p1 to use libgcrypt (1.7) for 'chacha20-poly1305@openssh.com' to give you an example implementation. Commit for this change can be found here:
>  https://github.com/jkivilin/openssh-portable/commit/dd4d06bb47cbbbe3607b9be30f17f1495adbeb12
>
> Does this help you?
>

Great. I will look at this tomorrow and report back to you when I have some results.

Stef

From jussi.kivilinna at iki.fi Mon Dec 5 15:14:11 2016
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 05 Dec 2016 16:14:11 +0200
Subject: [PATCH] random-drbg: use bufhelp function for big-endian store
Message-ID: <148094725175.3391.15004688817915407064.stgit@localhost6.localdomain6>

* random/random-drbg.c (drbg_cpu_to_be32): Use 'buf_put_be32' instead
of 'be_bswap32'.
--

Signed-off-by: Jussi Kivilinna
---
 random/random-drbg.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/random/random-drbg.c b/random/random-drbg.c
index f9d11a3..535c446 100644
--- a/random/random-drbg.c
+++ b/random/random-drbg.c
@@ -155,7 +155,7 @@
 #include "g10lib.h"
 #include "random.h"
 #include "rand-internal.h"
-#include "../cipher/bithelp.h"
+#include "../cipher/bufhelp.h"

@@ -544,14 +544,7 @@ drbg_sec_strength (u32 flags)
 static inline void
 drbg_cpu_to_be32 (u32 val, unsigned char *buf)
 {
-  /* FIXME: This may raise a bus error. */
-  struct s
-  {
-    u32 conv;
-  };
-  struct s *conversion = (struct s *) buf;
-
-  conversion->conv = be_bswap32 (val);
+  buf_put_be32 (buf, val);
 }

 static void

From jussi.kivilinna at iki.fi Mon Dec 5 15:14:24 2016
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 05 Dec 2016 16:14:24 +0200
Subject: [PATCH 1/3] OCB: remove 'int64_t' usage
Message-ID: <148094726427.3477.8483024518816386622.stgit@localhost6.localdomain6>

* cipher/cipher-ocb.c (double_block): Use alternative way to generate
sign-bit mask, without 'int64_t'.
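The sign-bit mask change can be illustrated with a small standalone sketch. It is not the libgcrypt source: the helper names load_be64, store_be64, sign_mask64 and double_block_sketch are made up for this example, and only the three arithmetic lines in the middle mirror the patched double_block. The unsigned form -(l >> 63) yields all ones when the top bit is set and zero otherwise, so it no longer depends on how a signed right shift behaves (which C leaves implementation-defined for negative values).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read 8 bytes as a big-endian 64-bit value. */
static uint64_t load_be64(const unsigned char *p)
{
  uint64_t v = 0;
  int i;

  for (i = 0; i < 8; i++)
    v = (v << 8) | p[i];
  return v;
}

/* Hypothetical helper: write a 64-bit value as 8 big-endian bytes. */
static void store_be64(unsigned char *p, uint64_t v)
{
  int i;

  for (i = 7; i >= 0; i--)
    {
      p[i] = (unsigned char)(v & 0xff);
      v >>= 8;
    }
}

/* All ones if bit 63 of x is set, zero otherwise; unsigned arithmetic only. */
static uint64_t sign_mask64(uint64_t x)
{
  return -(x >> 63);
}

/* Double a 16-byte block in GF(2^128), following the lines in the patch. */
static void double_block_sketch(unsigned char *b)
{
  uint64_t l, r, l_0;

  assert(b != NULL);
  l = load_be64(b);
  r = load_be64(b + 8);

  l_0 = sign_mask64(l);        /* was: (int64_t)l >> 63 */
  l = (l + l) ^ (r >> 63);     /* shift left, carry in from low half */
  r = (r + r) ^ (l_0 & 135);   /* reduce with 0x87 if the top bit fell out */

  store_be64(b, l);
  store_be64(b + 8, r);
}
```

For a block whose only set bit is the most significant one, the result is a block whose last byte is 0x87, which is the expected reduction constant.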
--

Signed-off-by: Jussi Kivilinna
---
 cipher/cipher-ocb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cipher/cipher-ocb.c b/cipher/cipher-ocb.c
index 92260d2..d1f01d5 100644
--- a/cipher/cipher-ocb.c
+++ b/cipher/cipher-ocb.c
@@ -66,7 +66,7 @@ double_block (unsigned char *b)
   l = buf_get_be64 (b);
   r = buf_get_be64 (b + 8);

-  l_0 = (int64_t)l >> 63;
+  l_0 = -(l >> 63);
   l = (l + l) ^ (r >> 63);
   r = (r + r) ^ (l_0 & 135);

From jussi.kivilinna at iki.fi Mon Dec 5 15:14:34 2016
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 05 Dec 2016 16:14:34 +0200
Subject: [PATCH 3/3] OCB ARM CE: Move ocb_get_l handling to assembly part
In-Reply-To: <148094726427.3477.8483024518816386622.stgit@localhost6.localdomain6>
References: <148094726427.3477.8483024518816386622.stgit@localhost6.localdomain6>
Message-ID: <148094727432.3477.15817379913793935632.stgit@localhost6.localdomain6>

* cipher/rijndael-armv8-aarch32-ce.S: Add OCB 'L_{ntz(i)}' calculation.
* cipher/rijndael-armv8-aarch64-ce.S: Ditto.
* cipher/rijndael-armv8-ce.c (_gcry_aes_ocb_enc_armv8_ce)
(_gcry_aes_ocb_dec_armv8_ce, _gcry_aes_ocb_auth_armv8_ce)
(ocb_crypt_fn_t): Updated arguments.
(_gcry_aes_armv8_ce_ocb_crypt, _gcry_aes_armv8_ce_ocb_auth): Remove
'ocb_get_l' handling and splitting of input into 32-block chunks;
instead pass full buffers to assembly.
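The 'L_{ntz(i)}' calculation that the assembly performs (a bit reversal followed by a count of leading zeros, then a shift by 4 to index 16-byte table entries) can be sketched in portable C. This is only an illustration of the idea, not code from the patch: rbit32, clz32, ntz32 and l_entry are hypothetical names, and the flat L_table layout of consecutive 16-byte entries is assumed from the new 'unsigned char *L_table' argument.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit word (what the RBIT instruction does). */
static uint32_t rbit32(uint32_t x)
{
  x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
  x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
  x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
  x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
  x = (x >> 16) | (x << 16);
  return x;
}

/* Count leading zero bits (what the CLZ instruction does); 32 for x == 0. */
static unsigned int clz32(uint32_t x)
{
  unsigned int n = 0;

  if (x == 0)
    return 32;
  while (!(x & 0x80000000u))
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Number of trailing zeros of a non-zero block index: clz(rbit(i)). */
static unsigned int ntz32(uint32_t i)
{
  assert(i != 0);
  return clz32(rbit32(i));
}

/* Address of L_{ntz(i)} in a flat table of 16-byte entries
   (the C equivalent of 'add r8, r5, r8, lsl #4'). */
static const unsigned char *l_entry(const unsigned char *L_table, uint32_t i)
{
  return L_table + ((size_t)ntz32(i) << 4);
}
```

Because the block index is turned into a table offset with a few register operations, the caller no longer has to prepare an array of per-block pointers before invoking the assembly routine.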
-- Performance on Cortex-A53 (AArch32): Before: AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.63 ns/B 583.8 MiB/s 1.88 c/B OCB dec | 1.67 ns/B 572.1 MiB/s 1.92 c/B OCB auth | 1.33 ns/B 717.1 MiB/s 1.53 c/B After (~12% faster): AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.47 ns/B 650.2 MiB/s 1.69 c/B OCB dec | 1.48 ns/B 644.5 MiB/s 1.70 c/B OCB auth | 1.19 ns/B 798.2 MiB/s 1.38 c/B Performance on Cortex-A53 (AArch64): Before: AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.29 ns/B 738.5 MiB/s 1.49 c/B OCB dec | 1.32 ns/B 723.5 MiB/s 1.52 c/B OCB auth | 1.15 ns/B 827.0 MiB/s 1.33 c/B After (~8% faster): AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.21 ns/B 789.1 MiB/s 1.39 c/B OCB dec | 1.21 ns/B 789.2 MiB/s 1.39 c/B OCB auth | 1.10 ns/B 867.0 MiB/s 1.27 c/B Signed-off-by: Jussi Kivilinna --- cipher/rijndael-armv8-aarch32-ce.S | 98 ++++++++++++++++++++++++--- cipher/rijndael-armv8-aarch64-ce.S | 125 ++++++++++++++++++++++++----------- cipher/rijndael-armv8-ce.c | 129 ++++-------------------------------- 3 files changed, 188 insertions(+), 164 deletions(-) diff --git a/cipher/rijndael-armv8-aarch32-ce.S b/cipher/rijndael-armv8-aarch32-ce.S index bf68f20..f375f67 100644 --- a/cipher/rijndael-armv8-aarch32-ce.S +++ b/cipher/rijndael-armv8-aarch32-ce.S @@ -1021,9 +1021,10 @@ _gcry_aes_ctr_enc_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1039,6 +1040,7 @@ _gcry_aes_ocb_enc_armv8_ce: * %st+4: Ls => r5 * %st+8: nblocks => r6 (0 < nblocks <= 32) * %st+12: nrounds => r7 + * %st+16: blkn => lr */ vpush {q4-q7} @@ -1047,6 +1049,7 @@ _gcry_aes_ocb_enc_armv8_ce: ldr r4, [sp, #(104+0)] ldr r5, [sp, #(104+4)] ldr r6, [sp, #(104+8)] + ldr lr, [sp, #(104+16)] cmp r7, #12 vld1.8 {q0}, [r3] /* load offset */ @@ -1059,6 +1062,7 @@ 
_gcry_aes_ocb_enc_armv8_ce: #define OCB_ENC(bits, ...) \ .Locb_enc_entry_##bits: \ cmp r6, #4; \ + add lr, #1; \ blo .Locb_enc_loop_##bits; \ \ .Locb_enc_loop4_##bits: \ @@ -1067,7 +1071,23 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldm r5!, {r8, r9, r10, r11}; \ + add r9, lr, #1; \ + add r10, lr, #2; \ + add r11, lr, #3; \ + rbit r8, lr; \ + add lr, lr, #4; \ + rbit r9, r9; \ + rbit r10, r10; \ + rbit r11, r11; \ + clz r8, r8; /* ntz(i+0) */ \ + clz r9, r9; /* ntz(i+1) */ \ + clz r10, r10; /* ntz(i+2) */ \ + clz r11, r11; /* ntz(i+3) */ \ + add r8, r5, r8, lsl #4; \ + add r9, r5, r9, lsl #4; \ + add r10, r5, r10, lsl #4; \ + add r11, r5, r11, lsl #4; \ + \ sub r6, #4; \ \ vld1.8 {q9}, [r8]; /* load L_{ntz(i+0)} */ \ @@ -1120,7 +1140,11 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldr r8, [r5], #4; \ + rbit r8, lr; \ + add lr, #1; \ + clz r8, r8; /* ntz(i) */ \ + add r8, r5, r8, lsl #4; \ + \ vld1.8 {q1}, [r2]!; /* load plaintext */ \ vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ vld1.8 {q3}, [r4]; /* load checksum */ \ @@ -1171,9 +1195,10 @@ _gcry_aes_ocb_enc_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1189,6 +1214,7 @@ _gcry_aes_ocb_dec_armv8_ce: * %st+4: Ls => r5 * %st+8: nblocks => r6 (0 < nblocks <= 32) * %st+12: nrounds => r7 + * %st+16: blkn => lr */ vpush {q4-q7} @@ -1197,6 +1223,7 @@ _gcry_aes_ocb_dec_armv8_ce: ldr r4, [sp, #(104+0)] ldr r5, [sp, #(104+4)] ldr r6, [sp, #(104+8)] + ldr lr, [sp, #(104+16)] cmp r7, #12 vld1.8 {q0}, [r3] /* load offset */ @@ -1209,6 +1236,7 @@ _gcry_aes_ocb_dec_armv8_ce: #define OCB_DEC(bits, ...) 
\ .Locb_dec_entry_##bits: \ cmp r6, #4; \ + add lr, #1; \ blo .Locb_dec_loop_##bits; \ \ .Locb_dec_loop4_##bits: \ @@ -1217,7 +1245,23 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldm r5!, {r8, r9, r10, r11}; \ + add r9, lr, #1; \ + add r10, lr, #2; \ + add r11, lr, #3; \ + rbit r8, lr; \ + add lr, lr, #4; \ + rbit r9, r9; \ + rbit r10, r10; \ + rbit r11, r11; \ + clz r8, r8; /* ntz(i+0) */ \ + clz r9, r9; /* ntz(i+1) */ \ + clz r10, r10; /* ntz(i+2) */ \ + clz r11, r11; /* ntz(i+3) */ \ + add r8, r5, r8, lsl #4; \ + add r9, r5, r9, lsl #4; \ + add r10, r5, r10, lsl #4; \ + add r11, r5, r11, lsl #4; \ + \ sub r6, #4; \ \ vld1.8 {q9}, [r8]; /* load L_{ntz(i+0)} */ \ @@ -1270,7 +1314,11 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldr r8, [r5], #4; \ + rbit r8, lr; \ + add lr, #1; \ + clz r8, r8; /* ntz(i) */ \ + add r8, r5, r8, lsl #4; \ + \ vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ vld1.8 {q1}, [r2]!; /* load ciphertext */ \ subs r6, #1; \ @@ -1320,9 +1368,10 @@ _gcry_aes_ocb_dec_armv8_ce: * const unsigned char *abuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1337,6 +1386,7 @@ _gcry_aes_ocb_auth_armv8_ce: * %st+0: Ls => r5 * %st+4: nblocks => r6 (0 < nblocks <= 32) * %st+8: nrounds => r7 + * %st+12: blkn => lr */ vpush {q4-q7} @@ -1344,6 +1394,7 @@ _gcry_aes_ocb_auth_armv8_ce: ldr r7, [sp, #(104+8)] ldr r5, [sp, #(104+0)] ldr r6, [sp, #(104+4)] + ldr lr, [sp, #(104+12)] cmp r7, #12 vld1.8 {q0}, [r2] /* load offset */ @@ -1356,6 +1407,7 @@ _gcry_aes_ocb_auth_armv8_ce: #define OCB_AUTH(bits, ...) 
\ .Locb_auth_entry_##bits: \ cmp r6, #4; \ + add lr, #1; \ blo .Locb_auth_loop_##bits; \ \ .Locb_auth_loop4_##bits: \ @@ -1363,7 +1415,23 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldm r5!, {r8, r9, r10, r11}; \ + add r9, lr, #1; \ + add r10, lr, #2; \ + add r11, lr, #3; \ + rbit r8, lr; \ + add lr, lr, #4; \ + rbit r9, r9; \ + rbit r10, r10; \ + rbit r11, r11; \ + clz r8, r8; /* ntz(i+0) */ \ + clz r9, r9; /* ntz(i+1) */ \ + clz r10, r10; /* ntz(i+2) */ \ + clz r11, r11; /* ntz(i+3) */ \ + add r8, r5, r8, lsl #4; \ + add r9, r5, r9, lsl #4; \ + add r10, r5, r10, lsl #4; \ + add r11, r5, r11, lsl #4; \ + \ sub r6, #4; \ \ vld1.8 {q9}, [r8]; /* load L_{ntz(i+0)} */ \ @@ -1401,8 +1469,12 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldr r8, [r5], #4; \ - vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ + rbit r8, lr; \ + add lr, #1; \ + clz r8, r8; /* ntz(i) */ \ + add r8, r5, r8, lsl #4; \ + \ + vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ vld1.8 {q1}, [r1]!; /* load aadtext */ \ subs r6, #1; \ veor q0, q0, q2; \ diff --git a/cipher/rijndael-armv8-aarch64-ce.S b/cipher/rijndael-armv8-aarch64-ce.S index 21d0aec..1ebb363 100644 --- a/cipher/rijndael-armv8-aarch64-ce.S +++ b/cipher/rijndael-armv8-aarch64-ce.S @@ -28,23 +28,6 @@ .text -#if (SIZEOF_VOID_P == 4) - #define ptr8 w8 - #define ptr9 w9 - #define ptr10 w10 - #define ptr11 w11 - #define ptr_sz 4 -#elif (SIZEOF_VOID_P == 8) - #define ptr8 x8 - #define ptr9 x9 - #define ptr10 x10 - #define ptr11 x11 - #define ptr_sz 8 -#else - #error "missing SIZEOF_VOID_P" -#endif - - #define GET_DATA_POINTER(reg, name) \ adrp reg, :got:name ; \ ldr reg, [reg, #:got_lo12:name] ; @@ -855,9 +838,10 @@ _gcry_aes_cfb_dec_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char 
*L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -870,11 +854,13 @@ _gcry_aes_ocb_enc_armv8_ce: * x2: inbuf * x3: offset * x4: checksum - * x5: Ls + * x5: Ltable * x6: nblocks (0 < nblocks <= 32) * w7: nrounds + * %st+0: blkn => w12 */ + ldr w12, [sp] ld1 {v0.16b}, [x3] /* load offset */ ld1 {v16.16b}, [x4] /* load checksum */ @@ -886,6 +872,7 @@ _gcry_aes_ocb_enc_armv8_ce: #define OCB_ENC(bits, ...) \ .Locb_enc_entry_##bits: \ cmp x6, #4; \ + add x12, x12, #1; \ b.lo .Locb_enc_loop_##bits; \ \ .Locb_enc_loop4_##bits: \ @@ -894,10 +881,24 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldp ptr8, ptr9, [x5], #(ptr_sz*2); \ + add w9, w12, #1; \ + add w10, w12, #2; \ + add w11, w12, #3; \ + rbit w8, w12; \ + add w12, w12, #4; \ + rbit w9, w9; \ + rbit w10, w10; \ + rbit w11, w11; \ + clz w8, w8; /* ntz(i+0) */ \ + clz w9, w9; /* ntz(i+1) */ \ + clz w10, w10; /* ntz(i+2) */ \ + clz w11, w11; /* ntz(i+3) */ \ + add x8, x5, x8, lsl #4; \ + ld1 {v1.16b-v4.16b}, [x2], #64; /* load P_i+<0-3> */ \ + add x9, x5, x9, lsl #4; \ + add x10, x5, x10, lsl #4; \ + add x11, x5, x11, lsl #4; \ \ - ld1 {v1.16b-v4.16b}, [x2], #64; /* load P_i+<0-3> */ \ - ldp ptr10, ptr11, [x5], #(ptr_sz*2); \ sub x6, x6, #4; \ \ ld1 {v5.16b}, [x8]; /* load L_{ntz(i+0)} */ \ @@ -940,7 +941,11 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldr ptr8, [x5], #(ptr_sz); \ + rbit x8, x12; \ + add x12, x12, #1; \ + clz x8, x8; /* ntz(i) */ \ + add x8, x5, x8, lsl #4; \ + \ ld1 {v1.16b}, [x2], #16; /* load plaintext */ \ ld1 {v2.16b}, [x8]; /* load L_{ntz(i)} */ \ sub x6, x6, #1; \ @@ -983,9 +988,10 @@ _gcry_aes_ocb_enc_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t 
nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -998,11 +1004,13 @@ _gcry_aes_ocb_dec_armv8_ce: * x2: inbuf * x3: offset * x4: checksum - * x5: Ls + * x5: Ltable * x6: nblocks (0 < nblocks <= 32) * w7: nrounds + * %st+0: blkn => w12 */ + ldr w12, [sp] ld1 {v0.16b}, [x3] /* load offset */ ld1 {v16.16b}, [x4] /* load checksum */ @@ -1014,6 +1022,7 @@ _gcry_aes_ocb_dec_armv8_ce: #define OCB_DEC(bits) \ .Locb_dec_entry_##bits: \ cmp x6, #4; \ + add w12, w12, #1; \ b.lo .Locb_dec_loop_##bits; \ \ .Locb_dec_loop4_##bits: \ @@ -1022,10 +1031,24 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldp ptr8, ptr9, [x5], #(ptr_sz*2); \ + add w9, w12, #1; \ + add w10, w12, #2; \ + add w11, w12, #3; \ + rbit w8, w12; \ + add w12, w12, #4; \ + rbit w9, w9; \ + rbit w10, w10; \ + rbit w11, w11; \ + clz w8, w8; /* ntz(i+0) */ \ + clz w9, w9; /* ntz(i+1) */ \ + clz w10, w10; /* ntz(i+2) */ \ + clz w11, w11; /* ntz(i+3) */ \ + add x8, x5, x8, lsl #4; \ + ld1 {v1.16b-v4.16b}, [x2], #64; /* load C_i+<0-3> */ \ + add x9, x5, x9, lsl #4; \ + add x10, x5, x10, lsl #4; \ + add x11, x5, x11, lsl #4; \ \ - ld1 {v1.16b-v4.16b}, [x2], #64; /* load C_i+<0-3> */ \ - ldp ptr10, ptr11, [x5], #(ptr_sz*2); \ sub x6, x6, #4; \ \ ld1 {v5.16b}, [x8]; /* load L_{ntz(i+0)} */ \ @@ -1068,7 +1091,11 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldr ptr8, [x5], #(ptr_sz); \ + rbit w8, w12; \ + add w12, w12, #1; \ + clz w8, w8; /* ntz(i) */ \ + add x8, x5, x8, lsl #4; \ + \ ld1 {v1.16b}, [x2], #16; /* load ciphertext */ \ ld1 {v2.16b}, [x8]; /* load L_{ntz(i)} */ \ sub x6, x6, #1; \ @@ -1110,9 +1137,10 @@ _gcry_aes_ocb_dec_armv8_ce: * const unsigned char *abuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned 
int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1124,10 +1152,12 @@ _gcry_aes_ocb_auth_armv8_ce: * x1: abuf * x2: offset => x3 * x3: checksum => x4 - * x4: Ls => x5 + * x4: Ltable => x5 * x5: nblocks => x6 (0 < nblocks <= 32) * w6: nrounds => w7 + * w7: blkn => w12 */ + mov x12, x7 mov x7, x6 mov x6, x5 mov x5, x4 @@ -1145,6 +1175,7 @@ _gcry_aes_ocb_auth_armv8_ce: #define OCB_AUTH(bits) \ .Locb_auth_entry_##bits: \ cmp x6, #4; \ + add w12, w12, #1; \ b.lo .Locb_auth_loop_##bits; \ \ .Locb_auth_loop4_##bits: \ @@ -1152,10 +1183,24 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldp ptr8, ptr9, [x5], #(ptr_sz*2); \ + add w9, w12, #1; \ + add w10, w12, #2; \ + add w11, w12, #3; \ + rbit w8, w12; \ + add w12, w12, #4; \ + rbit w9, w9; \ + rbit w10, w10; \ + rbit w11, w11; \ + clz w8, w8; /* ntz(i+0) */ \ + clz w9, w9; /* ntz(i+1) */ \ + clz w10, w10; /* ntz(i+2) */ \ + clz w11, w11; /* ntz(i+3) */ \ + add x8, x5, x8, lsl #4; \ + ld1 {v1.16b-v4.16b}, [x1], #64; /* load A_i+<0-3> */ \ + add x9, x5, x9, lsl #4; \ + add x10, x5, x10, lsl #4; \ + add x11, x5, x11, lsl #4; \ \ - ld1 {v1.16b-v4.16b}, [x1], #64; /* load A_i+<0-3> */ \ - ldp ptr10, ptr11, [x5], #(ptr_sz*2); \ sub x6, x6, #4; \ \ ld1 {v5.16b}, [x8]; /* load L_{ntz(i+0)} */ \ @@ -1192,7 +1237,11 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldr ptr8, [x5], #(ptr_sz); \ + rbit w8, w12; \ + add w12, w12, #1; \ + clz w8, w8; /* ntz(i) */ \ + add x8, x5, x8, lsl #4; \ + \ ld1 {v1.16b}, [x1], #16; /* load aadtext */ \ ld1 {v2.16b}, [x8]; /* load L_{ntz(i)} */ \ sub x6, x6, #1; \ diff --git a/cipher/rijndael-armv8-ce.c b/cipher/rijndael-armv8-ce.c index 1bf74da..334cf68 100644 --- a/cipher/rijndael-armv8-ce.c +++ b/cipher/rijndael-armv8-ce.c @@ -80,30 +80,33 @@ extern void _gcry_aes_ocb_enc_armv8_ce 
(const void *keysched, const unsigned char *inbuf, unsigned char *offset, unsigned char *checksum, - void **Ls, + unsigned char *L_table, size_t nblocks, - unsigned int nrounds); + unsigned int nrounds, + unsigned int blkn); extern void _gcry_aes_ocb_dec_armv8_ce (const void *keysched, unsigned char *outbuf, const unsigned char *inbuf, unsigned char *offset, unsigned char *checksum, - void **Ls, + unsigned char *L_table, size_t nblocks, - unsigned int nrounds); + unsigned int nrounds, + unsigned int blkn); extern void _gcry_aes_ocb_auth_armv8_ce (const void *keysched, const unsigned char *abuf, unsigned char *offset, unsigned char *checksum, - void **Ls, + unsigned char *L_table, size_t nblocks, - unsigned int nrounds); + unsigned int nrounds, + unsigned int blkn); typedef void (*ocb_crypt_fn_t) (const void *keysched, unsigned char *outbuf, const unsigned char *inbuf, unsigned char *offset, unsigned char *checksum, - void **Ls, size_t nblocks, - unsigned int nrounds); + unsigned char *L_table, size_t nblocks, + unsigned int nrounds, unsigned int blkn); void _gcry_aes_armv8_ce_setkey (RIJNDAEL_context *ctx, const byte *key) @@ -334,62 +337,11 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, const unsigned char *inbuf = inbuf_arg; unsigned int nrounds = ctx->rounds; u64 blkn = c->u_mode.ocb.data_nblocks; - u64 blkn_offs = blkn - blkn % 32; - unsigned int n = 32 - blkn % 32; - void *Ls[32]; - void **l; - size_t i; c->u_mode.ocb.data_nblocks = blkn + nblocks; - if (nblocks >= 32) - { - for (i = 0; i < 32; i += 8) - { - Ls[(i + 0 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 1 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 2 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 3 + n) % 32] = (void *)c->u_mode.ocb.L[2]; - Ls[(i + 4 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 5 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 6 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - } - - Ls[(7 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - Ls[(15 + 
n) % 32] = (void *)c->u_mode.ocb.L[4]; - Ls[(23 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - l = &Ls[(31 + n) % 32]; - - /* Process data in 32 block chunks. */ - while (nblocks >= 32) - { - blkn_offs += 32; - *l = (void *)ocb_get_l(c, blkn_offs); - - crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, 32, - nrounds); - - nblocks -= 32; - outbuf += 32 * 16; - inbuf += 32 * 16; - } - - if (nblocks && l < &Ls[nblocks]) - { - *l = (void *)ocb_get_l(c, 32 + blkn_offs); - } - } - else - { - for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, ++blkn); - } - - if (nblocks) - { - crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, nblocks, - nrounds); - } + crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, + c->u_mode.ocb.L[0], nblocks, nrounds, (unsigned int)blkn); } void @@ -401,61 +353,12 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, const unsigned char *abuf = abuf_arg; unsigned int nrounds = ctx->rounds; u64 blkn = c->u_mode.ocb.aad_nblocks; - u64 blkn_offs = blkn - blkn % 32; - unsigned int n = 32 - blkn % 32; - void *Ls[32]; - void **l; - size_t i; c->u_mode.ocb.aad_nblocks = blkn + nblocks; - if (nblocks >= 32) - { - for (i = 0; i < 32; i += 8) - { - Ls[(i + 0 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 1 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 2 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 3 + n) % 32] = (void *)c->u_mode.ocb.L[2]; - Ls[(i + 4 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 5 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 6 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - } - - Ls[(7 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - Ls[(15 + n) % 32] = (void *)c->u_mode.ocb.L[4]; - Ls[(23 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - l = &Ls[(31 + n) % 32]; - - /* Process data in 32 block chunks. 
*/ - while (nblocks >= 32) - { - blkn_offs += 32; - *l = (void *)ocb_get_l(c, blkn_offs); - - _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum, Ls, 32, nrounds); - - nblocks -= 32; - abuf += 32 * 16; - } - - if (nblocks && l < &Ls[nblocks]) - { - *l = (void *)ocb_get_l(c, 32 + blkn_offs); - } - } - else - { - for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, ++blkn); - } - - if (nblocks) - { - _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum, Ls, nblocks, nrounds); - } + _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, + c->u_mode.ocb.aad_sum, c->u_mode.ocb.L[0], + nblocks, nrounds, (unsigned int)blkn); } #endif /* USE_ARM_CE */ From jussi.kivilinna at iki.fi Mon Dec 5 15:14:53 2016 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 05 Dec 2016 16:14:53 +0200 Subject: [PATCH 1/2] tests/hashtest-256g: add missing executable extension for Win32 Message-ID: <148094729335.3589.4603798084666783646.stgit@localhost6.localdomain6> * tests/hashtest-256g.in: Add @EXEEXT at . 
--

Signed-off-by: Jussi Kivilinna
---
 tests/hashtest-256g.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/hashtest-256g.in b/tests/hashtest-256g.in
index e897c54..92b1c1b 100755
--- a/tests/hashtest-256g.in
+++ b/tests/hashtest-256g.in
@@ -4,4 +4,4 @@ algos="SHA1 SHA256 SHA512"
 test "@RUN_LARGE_DATA_TESTS@" = yes || exit 77
 echo " now running 256 GiB tests for $algos - this takes looong"

-exec ./hashtest --gigs 256 $algos
+exec ./hashtest@EXEEXT@ --gigs 256 $algos

From jussi.kivilinna at iki.fi Mon Dec 5 15:14:58 2016
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 05 Dec 2016 16:14:58 +0200
Subject: [PATCH 2/2] hwfeatures: add 'all' for disabling all hardware features
In-Reply-To: <148094729335.3589.4603798084666783646.stgit@localhost6.localdomain6>
References: <148094729335.3589.4603798084666783646.stgit@localhost6.localdomain6>
Message-ID: <148094729838.3589.6706421074198316536.stgit@localhost6.localdomain6>

* .gitignore: Add 'tests/basic-disable-all-hwf'.
* configure.ac: Ditto.
* tests/Makefile.am: Ditto.
* src/hwfeatures.c (_gcry_disable_hw_feature): Match 'all' for masking
all HW features off.
* tests/basic-disable-all-hwf.in: New.
--

Also add a new test to run 'basic' with all HWF disabled. With current
assembly implementations and build servers using new CPUs, generic
implementations are not being tested enough anymore and compiler
problems might go unnoticed.
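The 'all' matching described above can be sketched as a name-to-flag lookup with one special case. This is a minimal standalone sketch, not the code in src/hwfeatures.c: the table contents ("feature-a", "feature-b"), the function name disable_hw_feature and the -1 return value (standing in for GPG_ERR_INV_NAME) are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature table; the real list lives in src/hwfeatures.c. */
struct hwf_entry
{
  unsigned int flag;
  const char *desc;
};

static const struct hwf_entry hwf_list[] = {
  { 1u << 0, "feature-a" },
  { 1u << 1, "feature-b" }
};

/* Bit mask of features that must not be used. */
static unsigned int disabled_hw_features;

/* Disable one named feature, or every feature when NAME is "all".
   Returns 0 on success and -1 (a stand-in for GPG_ERR_INV_NAME)
   when the name is unknown. */
static int disable_hw_feature(const char *name)
{
  size_t i;

  assert(name != NULL);

  if (!strcmp(name, "all"))
    {
      disabled_hw_features = ~0u;   /* mask everything off */
      return 0;
    }

  for (i = 0; i < sizeof hwf_list / sizeof hwf_list[0]; i++)
    if (!strcmp(hwf_list[i].desc, name))
      {
        disabled_hw_features |= hwf_list[i].flag;
        return 0;
      }

  return -1;
}
```

Routing the deny-file parser through the same function, as the patch does, means 'all' is accepted there too without a second copy of the lookup loop.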
Signed-off-by: Jussi Kivilinna --- .gitignore | 1 + configure.ac | 1 + src/hwfeatures.c | 16 +++++++--------- tests/Makefile.am | 7 ++++--- tests/basic-disable-all-hwf.in | 4 ++++ 5 files changed, 17 insertions(+), 12 deletions(-) create mode 100644 tests/basic-disable-all-hwf.in diff --git a/.gitignore b/.gitignore index 3cd83a2..5d481aa 100644 --- a/.gitignore +++ b/.gitignore @@ -73,6 +73,7 @@ tests/ac-data tests/ac-schemes tests/aeswrap tests/basic +tests/basic-disable-all-hwf tests/bench-slope tests/benchmark tests/curves diff --git a/configure.ac b/configure.ac index 7bbf4bd..998264c 100644 --- a/configure.ac +++ b/configure.ac @@ -2555,6 +2555,7 @@ src/versioninfo.rc tests/Makefile ]) AC_CONFIG_FILES([tests/hashtest-256g], [chmod +x tests/hashtest-256g]) +AC_CONFIG_FILES([tests/basic-disable-all-hwf], [chmod +x tests/basic-disable-all-hwf]) AC_OUTPUT diff --git a/src/hwfeatures.c b/src/hwfeatures.c index 07221e8..99aba34 100644 --- a/src/hwfeatures.c +++ b/src/hwfeatures.c @@ -83,6 +83,12 @@ _gcry_disable_hw_feature (const char *name) { int i; + if (!strcmp(name, "all")) + { + disabled_hw_features = ~0; + return 0; + } + for (i=0; i < DIM (hwflist); i++) if (!strcmp (hwflist[i].desc, name)) { @@ -159,15 +165,7 @@ parse_hwf_deny_file (void) if (!*p || *p == '#') continue; - for (i=0; i < DIM (hwflist); i++) - { - if (!strcmp (hwflist[i].desc, p)) - { - disabled_hw_features |= hwflist[i].flag; - break; - } - } - if (i == DIM (hwflist)) + if (_gcry_disable_hw_feature (p) == GPG_ERR_INV_NAME) { #ifdef HAVE_SYSLOG syslog (LOG_USER|LOG_WARNING, diff --git a/tests/Makefile.am b/tests/Makefile.am index d462f30..f428d7d 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -19,14 +19,14 @@ ## Process this file with automake to produce Makefile.in tests_bin = \ - version mpitests t-sexp t-convert \ + version mpitests t-sexp t-convert \ t-mpi-bit t-mpi-point curves t-lock \ prime basic keygen pubkey hmac hashtest t-kdf keygrip \ fips186-dsa aeswrap pkcs1v2 random 
dsa-rfc6979 t-ed25519 t-cv25519

 tests_bin_last = benchmark bench-slope

-tests_sh =
+tests_sh = basic-disable-all-hwf

 tests_sh_last = hashtest-256g

@@ -58,7 +58,8 @@ noinst_HEADERS = t-common.h
 EXTRA_DIST = README rsa-16k.key cavs_tests.sh cavs_driver.pl \
 	pkcs1v2-oaep.h pkcs1v2-pss.h pkcs1v2-v15c.h pkcs1v2-v15s.h \
 	t-ed25519.inp stopwatch.h hashtest-256g.in \
-	sha3-224.h sha3-256.h sha3-384.h sha3-512.h
+	sha3-224.h sha3-256.h sha3-384.h sha3-512.h \
+	basic-disable-all-hwf.in

 LDADD = $(standard_ldadd) $(GPG_ERROR_LIBS)
 t_lock_LDADD = $(standard_ldadd) $(GPG_ERROR_MT_LIBS)
diff --git a/tests/basic-disable-all-hwf.in b/tests/basic-disable-all-hwf.in
new file mode 100644
index 0000000..1f0a4de
--- /dev/null
+++ b/tests/basic-disable-all-hwf.in
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+echo " now running 'basic' test with all hardware features disabled."
+exec ./basic@EXEEXT@ --disable-hwf all

From smueller at chronox.de Mon Dec 5 15:16:41 2016
From: smueller at chronox.de (Stephan Müller)
Date: Mon, 05 Dec 2016 15:16:41 +0100
Subject: [PATCH] random-drbg: use bufhelp function for big-endian store
In-Reply-To: <148094725175.3391.15004688817915407064.stgit@localhost6.localdomain6>
References: <148094725175.3391.15004688817915407064.stgit@localhost6.localdomain6>
Message-ID: <1761763.9MeJz0GvvL@tauon.atsec.com>

On Monday, 5 December 2016, 16:14:11 CET, Jussi Kivilinna wrote:

Hi Jussi,

> * random/random-drbg.c (drbg_cpu_to_be32): Use 'buf_put_be32' instead
> of 'be_bswap32'.

Instead of fixing drbg_cpu_to_be32, wouldn't it make more sense to remove
drbg_cpu_to_be32 entirely and replace it with buf_put_be32 throughout the
code?
Thanks

Ciao
Stephan

From jussi.kivilinna at iki.fi Mon Dec 5 15:37:03 2016
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 5 Dec 2016 16:37:03 +0200
Subject: [PATCH] random-drbg: use bufhelp function for big-endian store
In-Reply-To: <1761763.9MeJz0GvvL@tauon.atsec.com>
References: <148094725175.3391.15004688817915407064.stgit@localhost6.localdomain6>
 <1761763.9MeJz0GvvL@tauon.atsec.com>
Message-ID:

Hello,

On 05.12.2016 16:16, Stephan Müller wrote:
> On Monday, 5 December 2016, 16:14:11 CET, Jussi Kivilinna wrote:
>
> Hi Jussi,
>
>> * random/random-drbg.c (drbg_cpu_to_be32): Use 'buf_put_be32' instead
>> of 'be_bswap32'.
>
> Instead of fixing drbg_cpu_to_be32, wouldn't it make more sense to remove
> drbg_cpu_to_be32 entirely and replace it with buf_put_be32 throughout the
> code?

Well, it does make more sense :)

-Jussi

From jussi.kivilinna at iki.fi Mon Dec 5 15:44:58 2016
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 05 Dec 2016 16:44:58 +0200
Subject: [PATCH v2] random-drbg: use bufhelp function for big-endian store
In-Reply-To: <1761763.9MeJz0GvvL@tauon.atsec.com>
References: <1761763.9MeJz0GvvL@tauon.atsec.com>
Message-ID: <148094909862.18418.13534879577959633743.stgit@localhost6.localdomain6>

* random/random-drbg.c (drbg_cpu_to_be32): Remove.
(drbg_ctr_df, drbg_hash_df): Use 'buf_put_be32' instead of
'drbg_cpu_to_be32'.
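The reason a byte-oriented store is safer than the removed struct cast can be shown with a short sketch. The function below is a hypothetical stand-in named store_be32; it is not libgcrypt's buf_put_be32, whose actual implementation in cipher/bufhelp.h may differ, but it takes its arguments in the same (buffer, value) order used in the patch. Writing one byte at a time never performs a 32-bit access, so it works for any buffer alignment, including offsets such as &input[1].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store VAL at BUF in big-endian byte order, one byte at a time.
   No alignment requirement is placed on BUF. */
static void store_be32(unsigned char *buf, uint32_t val)
{
  assert(buf != NULL);
  buf[0] = (unsigned char)(val >> 24);
  buf[1] = (unsigned char)(val >> 16);
  buf[2] = (unsigned char)(val >> 8);
  buf[3] = (unsigned char)val;
}
```

A call such as store_be32(&input[1], outlen * 8) then behaves the same on strict-alignment CPUs as on x86, which is the concern the removed FIXME comment pointed at.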
-- Signed-off-by: Jussi Kivilinna --- 0 files changed diff --git a/random/random-drbg.c b/random/random-drbg.c index f9d11a3..4d3198e 100644 --- a/random/random-drbg.c +++ b/random/random-drbg.c @@ -155,7 +155,7 @@ #include "g10lib.h" #include "random.h" #include "rand-internal.h" -#include "../cipher/bithelp.h" +#include "../cipher/bufhelp.h" @@ -533,27 +533,6 @@ drbg_sec_strength (u32 flags) return 32; } -/* - * Convert an integer into a byte representation of this integer. - * The byte representation is big-endian - * - * @val value to be converted - * @buf buffer holding the converted integer -- caller must ensure that - * buffer size is at least 32 bit - */ -static inline void -drbg_cpu_to_be32 (u32 val, unsigned char *buf) -{ - /* FIXME: This may raise a bus error. */ - struct s - { - u32 conv; - }; - struct s *conversion = (struct s *) buf; - - conversion->conv = be_bswap32 (val); -} - static void drbg_add_buf (unsigned char *dst, size_t dstlen, unsigned char *add, size_t addlen) @@ -785,10 +764,10 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* 10.4.2 step 2 -- calculate the entire length of all input data */ for (; NULL != tempstr; tempstr = tempstr->next) inputlen += tempstr->len; - drbg_cpu_to_be32 (inputlen, &L_N[0]); + buf_put_be32 (&L_N[0], inputlen); /* 10.4.2 step 3 */ - drbg_cpu_to_be32 (bytes_to_return, &L_N[4]); + buf_put_be32 (&L_N[4], bytes_to_return); /* 10.4.2 step 5: length is size of L_N, input_string, one byte, padding */ padlen = (inputlen + sizeof (L_N) + 1) % (drbg_blocklen (drbg)); @@ -821,7 +800,7 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* 10.4.2 step 9.1 - the padding is implicit as the buffer * holds zeros after allocation -- even the increment of i * is irrelevant as the increment remains within length of i */ - drbg_cpu_to_be32 (i, iv); + buf_put_be32 (iv, i); /* 10.4.2 step 9.2 -- BCC and concatenation with temp */ ret = drbg_ctr_bcc (drbg, temp + templen, K, &S1); if (ret) @@ -1139,7 +1118,7 @@ 
drbg_hash_df (drbg_state_t drbg, /* 10.4.1 step 3 */ input[0] = 1; - drbg_cpu_to_be32 ((outlen * 8), &input[1]); + buf_put_be32 (&input[1], (outlen * 8)); /* 10.4.1 step 4.1 -- concatenation of data for input into hash */ drbg_string_fill (&data1, input, 5); From jussi.kivilinna at iki.fi Mon Dec 5 15:14:29 2016 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 05 Dec 2016 16:14:29 +0200 Subject: [PATCH 2/3] OCB: Move large L handling from bottom to upper level In-Reply-To: <148094726427.3477.8483024518816386622.stgit@localhost6.localdomain6> References: <148094726427.3477.8483024518816386622.stgit@localhost6.localdomain6> Message-ID: <148094726929.3477.223503718383888327.stgit@localhost6.localdomain6> * cipher/cipher-ocb.c (_gcry_cipher_ocb_get_l): Remove. (ocb_get_L_big): New. (_gcry_cipher_ocb_authenticate): L-big handling done in upper processing loop, so that lower level never sees the case where 'aad_nblocks % 65536 == 0'; Add missing stack burn. (ocb_aad_finalize): Add missing stack burn. (ocb_crypt): L-big handling done in upper processing loop, so that lower level never sees the case where 'data_nblocks % 65536 == 0'. * cipher/cipher-internal.h (_gcry_cipher_ocb_get_l): Remove. (ocb_get_l): Remove 'l_tmp' usage and simplify since input is more limited now, 'N is not multiple of 65536'. * cipher/rijndael-aesni.c (get_l): Remove. (aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Remove l_tmp; Use 'ocb_get_l'. * cipher/rijndael-ssse3-amd64.c (get_l): Remove. (ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_auth): Remove l_tmp; Use 'ocb_get_l'. * cipher/camellia-glue.c: Remove OCB l_tmp usage. * cipher/rijndael-armv8-ce.c: Ditto. * cipher/rijndael.c: Ditto. * cipher/serpent.c: Ditto. * cipher/twofish.c: Ditto. -- Move large L value generation to up-most level to simplify lower level ocb_get_l for greater performance and simpler implementation. This helps implementing OCB in assembly as 'ocb_get_l' no longer has function call on slow-path. 
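A minimal sketch of the split described above, under simplifying assumptions: it is not the code from this patch, all names ('l_table_index', 'blocks_until_overflow', 'split_blocks') are made up for illustration, and it only counts blocks instead of touching any cipher state. The upper level works out how many blocks fit before the next block number becomes a multiple of 65536, handles that single block separately, and hands only the remaining runs to the lower level, which can then index the L table directly.

```c
#include <assert.h>
#include <stdint.h>

#define L_TABLE_SIZE  16                          /* mirrors OCB_L_TABLE_SIZE */
#define TABLE_MAXBLKS ((uint64_t)1 << L_TABLE_SIZE)

/* Lower level: L-table index (number of trailing zero bits) for block
 * number N.  The caller guarantees that N is not a multiple of 65536,
 * so the result is always below L_TABLE_SIZE. */
static unsigned int
l_table_index (uint64_t n)
{
  unsigned int ntz = 0;

  assert (n % TABLE_MAXBLKS != 0);
  while (!(n & 1))
    {
      n >>= 1;
      ntz++;
    }
  return ntz;
}

/* Upper level: with NDONE blocks already processed, return how many more
 * blocks can go through the table path.  A result of 0 means the very
 * next block number is a multiple of 65536. */
static uint64_t
blocks_until_overflow (uint64_t ndone)
{
  uint64_t next = (ndone + 1) % TABLE_MAXBLKS;

  return (TABLE_MAXBLKS - next) % TABLE_MAXBLKS;
}

/* Walk over TODO blocks the way an upper-level loop could, counting how
 * many take the table path and how many need a generated L value. */
static void
split_blocks (uint64_t ndone, uint64_t todo,
              uint64_t *table_path, uint64_t *big_path)
{
  *table_path = 0;
  *big_path = 0;

  while (todo)
    {
      uint64_t nmax = blocks_until_overflow (ndone);
      uint64_t n;

      if (nmax == 0)
        {
          /* Table overflow: one block handled on its own. */
          (*big_path)++;
          ndone++;
          todo--;
          continue;
        }

      n = todo < nmax ? todo : nmax;
      *table_path += n;
      ndone += n;
      todo -= n;
    }
}
```

In this sketch the slow case shows up once per 65536 blocks, and the lower level needs neither a scratch buffer nor a fallback function call.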
Signed-off-by: Jussi Kivilinna --- cipher/camellia-glue.c | 18 +-- cipher/cipher-internal.h | 36 +++-- cipher/cipher-ocb.c | 271 +++++++++++++++++++++++++++++------------ cipher/rijndael-aesni.c | 96 +-------------- cipher/rijndael-armv8-ce.c | 20 +-- cipher/rijndael-ssse3-amd64.c | 96 --------------- cipher/rijndael.c | 6 - cipher/serpent.c | 24 +--- cipher/twofish.c | 20 +-- 9 files changed, 248 insertions(+), 339 deletions(-) diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index 1be35c9..7687094 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -619,7 +619,6 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, CAMELLIA_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - unsigned char l_tmp[CAMELLIA_BLOCK_SIZE]; int burn_stack_depth; u64 blkn = c->u_mode.ocb.data_nblocks; @@ -664,9 +663,8 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. */ blkn += 32; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 32); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 32); if (encrypt) _gcry_camellia_aesni_avx2_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -725,9 +723,8 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); if (encrypt) _gcry_camellia_aesni_avx_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -759,8 +756,6 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #if defined(USE_AESNI_AVX) || defined(USE_AESNI_AVX2) c->u_mode.ocb.data_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif @@ -776,7 +771,6 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AESNI_AVX) || defined(USE_AESNI_AVX2) CAMELLIA_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; - unsigned char l_tmp[CAMELLIA_BLOCK_SIZE]; int burn_stack_depth; u64 blkn = c->u_mode.ocb.aad_nblocks; @@ -818,9 +812,8 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. */ blkn += 32; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 32); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 32); _gcry_camellia_aesni_avx2_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, @@ -875,9 +868,8 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); _gcry_camellia_aesni_avx_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, @@ -905,8 +897,6 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AESNI_AVX) || defined(USE_AESNI_AVX2) c->u_mode.ocb.aad_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index 01352f3..7204d48 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -459,28 +459,28 @@ gcry_err_code_t _gcry_cipher_ocb_get_tag gcry_err_code_t _gcry_cipher_ocb_check_tag /* */ (gcry_cipher_hd_t c, const unsigned char *intag, size_t taglen); -const unsigned char *_gcry_cipher_ocb_get_l -/* */ (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 n); -/* Inline version of _gcry_cipher_ocb_get_l, with hard-coded fast paths for - most common cases. */ +/* Return the L-value for block N. Note: 'cipher_ocb.c' ensures that N + * will never be multiple of 65536 (1 << OCB_L_TABLE_SIZE), thus N can + * be directly passed to _gcry_ctz() function and resulting index will + * never overflow the table. */ static inline const unsigned char * -ocb_get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 n) +ocb_get_l (gcry_cipher_hd_t c, u64 n) { - if (n & 1) - return c->u_mode.ocb.L[0]; - else if (n & 2) - return c->u_mode.ocb.L[1]; - else - { - unsigned int ntz = _gcry_ctz64 (n); - - if (ntz < OCB_L_TABLE_SIZE) - return c->u_mode.ocb.L[ntz]; - else - return _gcry_cipher_ocb_get_l (c, l_tmp, n); - } + unsigned long ntz; + +#if ((defined(__i386__) || defined(__x86_64__)) && __GNUC__ >= 4) + /* Assumes that N != 0. 
*/ + asm ("rep;bsfl %k[low], %k[ntz]\n\t" + : [ntz] "=r" (ntz) + : [low] "r" ((unsigned long)n) + : "cc"); +#else + ntz = _gcry_ctz (n); +#endif + + return c->u_mode.ocb.L[ntz]; } #endif /*G10_CIPHER_INTERNAL_H*/ diff --git a/cipher/cipher-ocb.c b/cipher/cipher-ocb.c index d1f01d5..db42aaf 100644 --- a/cipher/cipher-ocb.c +++ b/cipher/cipher-ocb.c @@ -109,25 +109,17 @@ bit_copy (unsigned char *d, const unsigned char *s, } -/* Return the L-value for block N. In most cases we use the table; - only if the lower OCB_L_TABLE_SIZE bits of N are zero we need to - compute it. With a table size of 16 we need to this this only - every 65536-th block. L_TMP is a helper buffer of size - OCB_BLOCK_LEN which is used to hold the computation if not taken - from the table. */ -const unsigned char * -_gcry_cipher_ocb_get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 n) +/* Get L_big value for block N, where N is multiple of 65536. */ +static void +ocb_get_L_big (gcry_cipher_hd_t c, u64 n, unsigned char *l_buf) { int ntz = _gcry_ctz64 (n); - if (ntz < OCB_L_TABLE_SIZE) - return c->u_mode.ocb.L[ntz]; + gcry_assert(ntz >= OCB_L_TABLE_SIZE); - double_block_cpy (l_tmp, c->u_mode.ocb.L[OCB_L_TABLE_SIZE - 1]); + double_block_cpy (l_buf, c->u_mode.ocb.L[OCB_L_TABLE_SIZE - 1]); for (ntz -= OCB_L_TABLE_SIZE; ntz; ntz--) - double_block (l_tmp); - - return l_tmp; + double_block (l_buf); } @@ -241,7 +233,11 @@ gcry_err_code_t _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, size_t abuflen) { + const size_t table_maxblks = 1 << OCB_L_TABLE_SIZE; + const u32 table_size_mask = ((1 << OCB_L_TABLE_SIZE) - 1); unsigned char l_tmp[OCB_BLOCK_LEN]; + unsigned int burn = 0; + unsigned int nburn; /* Check that a nonce and thus a key has been set and that we have not yet computed the tag. 
We also return an error if the aad has @@ -264,14 +260,24 @@ _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, { c->u_mode.ocb.aad_nblocks++; + if ((c->u_mode.ocb.aad_nblocks % table_maxblks) == 0) + { + /* Table overflow, L needs to be generated. */ + ocb_get_L_big(c, c->u_mode.ocb.aad_nblocks + 1, l_tmp); + } + else + { + buf_cpy (l_tmp, ocb_get_l (c, c->u_mode.ocb.aad_nblocks), + OCB_BLOCK_LEN); + } + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ - buf_xor_1 (c->u_mode.ocb.aad_offset, - ocb_get_l (c, l_tmp, c->u_mode.ocb.aad_nblocks), - OCB_BLOCK_LEN); + buf_xor_1 (c->u_mode.ocb.aad_offset, l_tmp, OCB_BLOCK_LEN); /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ buf_xor (l_tmp, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_leftover, OCB_BLOCK_LEN); - c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); c->u_mode.ocb.aad_nleftover = 0; @@ -279,40 +285,83 @@ _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, } if (!abuflen) - return 0; - - /* Use a bulk method if available. */ - if (abuflen >= OCB_BLOCK_LEN && c->bulk.ocb_auth) { - size_t nblks; - size_t nleft; - size_t ndone; + if (burn > 0) + _gcry_burn_stack (burn + 4*sizeof(void*)); - nblks = abuflen / OCB_BLOCK_LEN; - nleft = c->bulk.ocb_auth (c, abuf, nblks); - ndone = nblks - nleft; - - abuf += ndone * OCB_BLOCK_LEN; - abuflen -= ndone * OCB_BLOCK_LEN; - nblks = nleft; + return 0; } - /* Hash all full blocks. */ + /* Full blocks handling. 
*/ while (abuflen >= OCB_BLOCK_LEN) { - c->u_mode.ocb.aad_nblocks++; + size_t nblks = abuflen / OCB_BLOCK_LEN; + size_t nmaxblks; - /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ - buf_xor_1 (c->u_mode.ocb.aad_offset, - ocb_get_l (c, l_tmp, c->u_mode.ocb.aad_nblocks), - OCB_BLOCK_LEN); - /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ - buf_xor (l_tmp, c->u_mode.ocb.aad_offset, abuf, OCB_BLOCK_LEN); - c->spec->encrypt (&c->context.c, l_tmp, l_tmp); - buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); + /* Check how many blocks to process till table overflow. */ + nmaxblks = (c->u_mode.ocb.aad_nblocks + 1) % table_maxblks; + nmaxblks = (table_maxblks - nmaxblks) % table_maxblks; + + if (nmaxblks == 0) + { + /* Table overflow, generate L and process one block. */ + c->u_mode.ocb.aad_nblocks++; + ocb_get_L_big(c, c->u_mode.ocb.aad_nblocks, l_tmp); + + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ + buf_xor_1 (c->u_mode.ocb.aad_offset, l_tmp, OCB_BLOCK_LEN); + /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ + buf_xor (l_tmp, c->u_mode.ocb.aad_offset, abuf, OCB_BLOCK_LEN); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; + buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); + + abuf += OCB_BLOCK_LEN; + abuflen -= OCB_BLOCK_LEN; + nblks--; + + /* With overflow handled, retry loop again. Next overflow will + * happen after 65535 blocks. */ + continue; + } + + nblks = nblks < nmaxblks ? nblks : nmaxblks; + + /* Use a bulk method if available. */ + if (nblks && c->bulk.ocb_auth) + { + size_t nleft; + size_t ndone; + + nleft = c->bulk.ocb_auth (c, abuf, nblks); + ndone = nblks - nleft; + + abuf += ndone * OCB_BLOCK_LEN; + abuflen -= ndone * OCB_BLOCK_LEN; + nblks = nleft; + } + + /* Hash all full blocks. 
*/ + while (nblks) + { + c->u_mode.ocb.aad_nblocks++; + + gcry_assert(c->u_mode.ocb.aad_nblocks & table_size_mask); + + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ + buf_xor_1 (c->u_mode.ocb.aad_offset, + ocb_get_l (c, c->u_mode.ocb.aad_nblocks), + OCB_BLOCK_LEN); + /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ + buf_xor (l_tmp, c->u_mode.ocb.aad_offset, abuf, OCB_BLOCK_LEN); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; + buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); - abuf += OCB_BLOCK_LEN; - abuflen -= OCB_BLOCK_LEN; + abuf += OCB_BLOCK_LEN; + abuflen -= OCB_BLOCK_LEN; + nblks--; + } } /* Store away the remaining data. */ @@ -321,6 +370,9 @@ _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, c->u_mode.ocb.aad_leftover[c->u_mode.ocb.aad_nleftover++] = *abuf; gcry_assert (!abuflen); + if (burn > 0) + _gcry_burn_stack (burn + 4*sizeof(void*)); + return 0; } @@ -330,6 +382,8 @@ static void ocb_aad_finalize (gcry_cipher_hd_t c) { unsigned char l_tmp[OCB_BLOCK_LEN]; + unsigned int burn = 0; + unsigned int nburn; /* Check that a nonce and thus a key has been set and that we have not yet computed the tag. We also skip this if the aad has been @@ -352,7 +406,8 @@ ocb_aad_finalize (gcry_cipher_hd_t c) l_tmp[c->u_mode.ocb.aad_nleftover] = 0x80; buf_xor_1 (l_tmp, c->u_mode.ocb.aad_offset, OCB_BLOCK_LEN); /* Sum = Sum_m xor ENCIPHER(K, CipherInput) */ - c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); c->u_mode.ocb.aad_nleftover = 0; @@ -361,6 +416,9 @@ ocb_aad_finalize (gcry_cipher_hd_t c) /* Mark AAD as finalized so that gcry_cipher_ocb_authenticate can * return an erro when called again. 
*/ c->u_mode.ocb.aad_finalized = 1; + + if (burn > 0) + _gcry_burn_stack (burn + 4*sizeof(void*)); } @@ -387,10 +445,13 @@ ocb_crypt (gcry_cipher_hd_t c, int encrypt, unsigned char *outbuf, size_t outbuflen, const unsigned char *inbuf, size_t inbuflen) { + const size_t table_maxblks = 1 << OCB_L_TABLE_SIZE; + const u32 table_size_mask = ((1 << OCB_L_TABLE_SIZE) - 1); unsigned char l_tmp[OCB_BLOCK_LEN]; unsigned int burn = 0; unsigned int nburn; - size_t nblks = inbuflen / OCB_BLOCK_LEN; + gcry_cipher_encrypt_t crypt_fn = + encrypt ? c->spec->encrypt : c->spec->decrypt; /* Check that a nonce and thus a key has been set and that we are not yet in end of data state. */ @@ -407,58 +468,112 @@ ocb_crypt (gcry_cipher_hd_t c, int encrypt, else if ((inbuflen % OCB_BLOCK_LEN)) return GPG_ERR_INV_LENGTH; /* We support only full blocks for now. */ - /* Use a bulk method if available. */ - if (nblks && c->bulk.ocb_crypt) - { - size_t nleft; - size_t ndone; - - nleft = c->bulk.ocb_crypt (c, outbuf, inbuf, nblks, encrypt); - ndone = nblks - nleft; - - inbuf += ndone * OCB_BLOCK_LEN; - outbuf += ndone * OCB_BLOCK_LEN; - inbuflen -= ndone * OCB_BLOCK_LEN; - outbuflen -= ndone * OCB_BLOCK_LEN; - nblks = nleft; - } - - if (nblks) + /* Full blocks handling. */ + while (inbuflen >= OCB_BLOCK_LEN) { - gcry_cipher_encrypt_t crypt_fn = - encrypt ? c->spec->encrypt : c->spec->decrypt; + size_t nblks = inbuflen / OCB_BLOCK_LEN; + size_t nmaxblks; - if (encrypt) - { - /* Checksum_i = Checksum_{i-1} xor P_i */ - ocb_checksum (c->u_ctr.ctr, inbuf, nblks); - } + /* Check how many blocks to process till table overflow. */ + nmaxblks = (c->u_mode.ocb.data_nblocks + 1) % table_maxblks; + nmaxblks = (table_maxblks - nmaxblks) % table_maxblks; - /* Encrypt all full blocks. */ - while (inbuflen >= OCB_BLOCK_LEN) + if (nmaxblks == 0) { + /* Table overflow, generate L and process one block. 
*/ c->u_mode.ocb.data_nblocks++; + ocb_get_L_big(c, c->u_mode.ocb.data_nblocks, l_tmp); + + if (encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, inbuf, 1); + } /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ - buf_xor_1 (c->u_iv.iv, - ocb_get_l (c, l_tmp, c->u_mode.ocb.data_nblocks), - OCB_BLOCK_LEN); + buf_xor_1 (c->u_iv.iv, l_tmp, OCB_BLOCK_LEN); /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ buf_xor (outbuf, c->u_iv.iv, inbuf, OCB_BLOCK_LEN); nburn = crypt_fn (&c->context.c, outbuf, outbuf); burn = nburn > burn ? nburn : burn; buf_xor_1 (outbuf, c->u_iv.iv, OCB_BLOCK_LEN); + if (!encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, outbuf, 1); + } + inbuf += OCB_BLOCK_LEN; inbuflen -= OCB_BLOCK_LEN; outbuf += OCB_BLOCK_LEN; outbuflen =- OCB_BLOCK_LEN; + nblks--; + + /* With overflow handled, retry loop again. Next overflow will + * happen after 65535 blocks. */ + continue; + } + + nblks = nblks < nmaxblks ? nblks : nmaxblks; + + /* Use a bulk method if available. */ + if (nblks && c->bulk.ocb_crypt) + { + size_t nleft; + size_t ndone; + + nleft = c->bulk.ocb_crypt (c, outbuf, inbuf, nblks, encrypt); + ndone = nblks - nleft; + + inbuf += ndone * OCB_BLOCK_LEN; + outbuf += ndone * OCB_BLOCK_LEN; + inbuflen -= ndone * OCB_BLOCK_LEN; + outbuflen -= ndone * OCB_BLOCK_LEN; + nblks = nleft; } - if (!encrypt) + if (nblks) { - /* Checksum_i = Checksum_{i-1} xor P_i */ - ocb_checksum (c->u_ctr.ctr, outbuf - nblks * OCB_BLOCK_LEN, nblks); + size_t nblks_chksum = nblks; + + if (encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, inbuf, nblks_chksum); + } + + /* Encrypt all full blocks. 
*/ + while (nblks) + { + c->u_mode.ocb.data_nblocks++; + + gcry_assert(c->u_mode.ocb.data_nblocks & table_size_mask); + + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ + buf_xor_1 (c->u_iv.iv, + ocb_get_l (c, c->u_mode.ocb.data_nblocks), + OCB_BLOCK_LEN); + /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ + buf_xor (outbuf, c->u_iv.iv, inbuf, OCB_BLOCK_LEN); + nburn = crypt_fn (&c->context.c, outbuf, outbuf); + burn = nburn > burn ? nburn : burn; + buf_xor_1 (outbuf, c->u_iv.iv, OCB_BLOCK_LEN); + + inbuf += OCB_BLOCK_LEN; + inbuflen -= OCB_BLOCK_LEN; + outbuf += OCB_BLOCK_LEN; + outbuflen =- OCB_BLOCK_LEN; + nblks--; + } + + if (!encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, + outbuf - nblks_chksum * OCB_BLOCK_LEN, + nblks_chksum); + } } } diff --git a/cipher/rijndael-aesni.c b/cipher/rijndael-aesni.c index 8b28b3a..7852e19 100644 --- a/cipher/rijndael-aesni.c +++ b/cipher/rijndael-aesni.c @@ -1331,74 +1331,10 @@ _gcry_aes_aesni_cbc_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, } -static inline const unsigned char * -get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 i, unsigned char *iv, - unsigned char *ctr) -{ - const unsigned char *l; - unsigned int ntz; - - if (i & 0xffffffffU) - { - asm ("rep;bsf %k[low], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [low] "r" (i & 0xffffffffU) - : "cc"); - } - else - { - if (OCB_L_TABLE_SIZE < 32) - { - ntz = 32; - } - else if (i) - { - asm ("rep;bsf %k[high], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [high] "r" (i >> 32) - : "cc"); - ntz += 32; - } - else - { - ntz = 64; - } - } - - if (ntz < OCB_L_TABLE_SIZE) - { - l = c->u_mode.ocb.L[ntz]; - } - else - { - /* Store Offset & Checksum before calling external function */ - asm volatile ("movdqu %%xmm5, %[iv]\n\t" - "movdqu %%xmm6, %[ctr]\n\t" - : [iv] "=m" (*iv), - [ctr] "=m" (*ctr) - : - : "memory" ); - - l = _gcry_cipher_ocb_get_l (c, l_tmp, i); - - /* Restore Offset & Checksum */ - asm volatile ("movdqu %[iv], %%xmm5\n\t" - "movdqu 
%[ctr], %%xmm6\n\t" - : /* No output */ - : [iv] "m" (*iv), - [ctr] "m" (*ctr) - : "memory" ); - } - - return l; -} - - static void aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -1420,7 +1356,7 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks && n % 4; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -1449,9 +1385,8 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks > 3 ; nblocks -= 4 ) { - /* l_tmp will be used only every 65536-th block. */ n += 4; - l = get_l(c, l_tmp.x1, n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -1522,7 +1457,7 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -1559,8 +1494,6 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, aesni_cleanup (); aesni_cleanup_2_6 (); - - wipememory(&l_tmp, sizeof(l_tmp)); } @@ -1568,7 +1501,6 @@ static void aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -1589,7 +1521,7 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks && n % 4; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i 
= Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -1618,9 +1550,8 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks > 3 ; nblocks -= 4 ) { - /* l_tmp will be used only every 65536-th block. */ n += 4; - l = get_l(c, l_tmp.x1, n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -1691,7 +1622,7 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -1728,8 +1659,6 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, aesni_cleanup (); aesni_cleanup_2_6 (); - - wipememory(&l_tmp, sizeof(l_tmp)); } @@ -1748,7 +1677,6 @@ void _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; u64 n = c->u_mode.ocb.aad_nblocks; @@ -1768,8 +1696,7 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, for ( ;nblocks && n % 4; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -1794,10 +1721,8 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, for ( ;nblocks > 3 ; nblocks -= 4 ) { - /* l_tmp will be used only every 65536-th block. 
*/ n += 4; - l = get_l(c, l_tmp.x1, n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum); + l = ocb_get_l(c, n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -1849,8 +1774,7 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, for ( ;nblocks; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -1883,8 +1807,6 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, aesni_cleanup (); aesni_cleanup_2_6 (); - - wipememory(&l_tmp, sizeof(l_tmp)); } diff --git a/cipher/rijndael-armv8-ce.c b/cipher/rijndael-armv8-ce.c index bed4066..1bf74da 100644 --- a/cipher/rijndael-armv8-ce.c +++ b/cipher/rijndael-armv8-ce.c @@ -336,7 +336,6 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, u64 blkn = c->u_mode.ocb.data_nblocks; u64 blkn_offs = blkn - blkn % 32; unsigned int n = 32 - blkn % 32; - unsigned char l_tmp[16]; void *Ls[32]; void **l; size_t i; @@ -364,9 +363,8 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn_offs += 32; - *l = (void *)ocb_get_l(c, l_tmp, blkn_offs); + *l = (void *)ocb_get_l(c, blkn_offs); crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, 32, nrounds); @@ -378,13 +376,13 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, if (nblocks && l < &Ls[nblocks]) { - *l = (void *)ocb_get_l(c, l_tmp, 32 + blkn_offs); + *l = (void *)ocb_get_l(c, 32 + blkn_offs); } } else { for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, l_tmp, ++blkn); + Ls[i] = (void *)ocb_get_l(c, ++blkn); } if (nblocks) @@ -392,8 +390,6 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, nblocks, nrounds); } - - wipememory(&l_tmp, sizeof(l_tmp)); } void @@ -407,7 +403,6 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, u64 blkn = c->u_mode.ocb.aad_nblocks; u64 blkn_offs = blkn - blkn % 32; unsigned int n = 32 - blkn % 32; - unsigned char l_tmp[16]; void *Ls[32]; void **l; size_t i; @@ -435,9 +430,8 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn_offs += 32; - *l = (void *)ocb_get_l(c, l_tmp, blkn_offs); + *l = (void *)ocb_get_l(c, blkn_offs); _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls, 32, nrounds); @@ -448,13 +442,13 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, if (nblocks && l < &Ls[nblocks]) { - *l = (void *)ocb_get_l(c, l_tmp, 32 + blkn_offs); + *l = (void *)ocb_get_l(c, 32 + blkn_offs); } } else { for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, l_tmp, ++blkn); + Ls[i] = (void *)ocb_get_l(c, ++blkn); } if (nblocks) @@ -462,8 +456,6 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls, nblocks, nrounds); } - - wipememory(&l_tmp, sizeof(l_tmp)); } #endif /* USE_ARM_CE */ diff --git a/cipher/rijndael-ssse3-amd64.c b/cipher/rijndael-ssse3-amd64.c index 937d868..a8e89d4 100644 --- a/cipher/rijndael-ssse3-amd64.c +++ b/cipher/rijndael-ssse3-amd64.c @@ -527,92 +527,10 @@ _gcry_aes_ssse3_cbc_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, } -static inline const unsigned char * -get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 i, unsigned char *iv, - unsigned char *ctr, const void **aes_const_ptr, - byte ssse3_state[SSSE3_STATE_SIZE], int encrypt) -{ - const unsigned char *l; - unsigned int ntz; - - if (i & 1) - return c->u_mode.ocb.L[0]; - else if (i & 2) - return c->u_mode.ocb.L[1]; - else if (i & 0xffffffffU) - { - asm ("rep;bsf %k[low], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [low] "r" (i & 0xffffffffU) - : "cc"); - } - else - { - if (OCB_L_TABLE_SIZE < 32) - { - ntz = 32; - } - else if (i) - { - asm ("rep;bsf %k[high], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [high] "r" (i >> 32) - : "cc"); - ntz += 32; - } - else - { - ntz = 64; - } - } - - if (ntz < OCB_L_TABLE_SIZE) - { - l = c->u_mode.ocb.L[ntz]; - } - else - { - /* Store Offset & Checksum before calling external function */ - asm volatile 
("movdqu %%xmm7, %[iv]\n\t" - "movdqu %%xmm6, %[ctr]\n\t" - : [iv] "=m" (*iv), - [ctr] "=m" (*ctr) - : - : "memory" ); - - /* Restore SSSE3 state. */ - vpaes_ssse3_cleanup(); - - l = _gcry_cipher_ocb_get_l (c, l_tmp, i); - - /* Save SSSE3 state. */ - if (encrypt) - { - vpaes_ssse3_prepare_enc (*aes_const_ptr); - } - else - { - vpaes_ssse3_prepare_dec (*aes_const_ptr); - } - - /* Restore Offset & Checksum */ - asm volatile ("movdqu %[iv], %%xmm7\n\t" - "movdqu %[ctr], %%xmm6\n\t" - : /* No output */ - : [iv] "m" (*iv), - [ctr] "m" (*ctr) - : "memory" ); - } - - return l; -} - - static void ssse3_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -635,8 +553,7 @@ ssse3_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, { const unsigned char *l; - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr, &aes_const_ptr, - ssse3_state, 1); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -671,7 +588,6 @@ ssse3_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, : : "memory" ); - wipememory(&l_tmp, sizeof(l_tmp)); vpaes_ssse3_cleanup (); } @@ -679,7 +595,6 @@ static void ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -702,8 +617,7 @@ ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, { const unsigned char *l; - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr, &aes_const_ptr, - ssse3_state, 0); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -738,7 +652,6 @@ 
ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, : : "memory" ); - wipememory(&l_tmp, sizeof(l_tmp)); vpaes_ssse3_cleanup (); } @@ -758,7 +671,6 @@ void _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; u64 n = c->u_mode.ocb.aad_nblocks; @@ -780,8 +692,7 @@ _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, { const unsigned char *l; - l = get_l(c, l_tmp.x1, ++n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum, &aes_const_ptr, ssse3_state, 1); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -812,7 +723,6 @@ _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, : : "memory" ); - wipememory(&l_tmp, sizeof(l_tmp)); vpaes_ssse3_cleanup (); } diff --git a/cipher/rijndael.c b/cipher/rijndael.c index cc6a722..66ea0f3 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -1353,7 +1353,7 @@ _gcry_aes_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.data_nblocks; - const unsigned char *l = ocb_get_l(c, l_tmp.x1, i); + const unsigned char *l = ocb_get_l(c, i); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ buf_xor_1 (c->u_iv.iv, l, BLOCKSIZE); @@ -1378,7 +1378,7 @@ _gcry_aes_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.data_nblocks; - const unsigned char *l = ocb_get_l(c, l_tmp.x1, i); + const unsigned char *l = ocb_get_l(c, i); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ buf_xor_1 (c->u_iv.iv, l, BLOCKSIZE); @@ -1445,7 +1445,7 @@ _gcry_aes_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.aad_nblocks; - const unsigned char *l = ocb_get_l(c, l_tmp.x1, i); + const unsigned char *l = 
ocb_get_l(c, i); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ buf_xor_1 (c->u_mode.ocb.aad_offset, l, BLOCKSIZE); diff --git a/cipher/serpent.c b/cipher/serpent.c index ef19d3b..ea4b8ed 100644 --- a/cipher/serpent.c +++ b/cipher/serpent.c @@ -1235,7 +1235,6 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, serpent_context_t *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - unsigned char l_tmp[sizeof(serpent_block_t)]; int burn_stack_depth = 2 * sizeof (serpent_block_t); u64 blkn = c->u_mode.ocb.data_nblocks; #else @@ -1275,9 +1274,8 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. */ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); if (encrypt) _gcry_serpent_avx2_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -1327,9 +1325,8 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. */ blkn += 8; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 8); if (encrypt) _gcry_serpent_sse2_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -1378,9 +1375,8 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 8; - *l = ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = ocb_get_l(c, blkn - blkn % 8); if (encrypt) _gcry_serpent_neon_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -1410,8 +1406,6 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #if defined(USE_AVX2) || defined(USE_SSE2) || defined(USE_NEON) c->u_mode.ocb.data_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif @@ -1427,7 +1421,6 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AVX2) || defined(USE_SSE2) || defined(USE_NEON) serpent_context_t *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; - unsigned char l_tmp[sizeof(serpent_block_t)]; int burn_stack_depth = 2 * sizeof(serpent_block_t); u64 blkn = c->u_mode.ocb.aad_nblocks; #else @@ -1465,9 +1458,8 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. */ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); _gcry_serpent_avx2_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls); @@ -1512,9 +1504,8 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. */ blkn += 8; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 8); _gcry_serpent_sse2_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls); @@ -1558,9 +1549,8 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 8; - *l = ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = ocb_get_l(c, blkn - blkn % 8); _gcry_serpent_neon_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls); @@ -1585,8 +1575,6 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AVX2) || defined(USE_SSE2) || defined(USE_NEON) c->u_mode.ocb.aad_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif diff --git a/cipher/twofish.c b/cipher/twofish.c index 7a4d26a..55f6fb9 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -1261,7 +1261,6 @@ _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, TWOFISH_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - unsigned char l_tmp[TWOFISH_BLOCKSIZE]; unsigned int burn, burn_stack_depth = 0; u64 blkn = c->u_mode.ocb.data_nblocks; @@ -1273,10 +1272,9 @@ _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 3 block chunks. */ while (nblocks >= 3) { - /* l_tmp will be used only every 65536-th block. 
*/ - Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 1); - Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 2); - Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 3); + Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 1); + Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 2); + Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 3); blkn += 3; if (encrypt) @@ -1300,8 +1298,6 @@ _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, c->u_mode.ocb.data_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #else @@ -1322,7 +1318,6 @@ _gcry_twofish_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #ifdef USE_AMD64_ASM TWOFISH_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; - unsigned char l_tmp[TWOFISH_BLOCKSIZE]; unsigned int burn, burn_stack_depth = 0; u64 blkn = c->u_mode.ocb.aad_nblocks; @@ -1334,10 +1329,9 @@ _gcry_twofish_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 3 block chunks. */ while (nblocks >= 3) { - /* l_tmp will be used only every 65536-th block. 
*/ - Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 1); - Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 2); - Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 3); + Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 1); + Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 2); + Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 3); blkn += 3; twofish_amd64_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, @@ -1356,8 +1350,6 @@ _gcry_twofish_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, c->u_mode.ocb.aad_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #else From smueller at chronox.de Mon Dec 5 21:28:07 2016 From: smueller at chronox.de (Stephan Mueller) Date: Mon, 05 Dec 2016 21:28:07 +0100 Subject: [PATCH v2] random-drbg: use bufhelp function for big-endian store In-Reply-To: <148094909862.18418.13534879577959633743.stgit@localhost6.localdomain6> References: <1761763.9MeJz0GvvL@tauon.atsec.com> <148094909862.18418.13534879577959633743.stgit@localhost6.localdomain6> Message-ID: <1662317.6SlRtxQeMN@positron.chronox.de> Am Montag, 5. Dezember 2016, 16:44:58 CET schrieb Jussi Kivilinna: Hi Jussi, > * random/random-drbg.c (drbg_cpu_to_be32): Remove. > (drbg_ctr_df, drbg_hash_df): Use 'buf_put_be32' instead of > 'drbg_cpu_to_be32'. > -- > > Signed-off-by: Jussi Kivilinna Tested-by: Stephan Mueller Ciao Stephan From el11151 at mail.ntua.gr Tue Dec 6 20:40:12 2016 From: el11151 at mail.ntua.gr (Kostis Andrikopoulos) Date: Tue, 6 Dec 2016 21:40:12 +0200 Subject: mpi_swap_cond: different sizes error on eddsa key generation In-Reply-To: References: <06c25709-dfa9-a965-bd6f-50da51cd2d59@fsij.org> <23536e43-7bf9-d801-2e26-26b8dd9c49ee@mail.ntua.gr> Message-ID: <7ac506f8-b546-4c62-a830-52bcda73b51e@mail.ntua.gr> On 11/26/2016 01:11 AM, NIIBE Yutaka wrote: > Or..., could you give me information on: > Is the call sequence following? > > ... 
> -> ecc_generate > -> _gcry_ecc_eddsa_genkey > -> _gcry_mpi_ec_mul_point > -> point_swap_cond > -> mpi_swap_cond > -> log_bug > Hello again, we run the application in gdb and got the following stack trace: https://foss.ntua.gr/paste/?09531d0b88a9cd2c#0JMzhgYKcSPzQUtCKmGpUEs1ahW2NtJmye/zb+T7dKo= The call sequence seems to be the one you suggested. From cvs at cvs.gnupg.org Tue Dec 6 21:24:48 2016 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Tue, 06 Dec 2016 21:24:48 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-21-g603f479 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 603f479a919311f720a05da738150c2192d5e562 (commit) from a0580d446fef648a177ca4ab060d0e449780db84 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 603f479a919311f720a05da738150c2192d5e562 Author: Werner Koch Date: Tue Dec 6 21:20:54 2016 +0100 Reorganize code in secmem.c. * src/secmem.c (pooldesc_t): New type to collect information about one pool. (pool_size): Remove. Now a member of pooldesc_t. (pool_okay): Ditto. (pool_is_mmapped): Ditto. (pool): Rename variable ... (mainpool): And change type to pooldesc_t. (ptr_into_pool_p): Add arg 'pool'. (mb_get_next): Ditto. (mb_get_prev): Ditto. (mb_merge): Ditto. (mb_get_new): Ditto. (init_pool): Ditto. (lock_pool): Rename to ... (look_pool_pages: this. (secmem_init): Rename to ... (_gcry_secmem_init_internal): this. Add local var POOL and init with address of MAINPOOL. (_gcry_secmem_malloc_internal): Add local var POOL and init with address of MAINPOOL. (_gcry_private_is_secure): Ditto. 
(_gcry_secmem_term): Ditto. (_gcry_secmem_dump_stats): Ditto. (_gcry_secmem_free_internal): Ditto. Remove check for NULL arg. (_gcry_secmem_free): Add check for NULL arg before taking the lock. (_gcry_secmem_realloc): Factor most code out to ... (_gcry_secmem_realloc_internal): this. -- This change prepares future work to allow the use of several pools. Signed-off-by: Werner Koch diff --git a/src/secmem.c b/src/secmem.c index c4e8414..1f92f17 100644 --- a/src/secmem.c +++ b/src/secmem.c @@ -1,7 +1,7 @@ /* secmem.c - memory allocation from a secure heap * Copyright (C) 1998, 1999, 2000, 2001, 2002, * 2003, 2007 Free Software Foundation, Inc. - * Copyright (C) 2013 g10 Code GmbH + * Copyright (C) 2013, 2016 g10 Code GmbH * * This file is part of Libgcrypt. * @@ -59,20 +59,30 @@ typedef struct memblock /* This flag specifies that the memory block is in use. */ #define MB_FLAG_ACTIVE (1 << 0) -/* The pool of secure memory. */ -static void *pool; +/* An object describing a memory pool. */ +typedef struct pooldesc_s +{ + /* A memory buffer used as allocation pool. */ + void *mem; + + /* The allocated size of MEM. */ + size_t size; + + /* Flag indicating that this memory pool is ready for use. May be + * checked in an atexit function. */ + volatile int okay; -/* Size of POOL in bytes. */ -static size_t pool_size; + /* Flag indicating whether MEM is mmapped. */ + volatile int is_mmapped; -/* True, if the memory pool is ready for use. May be checked in an - atexit function. */ -static volatile int pool_okay; +} pooldesc_t; -/* True, if the memory pool is mmapped. */ -static volatile int pool_is_mmapped; -/* FIXME? */ +/* The pool of secure memory. */ +static pooldesc_t mainpool; + + +/* A couple of flags whith some beeing set early. */ static int disable_secmem; static int show_warning; static int not_locked; @@ -84,7 +94,7 @@ static int no_priv_drop; /* Stats. */ static unsigned int cur_alloced, cur_blocks; -/* Lock protecting accesses to the memory pool. 
*/ +/* Lock protecting accesses to the memory pools. */ GPGRT_LOCK_DEFINE (secmem_lock); /* Convenient macros. */ @@ -100,18 +110,18 @@ GPGRT_LOCK_DEFINE (secmem_lock); #define ADDR_TO_BLOCK(addr) \ (memblock_t *) (void *) ((char *) addr - BLOCK_HEAD_SIZE) -/* Check whether P points into the pool. */ +/* Check whether P points into POOL. */ static int -ptr_into_pool_p (const void *p) +ptr_into_pool_p (pooldesc_t *pool, const void *p) { /* We need to convert pointers to addresses. This is required by C-99 6.5.8 to avoid undefined behaviour. See also http://lists.gnupg.org/pipermail/gcrypt-devel/2007-February/001102.html */ uintptr_t p_addr = (uintptr_t)p; - uintptr_t pool_addr = (uintptr_t)pool; + uintptr_t pool_addr = (uintptr_t)pool->mem; - return p_addr >= pool_addr && p_addr < pool_addr + pool_size; + return p_addr >= pool_addr && p_addr < pool_addr + pool->size; } /* Update the stats. */ @@ -132,13 +142,13 @@ stats_update (size_t add, size_t sub) /* Return the block following MB or NULL, if MB is the last block. */ static memblock_t * -mb_get_next (memblock_t *mb) +mb_get_next (pooldesc_t *pool, memblock_t *mb) { memblock_t *mb_next; mb_next = (memblock_t *) (void *) ((char *) mb + BLOCK_HEAD_SIZE + mb->size); - if (! ptr_into_pool_p (mb_next)) + if (! ptr_into_pool_p (pool, mb_next)) mb_next = NULL; return mb_next; @@ -147,18 +157,18 @@ mb_get_next (memblock_t *mb) /* Return the block preceding MB or NULL, if MB is the first block. 
*/ static memblock_t * -mb_get_prev (memblock_t *mb) +mb_get_prev (pooldesc_t *pool, memblock_t *mb) { memblock_t *mb_prev, *mb_next; - if (mb == pool) + if (mb == pool->mem) mb_prev = NULL; else { - mb_prev = (memblock_t *) pool; + mb_prev = (memblock_t *) pool->mem; while (1) { - mb_next = mb_get_next (mb_prev); + mb_next = mb_get_next (pool, mb_prev); if (mb_next == mb) break; else @@ -172,12 +182,12 @@ mb_get_prev (memblock_t *mb) /* If the preceding block of MB and/or the following block of MB exist and are not active, merge them to form a bigger block. */ static void -mb_merge (memblock_t *mb) +mb_merge (pooldesc_t *pool, memblock_t *mb) { memblock_t *mb_prev, *mb_next; - mb_prev = mb_get_prev (mb); - mb_next = mb_get_next (mb); + mb_prev = mb_get_prev (pool, mb); + mb_next = mb_get_next (pool, mb); if (mb_prev && (! (mb_prev->flags & MB_FLAG_ACTIVE))) { @@ -190,11 +200,11 @@ mb_merge (memblock_t *mb) /* Return a new block, which can hold SIZE bytes. */ static memblock_t * -mb_get_new (memblock_t *block, size_t size) +mb_get_new (pooldesc_t *pool, memblock_t *block, size_t size) { memblock_t *mb, *mb_split; - for (mb = block; ptr_into_pool_p (mb); mb = mb_get_next (mb)) + for (mb = block; ptr_into_pool_p (pool, mb); mb = mb_get_next (pool, mb)) if (! (mb->flags & MB_FLAG_ACTIVE) && mb->size >= size) { /* Found a free block. */ @@ -211,14 +221,14 @@ mb_get_new (memblock_t *block, size_t size) mb->size = size; - mb_merge (mb_split); + mb_merge (pool, mb_split); } break; } - if (! ptr_into_pool_p (mb)) + if (! ptr_into_pool_p (pool, mb)) { gpg_err_set_errno (ENOMEM); mb = NULL; @@ -235,9 +245,11 @@ print_warn (void) log_info (_("Warning: using insecure memory!\n")); } -/* Lock the memory pages into core and drop privileges. */ + +/* Lock the memory pages of pool P of size N into core and drop + * privileges. 
*/ static void -lock_pool (void *p, size_t n) +lock_pool_pages (void *p, size_t n) { #if defined(USE_CAPABILITIES) && defined(HAVE_MLOCK) int err; @@ -367,11 +379,11 @@ lock_pool (void *p, size_t n) /* Initialize POOL. */ static void -init_pool (size_t n) +init_pool (pooldesc_t *pool, size_t n) { memblock_t *mb; - pool_size = n; + pool->size = n; if (disable_secmem) log_bug ("secure memory is disabled"); @@ -391,10 +403,10 @@ init_pool (size_t n) # endif pgsize = (pgsize_val != -1 && pgsize_val > 0)? pgsize_val:DEFAULT_PAGE_SIZE; - pool_size = (pool_size + pgsize - 1) & ~(pgsize - 1); + pool->size = (pool->size + pgsize - 1) & ~(pgsize - 1); # ifdef MAP_ANONYMOUS - pool = mmap (0, pool_size, PROT_READ | PROT_WRITE, - MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + pool->mem = mmap (0, pool->size, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); # else /* map /dev/zero instead */ { int fd; @@ -403,40 +415,40 @@ init_pool (size_t n) if (fd == -1) { log_error ("can't open /dev/zero: %s\n", strerror (errno)); - pool = (void *) -1; + pool->mem = (void *) -1; } else { - pool = mmap (0, pool_size, - (PROT_READ | PROT_WRITE), MAP_PRIVATE, fd, 0); + pool->mem = mmap (0, pool->size, + (PROT_READ | PROT_WRITE), MAP_PRIVATE, fd, 0); close (fd); } } # endif - if (pool == (void *) -1) + if (pool->mem == (void *) -1) log_info ("can't mmap pool of %u bytes: %s - using malloc\n", - (unsigned) pool_size, strerror (errno)); + (unsigned) pool->size, strerror (errno)); else { - pool_is_mmapped = 1; - pool_okay = 1; + pool->is_mmapped = 1; + pool->okay = 1; } } #endif /*HAVE_MMAP*/ - if (!pool_okay) + if (!pool->okay) { - pool = malloc (pool_size); - if (!pool) + pool->mem = malloc (pool->size); + if (!pool->mem) log_fatal ("can't allocate memory pool of %u bytes\n", - (unsigned) pool_size); + (unsigned) pool->size); else - pool_okay = 1; + pool->okay = 1; } /* Initialize first memory block. 
*/ - mb = (memblock_t *) pool; - mb->size = pool_size; + mb = (memblock_t *) pool->mem; + mb->size = pool->size; mb->flags = 0; } @@ -482,11 +494,14 @@ _gcry_secmem_get_flags (void) } -/* See _gcry_secmem_init. This function is expected to be called with - the secmem lock held. */ +/* This function initializes the main memory pool MAINPOOL. Itis + * expected to be called with the secmem lock held. */ static void -secmem_init (size_t n) +_gcry_secmem_init_internal (size_t n) { + pooldesc_t *pool; + + pool = &mainpool; if (!n) { #ifdef USE_CAPABILITIES @@ -516,10 +531,10 @@ secmem_init (size_t n) { if (n < MINIMUM_POOL_SIZE) n = MINIMUM_POOL_SIZE; - if (! pool_okay) + if (! pool->okay) { - init_pool (n); - lock_pool (pool, n); + init_pool (pool, n); + lock_pool_pages (pool->mem, n); } else log_error ("Oops, secure memory pool already initialized\n"); @@ -537,7 +552,7 @@ _gcry_secmem_init (size_t n) { SECMEM_LOCK; - secmem_init (n); + _gcry_secmem_init_internal (n); SECMEM_UNLOCK; } @@ -554,13 +569,16 @@ _gcry_secmem_module_init () static void * _gcry_secmem_malloc_internal (size_t size) { + pooldesc_t *pool; memblock_t *mb; - if (!pool_okay) + pool = &mainpool; + + if (!pool->okay) { /* Try to initialize the pool if the user forgot about it. */ - secmem_init (STANDARD_POOL_SIZE); - if (!pool_okay) + _gcry_secmem_init_internal (STANDARD_POOL_SIZE); + if (!pool->okay) { log_info (_("operation is not possible without " "initialized secure memory\n")); @@ -583,7 +601,7 @@ _gcry_secmem_malloc_internal (size_t size) /* Blocks are always a multiple of 32. 
*/ size = ((size + 31) / 32) * 32; - mb = mb_get_new ((memblock_t *) pool, size); + mb = mb_get_new (pool, (memblock_t *) pool->mem, size); if (mb) stats_update (size, 0); @@ -605,11 +623,11 @@ _gcry_secmem_malloc (size_t size) static void _gcry_secmem_free_internal (void *a) { + pooldesc_t *pool; memblock_t *mb; int size; - if (!a) - return; + pool = &mainpool; mb = ADDR_TO_BLOCK (a); size = mb->size; @@ -624,34 +642,35 @@ _gcry_secmem_free_internal (void *a) MB_WIPE_OUT (0x55); MB_WIPE_OUT (0x00); + /* Update stats. */ stats_update (0, size); mb->flags &= ~MB_FLAG_ACTIVE; - /* Update stats. */ - mb_merge (mb); + mb_merge (pool, mb); } /* Wipe out and release memory. */ void _gcry_secmem_free (void *a) { + if (!a) + return; + SECMEM_LOCK; _gcry_secmem_free_internal (a); SECMEM_UNLOCK; } -/* Realloc memory. */ -void * -_gcry_secmem_realloc (void *p, size_t newsize) + +static void * +_gcry_secmem_realloc_internal (void *p, size_t newsize) { memblock_t *mb; size_t size; void *a; - SECMEM_LOCK; - mb = (memblock_t *) (void *) ((char *) p - ((size_t) &((memblock_t *) 0)->aligned.c)); size = mb->size; @@ -671,6 +690,18 @@ _gcry_secmem_realloc (void *p, size_t newsize) } } + return a; +} + + +/* Realloc memory. 
*/ +void * +_gcry_secmem_realloc (void *p, size_t newsize) +{ + void *a; + + SECMEM_LOCK; + a = _gcry_secmem_realloc_internal (p, newsize); SECMEM_UNLOCK; return a; @@ -681,7 +712,10 @@ _gcry_secmem_realloc (void *p, size_t newsize) int _gcry_private_is_secure (const void *p) { - return pool_okay && ptr_into_pool_p (p); + pooldesc_t *pool; + + pool = &mainpool; + return pool->okay && ptr_into_pool_p (pool, p); } @@ -696,20 +730,23 @@ _gcry_private_is_secure (const void *p) void _gcry_secmem_term () { - if (!pool_okay) + pooldesc_t *pool; + + pool = &mainpool; + if (!pool->okay) return; - wipememory2 (pool, 0xff, pool_size); - wipememory2 (pool, 0xaa, pool_size); - wipememory2 (pool, 0x55, pool_size); - wipememory2 (pool, 0x00, pool_size); + wipememory2 (pool->mem, 0xff, pool->size); + wipememory2 (pool->mem, 0xaa, pool->size); + wipememory2 (pool->mem, 0x55, pool->size); + wipememory2 (pool->mem, 0x00, pool->size); #if HAVE_MMAP - if (pool_is_mmapped) - munmap (pool, pool_size); + if (pool->is_mmapped) + munmap (pool->mem, pool->size); #endif - pool = NULL; - pool_okay = 0; - pool_size = 0; + pool->mem = NULL; + pool->okay = 0; + pool->size = 0; not_locked = 0; } @@ -717,12 +754,15 @@ _gcry_secmem_term () void _gcry_secmem_dump_stats () { + pooldesc_t *pool; + #if 1 SECMEM_LOCK; - if (pool_okay) + pool = &mainpool; + if (pool->okay) log_info ("secmem usage: %u/%lu bytes in %u blocks\n", - cur_alloced, (unsigned long)pool_size, cur_blocks); + cur_alloced, (unsigned long)pool->size, cur_blocks); SECMEM_UNLOCK; #else memblock_t *mb; @@ -730,9 +770,10 @@ _gcry_secmem_dump_stats () SECMEM_LOCK; - for (i = 0, mb = (memblock_t *) pool; - ptr_into_pool_p (mb); - mb = mb_get_next (mb), i++) + pool = &mainpool; + for (i = 0, mb = (memblock_t *) pool->mem; + ptr_into_pool_p (pool, mb); + mb = mb_get_next (pool, mb), i++) log_info ("SECMEM: [%s] block: %i; size: %i\n", (mb->flags & MB_FLAG_ACTIVE) ? 
"used" : "free", i, ----------------------------------------------------------------------- Summary of changes: src/secmem.c | 213 +++++++++++++++++++++++++++++++++++------------------------ 1 file changed, 127 insertions(+), 86 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Tue Dec 6 21:51:16 2016 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Tue, 06 Dec 2016 21:51:16 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-22-g984a97f Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 984a97f0750f812f0ad3c343ee6a67560953a504 (commit) from 603f479a919311f720a05da738150c2192d5e562 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 984a97f0750f812f0ad3c343ee6a67560953a504 Author: Werner Koch Date: Tue Dec 6 21:44:33 2016 +0100 Fix possible NULL-deref in gcry_log_debugsxp * src/misc.c (_gcry_log_printsxp): Prevent passing NULL to strlen. 
Signed-off-by: Werner Koch diff --git a/src/misc.c b/src/misc.c index 413d7d8..9d8b7bd 100644 --- a/src/misc.c +++ b/src/misc.c @@ -406,7 +406,7 @@ _gcry_log_printsxp (const char *text, gcry_sexp_t sexp) do { if (any && !with_lf) - log_debug ("%*s ", (int)strlen(text), ""); + log_debug ("%*s ", text?(int)strlen(text):0, ""); else any = 1; pend = strchr (p, '\n'); ----------------------------------------------------------------------- Summary of changes: src/misc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Tue Dec 6 22:21:41 2016 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Tue, 06 Dec 2016 22:21:41 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-23-g995ce69 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 995ce697308320c6a52a307f83dc49eeb8d784b4 (commit) from 984a97f0750f812f0ad3c343ee6a67560953a504 (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 995ce697308320c6a52a307f83dc49eeb8d784b4 Author: Werner Koch Date: Tue Dec 6 22:19:04 2016 +0100 Fix compiler warning about possible-NULL-dreference * src/mpi.h (mpi_is_const, mpi_is_immutable): Do check arg before deref-ing. The are only used at places where the arg shall not be NULL. -- This was designed as a general purpose macro and written in a defensive way. 
However, if it a NULL would be passed to that macro code run in the else branch will deref the arg anyway. Signed-off-by: Werner Koch diff --git a/src/mpi.h b/src/mpi.h index cd539f5..b5385b5 100644 --- a/src/mpi.h +++ b/src/mpi.h @@ -109,8 +109,8 @@ struct gcry_mpi void _gcry_mpi_immutable_failed (void); #define mpi_immutable_failed() _gcry_mpi_immutable_failed () -#define mpi_is_const(a) ((a) && ((a)->flags&32)) -#define mpi_is_immutable(a) ((a) && ((a)->flags&16)) +#define mpi_is_const(a) ((a)->flags&32) +#define mpi_is_immutable(a) ((a)->flags&16) #define mpi_is_opaque(a) ((a) && ((a)->flags&4)) #define mpi_is_secure(a) ((a) && ((a)->flags&1)) #define mpi_clear(a) _gcry_mpi_clear ((a)) ----------------------------------------------------------------------- Summary of changes: src/mpi.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From gniibe at fsij.org Wed Dec 7 02:36:16 2016 From: gniibe at fsij.org (NIIBE Yutaka) Date: Wed, 07 Dec 2016 10:36:16 +0900 Subject: mpi_swap_cond: different sizes error on eddsa key generation In-Reply-To: <7ac506f8-b546-4c62-a830-52bcda73b51e@mail.ntua.gr> References: <06c25709-dfa9-a965-bd6f-50da51cd2d59@fsij.org> <23536e43-7bf9-d801-2e26-26b8dd9c49ee@mail.ntua.gr> <7ac506f8-b546-4c62-a830-52bcda73b51e@mail.ntua.gr> Message-ID: <87twag34hb.fsf@iwagami.gniibe.org> Hello again, I have been looking code closely so that I can find the possible cause, but I couldn't so far (what I see is the code of libgcrypt in master branch and LIBGCRYPT-1-7-BRANCH). Kostis Andrikopoulos wrote: > On 11/26/2016 01:11 AM, NIIBE Yutaka wrote: >> Or..., could you give me information on: >> Is the call sequence following? >> >> ... 
>> -> ecc_generate
>> -> _gcry_ecc_eddsa_genkey
>> -> _gcry_mpi_ec_mul_point
>> -> point_swap_cond
>> -> mpi_swap_cond
>> -> log_bug
>>
> Hello again,
>
> we run the application in gdb and got the following stack trace:
>
> https://foss.ntua.gr/paste/?09531d0b88a9cd2c#0JMzhgYKcSPzQUtCKmGpUEs1ahW2NtJmye/zb+T7dKo=
>
> The call sequence seems to be the one you suggested.

Thanks a lot. There, you can examine the MPI values.

--------------------------------- GDB session
# go to the stack frame of _gcry_mpi_swap_cond
(gdb) up 4
# then let's see MPI a and MPI b.
(gdb) print *a
(gdb) print *b
---------------------------------

Could you please show us the values?

Here, the condition for log_bug to be called is:

    a->nlimbs > b->alloced
    or
    b->nlimbs > a->alloced

This should not occur. There must be something wrong.

To locate the bug, let me share my thoughts. Explaining the code in detail to the very person who reported the bug might sound strange, but it is effective sometimes.

In this case, the MPI for a point on the elliptic curve is allocated in normal memory (not secure memory) and resized by point_resize to 2*ctx->p->nlimbs+1 limbs (which is large enough, IIUC). Once MPI memory is allocated, it never shrinks (IIUC).

I think the one way this failure could happen is that _gcry_mpi_assign_limb_space is called somewhere (and replaces the MPI memory after the initial allocation). I found that it is called in _gcry_mpi_mul when U or V is secure but W is not. I don't think this happens in the case of ECC computation.

So, I can't find the cause, so far.

* * *

From here, it's too long, I know. You can skip reading this explanation.

The function _gcry_mpi_swap_cond swaps two MPI values conditionally, touching the memory of both values. Regardless of whether the swap happens or not, the execution cycles are the same and the memory access pattern is the same. This is important to mitigate some attacks.

The MPI structure (struct gcry_mpi, which is defined in src/mpi.h) looks like this figure:

MPI:
        <--------------------alloced-------------------->
        <---------nlimbs----------------->
d ----> [ Limb0 ] [ Limb1 ] ... [ LimbNn ] ... [ LimbNa ]

mpi->d points to the memory of the limbs.
mpi->alloced is the number of limbs allocated.
mpi->nlimbs is the number of limbs that are valid as a value.

To swap MPI values, the number of limbs of the two MPI values should be similar. But it is still OK when the two MPIs do not have exactly the same number of limbs. Let's consider the following figure.

Before swap:

MPI a:
        <----------alloced------------------->
        <---nlimbs------------>
d ----> [ Limb0 ] ... [ LimbA ] ... [ LimbA' ]
           A0            An

MPI b:
        <----------alloced---------------------------->
        <---nlimbs--------------->
d ----> [ Limb0 ] ... [ LimbB ] ... [ LimbB' ]
           B0            Bn'

After swap:

MPI a:
        <----------alloced-------------------->
        <---nlimbs---------------->
d ----> [ Limb0 ] ... [ LimbB ]..[ LimbA' ]
           B0            Bn'

MPI b:
        <----------alloced---------------------------->
        <---nlimbs---------------->
d ----> [ Limb0 ] ... [ LimbA ] ... [ LimbB' ]
           A0            An

When the swap happens, _gcry_mpi_swap_cond puts the contents of the limbs of a (A0, ... An, ...) into the MPI memory of b and puts the contents of the limbs of b (B0, ... Bn', ...) into the MPI memory of a. The number of limbs it copies is MIN(a->alloced, b->alloced); it even copies unused limbs, so the cycles don't depend on the MPI values.

So the number of limbs which _gcry_mpi_swap_cond copies is:

    nlimbs = MIN(a->alloced, b->alloced)

For the resulting values to be correct, the condition that must be met is:

    a->nlimbs <= nlimbs && b->nlimbs <= nlimbs

That is, the number of valid limbs of MPI a must not exceed the number of limbs copied, and the number of valid limbs of MPI b must not exceed the number of limbs copied.
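
To make this concrete, here is a minimal sketch in C of a conditional swap of this kind. It is only an illustration under assumptions, not the libgcrypt code: toy_mpi and toy_swap_cond are hypothetical names, the struct models only the three fields discussed above (the sign and flags of a real MPI are left out), and returning -1 stands in for the place where the real function calls log_bug. Plain C also gives no guarantee that a compiler keeps the loop free of data-dependent branches, so take it as a picture of the idea only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct gcry_mpi; only the three fields
 * discussed above are modelled. */
typedef struct
{
  size_t alloced;   /* number of limbs allocated in D */
  size_t nlimbs;    /* number of limbs valid as a value */
  uint64_t *d;      /* the limb array */
} toy_mpi;

/* Swap A and B if SWAP is nonzero, touching the same memory either way.
 * Returns -1 when one value does not fit into the limbs being copied
 * (where the real function would call log_bug), else 0. */
static int
toy_swap_cond (toy_mpi *a, toy_mpi *b, unsigned int swap)
{
  /* nlimbs = MIN(a->alloced, b->alloced) */
  size_t nlimbs = a->alloced < b->alloced ? a->alloced : b->alloced;
  /* All bits set if SWAP is nonzero, all bits clear otherwise. */
  uint64_t mask = (uint64_t)0 - (uint64_t)(swap != 0);
  size_t nmask = (size_t)0 - (size_t)(swap != 0);
  size_t i, x;

  /* Precondition: both values must fit into the limbs we copy. */
  if (a->nlimbs > nlimbs || b->nlimbs > nlimbs)
    return -1;

  /* Masked XOR swap over all NLIMBS limbs, used or not. */
  for (i = 0; i < nlimbs; i++)
    {
      uint64_t t = mask & (a->d[i] ^ b->d[i]);
      a->d[i] ^= t;
      b->d[i] ^= t;
    }

  /* The limb counts are exchanged the same way. */
  x = nmask & (a->nlimbs ^ b->nlimbs);
  a->nlimbs ^= x;
  b->nlimbs ^= x;
  return 0;
}
```

The loop always runs over MIN(a->alloced, b->alloced) limbs and only the mask decides whether anything changes. The check at the top of the sketch is the condition a->nlimbs <= nlimbs && b->nlimbs <= nlimbs.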

The negation of that condition is:

    a->nlimbs > nlimbs || b->nlimbs > nlimbs

Since a->nlimbs <= a->alloced and b->nlimbs <= b->alloced always hold, a->nlimbs can exceed nlimbs only when nlimbs is b->alloced, and likewise for b. This means that the following condition is met when log_bug is called:

    a->nlimbs > b->alloced || b->nlimbs > a->alloced

That is, the value of MPI a cannot be represented by MPI b, or the value of MPI b cannot be represented by MPI a.

A figure of the latter condition looks like this:

MPI a:
        <----------alloced------------------->
        <---nlimbs------------>
d ----> [ Limb0 ] ... [ LimbA ] ... [ LimbA' ]
           A0            An

MPI b:
        <----------alloced-------------------------------------->
        <---nlimbs------------------------------->
d ----> [ Limb0 ] ... [ LimbB ] ... [ LimbB' ]
           B0            Bn'

It's too large.
--

From cvs at cvs.gnupg.org Wed Dec 7 17:04:41 2016
From: cvs at cvs.gnupg.org (by Werner Koch)
Date: Wed, 07 Dec 2016 17:04:41 +0100
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-27-g95bac31
Message-ID:

This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library".

The branch, master has been updated
       via  95bac312644ad45e486c94c2efd25d0748b9a20b (commit)
       via  b6870cf25c0b1eb9c127a94af8326c446421a472 (commit)
       via  b7df907dca4d525f8930c533b763ffce44ceed87 (commit)
       via  e366c19b34922c770af82cd035fd815680b29dee (commit)
      from  995ce697308320c6a52a307f83dc49eeb8d784b4 (commit)

Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below.

- Log -----------------------------------------------------------------
commit 95bac312644ad45e486c94c2efd25d0748b9a20b
Author: Werner Koch
Date:   Wed Dec 7 17:01:19 2016 +0100

    Document the overflow pools and add a stupid test case.

    * tests/t-secmem.c (test_secmem_overflow): New func.
    (main): Disable warning and call new function.
Signed-off-by: Werner Koch diff --git a/doc/gcrypt.texi b/doc/gcrypt.texi index 933d22d..cb539da 100644 --- a/doc/gcrypt.texi +++ b/doc/gcrypt.texi @@ -422,8 +422,11 @@ and freed memory, you need to initialize Libgcrypt this way: process might still be running with increased privileges and that the secure memory has not been initialized. */ - /* Allocate a pool of 16k secure memory. This make the secure memory - available and also drops privileges where needed. */ + /* Allocate a pool of 16k secure memory. This makes the secure memory + available and also drops privileges where needed. Note that by + using functions like gcry_xmalloc_secure and gcry_mpi_snew Libgcrypt + may extend the secure memory pool with memory which lacks the + property of not being swapped out to disk. */ gcry_control (GCRYCTL_INIT_SECMEM, 16384, 0); @anchor{sample-use-resume-secmem} @@ -667,7 +670,10 @@ it right away. This command should be executed right after This command disables the use of the mlock call for secure memory. Disabling the use of mlock may for example be done if an encrypted swap space is in use. This command should be executed right after - at code{gcry_check_version}. + at code{gcry_check_version}. Note that by using functions like +gcry_xmalloc_secure and gcry_mpi_snew Libgcrypt may extend the secure +memory pool with memory which lacks the property of not being swapped +out to disk (but will still be zeroed out on free). @item GCRYCTL_DISABLE_PRIV_DROP; Arguments: none This command sets a global flag to tell the secure memory subsystem diff --git a/tests/t-secmem.c b/tests/t-secmem.c index b464d02..cb2313e 100644 --- a/tests/t-secmem.c +++ b/tests/t-secmem.c @@ -57,6 +57,31 @@ test_secmem (void) } +static void +test_secmem_overflow (void) +{ + void *a[150]; + int i; + + memset (a, 0, sizeof a); + + /* Allocating 150*512=75k should require more than one overflow buffer. 
*/ + for (i=0; i < DIM(a); i++) + { + a[i] = gcry_xmalloc_secure (512); + if (verbose && !(i %40)) + gcry_control (GCRYCTL_DUMP_SECMEM_STATS, 0 , 0); + } + + if (debug) + gcry_control (PRIV_CTL_DUMP_SECMEM_STATS, 0 , 0); + if (verbose) + gcry_control (GCRYCTL_DUMP_SECMEM_STATS, 0 , 0); + for (i=0; i < DIM(a); i++) + xfree (a[i]); +} + + /* This function is called when we ran out of core and there is no way * to return that error to the caller (xmalloc or mpi allocation). */ static int @@ -132,10 +157,24 @@ main (int argc, char **argv) gcry_set_outofcore_handler (outofcore_handler, NULL); gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0); + /* Libgcrypt prints a warning when the first overflow is allocated; + * we do not want to see that. */ + if (!verbose) + gcry_control (GCRYCTL_DISABLE_SECMEM_WARN, 0); + + test_secmem (); + test_secmem_overflow (); + /* FIXME: We need to improve the tests, for example by registering + * our own log handler and comparing the output of + * PRIV_CTL_DUMP_SECMEM_STATS to expected pattern. */ if (verbose) - gcry_control (PRIV_CTL_DUMP_SECMEM_STATS, 0 , 0); + { + gcry_control (PRIV_CTL_DUMP_SECMEM_STATS, 0 , 0); + gcry_control (GCRYCTL_DUMP_SECMEM_STATS, 0 , 0); + } info ("All tests completed. Errors: %d\n", errorcount); + gcry_control (GCRYCTL_TERM_SECMEM, 0 , 0); return !!errorcount; } commit b6870cf25c0b1eb9c127a94af8326c446421a472 Author: Werner Koch Date: Wed Dec 7 16:59:57 2016 +0100 Implement overflow secmem pools for xmalloc style allocators. * src/secmem.c (pooldesc_s): Add fields next, cur_alloced, and cur_blocks. (cur_alloced, cur_blocks): Remove vars. (ptr_into_pool_p): Make it inline. (stats_update): Add arg pool and update the new pool specific counters. (_gcry_secmem_malloc_internal): Add arg xhint and allocate overflow pools as needed. (_gcry_secmem_malloc): Pass XHINTS along. (_gcry_secmem_realloc_internal): Ditto. (_gcry_secmem_realloc): Ditto. (_gcry_secmem_free_internal): Take multiple pools in account. 
Add return value to indicate whether the arg was freed. (_gcry_secmem_free): Add return value to indicate whether the arg was freed. (_gcry_private_is_secure): Take multiple pools in account. (_gcry_secmem_term): Release all pools. (_gcry_secmem_dump_stats): Print stats for all pools. * src/stdmem.c (_gcry_private_free): Replace _gcry_private_is_secure test with a direct call of _gcry_secmem_free to avoid double checking. -- This patch avoids process termination due to an out-of-secure-memory condition in the MPI subsystem. We consider it more important to have reliable MPI computations than process termination due the need for memory which is protected against being swapped out. Using encrypted swap is anyway a more reliable protection than those mlock'ed pages. Note also that mlock'ed pages won't help against hibernation. GnuPG-bug-id: 2857 Signed-off-by: Werner Koch diff --git a/src/secmem.c b/src/secmem.c index 928e03f..4fa267b 100644 --- a/src/secmem.c +++ b/src/secmem.c @@ -62,6 +62,10 @@ typedef struct memblock /* An object describing a memory pool. */ typedef struct pooldesc_s { + /* A link to the next pool. This is used to connect the overflow + * pools. */ + struct pooldesc_s *next; + /* A memory buffer used as allocation pool. */ void *mem; @@ -75,10 +79,15 @@ typedef struct pooldesc_s /* Flag indicating whether MEM is mmapped. */ volatile int is_mmapped; + /* The number of allocated bytes and the number of used blocks in + * this pool. */ + unsigned int cur_alloced, cur_blocks; } pooldesc_t; -/* The pool of secure memory. */ +/* The pool of secure memory. This is the head of a linked list with + * the first element being the standard mlock-ed pool and the + * following elements being the overflow pools. */ static pooldesc_t mainpool; @@ -91,9 +100,6 @@ static int suspend_warning; static int no_mlock; static int no_priv_drop; -/* Stats. */ -static unsigned int cur_alloced, cur_blocks; - /* Lock protecting accesses to the memory pools. 
*/ GPGRT_LOCK_DEFINE (secmem_lock); @@ -111,7 +117,7 @@ GPGRT_LOCK_DEFINE (secmem_lock); (memblock_t *) (void *) ((char *) addr - BLOCK_HEAD_SIZE) /* Check whether P points into POOL. */ -static int +static inline int ptr_into_pool_p (pooldesc_t *pool, const void *p) { /* We need to convert pointers to addresses. This is required by @@ -126,17 +132,17 @@ ptr_into_pool_p (pooldesc_t *pool, const void *p) /* Update the stats. */ static void -stats_update (size_t add, size_t sub) +stats_update (pooldesc_t *pool, size_t add, size_t sub) { if (add) { - cur_alloced += add; - cur_blocks++; + pool->cur_alloced += add; + pool->cur_blocks++; } if (sub) { - cur_alloced -= sub; - cur_blocks--; + pool->cur_alloced -= sub; + pool->cur_blocks--; } } @@ -567,7 +573,7 @@ _gcry_secmem_module_init () static void * -_gcry_secmem_malloc_internal (size_t size) +_gcry_secmem_malloc_internal (size_t size, int xhint) { pooldesc_t *pool; memblock_t *mb; @@ -603,9 +609,63 @@ _gcry_secmem_malloc_internal (size_t size) mb = mb_get_new (pool, (memblock_t *) pool->mem, size); if (mb) - stats_update (size, 0); + { + stats_update (pool, size, 0); + return &mb->aligned.c; + } + + /* If we are called from xmalloc style function resort to the + * overflow pools to return memory. We don't do this in FIPS mode, + * though. */ + if (xhint && !fips_mode ()) + { + for (pool = pool->next; pool; pool = pool->next) + { + mb = mb_get_new (pool, (memblock_t *) pool->mem, size); + if (mb) + { + stats_update (pool, size, 0); + return &mb->aligned.c; + } + } + /* Allocate a new overflow pool. We put a new pool right after + * the mainpool so that the next allocation will happen in that + * pool and not in one of the older pools. When this new pool + * gets full we will try to find space in the older pools. */ + pool = calloc (1, sizeof *pool); + if (!pool) + return NULL; /* Not enough memory for a new pool descriptor. 
*/ + pool->size = STANDARD_POOL_SIZE; + pool->mem = malloc (pool->size); + if (!pool->mem) + return NULL; /* Not enough memory available for a new pool. */ + /* Initialize first memory block. */ + mb = (memblock_t *) pool->mem; + mb->size = pool->size; + mb->flags = 0; + + pool->okay = 1; + + /* Take care: in _gcry_private_is_secure we do not lock and thus + * we assume that the second assignment below is atomic. */ + pool->next = mainpool.next; + mainpool.next = pool; + + /* After the first time we allocated an overflow pool, print a + * warning. */ + if (!pool->next) + print_warn (); + + /* Allocate. */ + mb = mb_get_new (pool, (memblock_t *) pool->mem, size); + if (mb) + { + stats_update (pool, size, 0); + return &mb->aligned.c; + } + } - return mb ? &mb->aligned.c : NULL; + return NULL; } @@ -617,20 +677,24 @@ _gcry_secmem_malloc (size_t size, int xhint) void *p; SECMEM_LOCK; - p = _gcry_secmem_malloc_internal (size); + p = _gcry_secmem_malloc_internal (size, xhint); SECMEM_UNLOCK; return p; } -static void +static int _gcry_secmem_free_internal (void *a) { pooldesc_t *pool; memblock_t *mb; int size; - pool = &mainpool; + for (pool = &mainpool; pool; pool = pool->next) + if (pool->okay && ptr_into_pool_p (pool, a)) + break; + if (!pool) + return 0; /* A does not belong to use. */ mb = ADDR_TO_BLOCK (a); size = mb->size; @@ -646,29 +710,35 @@ _gcry_secmem_free_internal (void *a) MB_WIPE_OUT (0x00); /* Update stats. */ - stats_update (0, size); + stats_update (pool, 0, size); mb->flags &= ~MB_FLAG_ACTIVE; - mb_merge (pool, mb); + + return 1; /* Freed. */ } -/* Wipe out and release memory. */ -void + +/* Wipe out and release memory. Returns true if this function + * actually released A. */ +int _gcry_secmem_free (void *a) { + int mine; + if (!a) - return; + return 1; /* Tell caller that we handled it. 
*/

   SECMEM_LOCK;
-  _gcry_secmem_free_internal (a);
+  mine = _gcry_secmem_free_internal (a);
   SECMEM_UNLOCK;
+  return mine;
 }


 static void *
-_gcry_secmem_realloc_internal (void *p, size_t newsize)
+_gcry_secmem_realloc_internal (void *p, size_t newsize, int xhint)
 {
   memblock_t *mb;
   size_t size;
@@ -684,7 +754,7 @@ _gcry_secmem_realloc_internal (void *p, size_t newsize)
     }
   else
     {
-      a = _gcry_secmem_malloc_internal (newsize);
+      a = _gcry_secmem_malloc_internal (newsize, xhint);
       if (a)
         {
           memcpy (a, p, size);
@@ -705,21 +775,27 @@ _gcry_secmem_realloc (void *p, size_t newsize, int xhint)
   void *a;

   SECMEM_LOCK;
-  a = _gcry_secmem_realloc_internal (p, newsize);
+  a = _gcry_secmem_realloc_internal (p, newsize, xhint);
   SECMEM_UNLOCK;

   return a;
 }


-/* Return true if P points into the secure memory area.  */
+/* Return true if P points into the secure memory areas.  */
 int
 _gcry_private_is_secure (const void *p)
 {
   pooldesc_t *pool;

-  pool = &mainpool;
-  return pool->okay && ptr_into_pool_p (pool, p);
+  /* We do not lock here because once a pool is allocated it will not
+   * be removed anymore (except for gcry_secmem_term).  Further,
+   * adding a new pool to the list should be atomic.
*/ + for (pool = &mainpool; pool; pool = pool->next) + if (pool->okay && ptr_into_pool_p (pool, p)) + return 1; + + return 0; } @@ -734,23 +810,33 @@ _gcry_private_is_secure (const void *p) void _gcry_secmem_term () { - pooldesc_t *pool; + pooldesc_t *pool, *next; - pool = &mainpool; - if (!pool->okay) - return; - - wipememory2 (pool->mem, 0xff, pool->size); - wipememory2 (pool->mem, 0xaa, pool->size); - wipememory2 (pool->mem, 0x55, pool->size); - wipememory2 (pool->mem, 0x00, pool->size); + for (pool = &mainpool; pool; pool = next) + { + next = pool->next; + if (!pool->okay) + continue; + + wipememory2 (pool->mem, 0xff, pool->size); + wipememory2 (pool->mem, 0xaa, pool->size); + wipememory2 (pool->mem, 0x55, pool->size); + wipememory2 (pool->mem, 0x00, pool->size); + if (0) + ; #if HAVE_MMAP - if (pool->is_mmapped) - munmap (pool->mem, pool->size); + else if (pool->is_mmapped) + munmap (pool->mem, pool->size); #endif - pool->mem = NULL; - pool->okay = 0; - pool->size = 0; + else + free (pool->mem); + pool->mem = NULL; + pool->okay = 0; + pool->size = 0; + if (pool != &mainpool) + free (pool); + } + mainpool.next = NULL; not_locked = 0; } @@ -762,28 +848,31 @@ _gcry_secmem_dump_stats (int extended) { pooldesc_t *pool; memblock_t *mb; - int i; + int i, poolno; SECMEM_LOCK; - pool = &mainpool; - if (!extended) + for (pool = &mainpool, poolno = 0; pool; pool = pool->next, poolno++) { - if (pool->okay) - log_info ("secmem usage: %u/%lu bytes in %u blocks\n", - cur_alloced, (unsigned long)pool->size, cur_blocks); + if (!extended) + { + if (pool->okay) + log_info ("%-13s %u/%lu bytes in %u blocks\n", + pool == &mainpool? "secmem usage:":"", + pool->cur_alloced, (unsigned long)pool->size, + pool->cur_blocks); + } + else + { + for (i = 0, mb = (memblock_t *) pool->mem; + ptr_into_pool_p (pool, mb); + mb = mb_get_next (pool, mb), i++) + log_info ("SECMEM: pool %d %s block %i size %i\n", + poolno, + (mb->flags & MB_FLAG_ACTIVE) ? 
"used" : "free", + i, + mb->size); + } } - else - { - for (i = 0, mb = (memblock_t *) pool->mem; - ptr_into_pool_p (pool, mb); - mb = mb_get_next (pool, mb), i++) - log_info ("SECMEM: pool %p %s block %i size %i\n", - pool, - (mb->flags & MB_FLAG_ACTIVE) ? "used" : "free", - i, - mb->size); - } - SECMEM_UNLOCK; } diff --git a/src/secmem.h b/src/secmem.h index c69fe88..29dd64f 100644 --- a/src/secmem.h +++ b/src/secmem.h @@ -25,7 +25,7 @@ void _gcry_secmem_init (size_t npool); void _gcry_secmem_term (void); void *_gcry_secmem_malloc (size_t size, int xhint) _GCRY_GCC_ATTR_MALLOC; void *_gcry_secmem_realloc (void *a, size_t newsize, int xhint); -void _gcry_secmem_free (void *a); +int _gcry_secmem_free (void *a); void _gcry_secmem_dump_stats (int extended); void _gcry_secmem_set_flags (unsigned flags); unsigned _gcry_secmem_get_flags(void); diff --git a/src/stdmem.c b/src/stdmem.c index cf937ff..cbda8d8 100644 --- a/src/stdmem.c +++ b/src/stdmem.c @@ -230,15 +230,13 @@ _gcry_private_free (void *a) if (use_m_guard ) { _gcry_private_check_heap(p); - if ( _gcry_private_is_secure(a) ) - _gcry_secmem_free(p-EXTRA_ALIGN-4); - else + if (! _gcry_secmem_free (p - EXTRA_ALIGN - 4)) { - free(p-EXTRA_ALIGN-4); + free (p - EXTRA_ALIGN - 4); } } - else if ( _gcry_private_is_secure(a) ) - _gcry_secmem_free(p); - else - free(p); + else if (!_gcry_secmem_free (p)) + { + free(p); + } } commit b7df907dca4d525f8930c533b763ffce44ceed87 Author: Werner Koch Date: Wed Dec 7 10:37:50 2016 +0100 Give the secmem allocators a hint when a xmalloc calls them. * src/secmem.c (_gcry_secmem_malloc): New not yet used arg XHINT. (_gcry_secmem_realloc): Ditto. * src/stdmem.c (_gcry_private_malloc_secure): New arg XHINT to be passed to the secmem functions. (_gcry_private_realloc): Ditto. * src/g10lib.h (GCRY_ALLOC_FLAG_XHINT): New. * src/global.c (do_malloc): Pass this flag as XHINT to the private allocator. (_gcry_malloc_secure): Factor code out to ... (_gcry_malloc_secure_core): this. Add arg XHINT. 
(_gcry_realloc): Factor code out to ... (_gcry_realloc_core): here. Add arg XHINT. (_gcry_strdup): Factor code out to ... (_gcry_strdup_core): here. Add arg XHINT. (_gcry_xrealloc): Use the core function and pass true for XHINT. (_gcry_xmalloc_secure): Ditto. (_gcry_xstrdup): Ditto. Signed-off-by: Werner Koch diff --git a/src/g10lib.h b/src/g10lib.h index d4e3fef..f0a4628 100644 --- a/src/g10lib.h +++ b/src/g10lib.h @@ -392,6 +392,7 @@ gcry_err_code_t _gcry_mpi_init (void); /* Memory management. */ #define GCRY_ALLOC_FLAG_SECURE (1 << 0) +#define GCRY_ALLOC_FLAG_XHINT (1 << 1) /* Called from xmalloc. */ /*-- sexp.c --*/ diff --git a/src/global.c b/src/global.c index be112b7..cfb7618 100644 --- a/src/global.c +++ b/src/global.c @@ -787,7 +787,7 @@ do_malloc (size_t n, unsigned int flags, void **mem) if (alloc_secure_func) m = (*alloc_secure_func) (n); else - m = _gcry_private_malloc_secure (n); + m = _gcry_private_malloc_secure (n, !!(flags & GCRY_ALLOC_FLAG_XHINT)); } else { @@ -821,16 +821,23 @@ _gcry_malloc (size_t n) return mem; } -void * -_gcry_malloc_secure (size_t n) +static void * +_gcry_malloc_secure_core (size_t n, int xhint) { void *mem = NULL; - do_malloc (n, GCRY_ALLOC_FLAG_SECURE, &mem); + do_malloc (n, (GCRY_ALLOC_FLAG_SECURE | (xhint? 
GCRY_ALLOC_FLAG_XHINT:0)), + &mem); return mem; } +void * +_gcry_malloc_secure (size_t n) +{ + return _gcry_malloc_secure_core (n, 0); +} + int _gcry_is_secure (const void *a) { @@ -855,8 +862,8 @@ _gcry_check_heap( const void *a ) #endif } -void * -_gcry_realloc (void *a, size_t n) +static void * +_gcry_realloc_core (void *a, size_t n, int xhint) { void *p; @@ -873,12 +880,20 @@ _gcry_realloc (void *a, size_t n) if (realloc_func) p = realloc_func (a, n); else - p = _gcry_private_realloc (a, n); + p = _gcry_private_realloc (a, n, xhint); if (!p && !errno) gpg_err_set_errno (ENOMEM); return p; } + +void * +_gcry_realloc (void *a, size_t n) +{ + return _gcry_realloc_core (a, n, 0); +} + + void _gcry_free (void *p) { @@ -941,12 +956,8 @@ _gcry_calloc_secure (size_t n, size_t m) } -/* Create and return a copy of the null-terminated string STRING. If - it is contained in secure memory, the copy will be contained in - secure memory as well. In an out-of-memory condition, NULL is - returned. */ -char * -_gcry_strdup (const char *string) +static char * +_gcry_strdup_core (const char *string, int xhint) { char *string_cp = NULL; size_t string_n = 0; @@ -954,7 +965,7 @@ _gcry_strdup (const char *string) string_n = strlen (string); if (_gcry_is_secure (string)) - string_cp = _gcry_malloc_secure (string_n + 1); + string_cp = _gcry_malloc_secure_core (string_n + 1, xhint); else string_cp = _gcry_malloc (string_n + 1); @@ -964,6 +975,15 @@ _gcry_strdup (const char *string) return string_cp; } +/* Create and return a copy of the null-terminated string STRING. If + * it is contained in secure memory, the copy will be contained in + * secure memory as well. In an out-of-memory condition, NULL is + * returned. 
*/ +char * +_gcry_strdup (const char *string) +{ + return _gcry_strdup_core (string, 0); +} void * _gcry_xmalloc( size_t n ) @@ -987,7 +1007,7 @@ _gcry_xrealloc( void *a, size_t n ) { void *p; - while ( !(p = _gcry_realloc( a, n )) ) + while (!(p = _gcry_realloc_core (a, n, 1))) { if ( fips_mode () || !outofcore_handler @@ -1005,7 +1025,7 @@ _gcry_xmalloc_secure( size_t n ) { void *p; - while ( !(p = _gcry_malloc_secure( n )) ) + while (!(p = _gcry_malloc_secure_core (n, 1))) { if ( fips_mode () || !outofcore_handler @@ -1060,7 +1080,7 @@ _gcry_xstrdup (const char *string) { char *p; - while ( !(p = _gcry_strdup (string)) ) + while ( !(p = _gcry_strdup_core (string, 1)) ) { size_t n = strlen (string); int is_sec = !!_gcry_is_secure (string); diff --git a/src/secmem.c b/src/secmem.c index 54bbda1..928e03f 100644 --- a/src/secmem.c +++ b/src/secmem.c @@ -608,8 +608,11 @@ _gcry_secmem_malloc_internal (size_t size) return mb ? &mb->aligned.c : NULL; } + +/* Allocate a block from the secmem of SIZE. With XHINT set assume + * that the caller is a xmalloc style function. */ void * -_gcry_secmem_malloc (size_t size) +_gcry_secmem_malloc (size_t size, int xhint) { void *p; @@ -694,9 +697,10 @@ _gcry_secmem_realloc_internal (void *p, size_t newsize) } -/* Realloc memory. */ +/* Realloc memory. With XHINT set assume that the caller is a xmalloc + * style function. 
*/ void * -_gcry_secmem_realloc (void *p, size_t newsize) +_gcry_secmem_realloc (void *p, size_t newsize, int xhint) { void *a; diff --git a/src/secmem.h b/src/secmem.h index 764bfeb..c69fe88 100644 --- a/src/secmem.h +++ b/src/secmem.h @@ -23,8 +23,8 @@ void _gcry_secmem_init (size_t npool); void _gcry_secmem_term (void); -void *_gcry_secmem_malloc (size_t size) _GCRY_GCC_ATTR_MALLOC; -void *_gcry_secmem_realloc (void *a, size_t newsize); +void *_gcry_secmem_malloc (size_t size, int xhint) _GCRY_GCC_ATTR_MALLOC; +void *_gcry_secmem_realloc (void *a, size_t newsize, int xhint); void _gcry_secmem_free (void *a); void _gcry_secmem_dump_stats (int extended); void _gcry_secmem_set_flags (unsigned flags); diff --git a/src/stdmem.c b/src/stdmem.c index 189da37..cf937ff 100644 --- a/src/stdmem.c +++ b/src/stdmem.c @@ -117,10 +117,11 @@ _gcry_private_malloc (size_t n) /* * Allocate memory of size N from the secure memory pool. Return NULL - * if we are out of memory. + * if we are out of memory. XHINT tells the allocator that the caller + * used an xmalloc style call. */ void * -_gcry_private_malloc_secure (size_t n) +_gcry_private_malloc_secure (size_t n, int xhint) { if (!n) { @@ -133,7 +134,7 @@ _gcry_private_malloc_secure (size_t n) { char *p; - if ( !(p = _gcry_secmem_malloc (n +EXTRA_ALIGN+ 5)) ) + if (!(p = _gcry_secmem_malloc (n + EXTRA_ALIGN + 5, xhint))) return NULL; ((byte*)p)[EXTRA_ALIGN+0] = n; ((byte*)p)[EXTRA_ALIGN+1] = n >> 8 ; @@ -144,17 +145,18 @@ _gcry_private_malloc_secure (size_t n) } else { - return _gcry_secmem_malloc( n ); + return _gcry_secmem_malloc (n, xhint); } } /* - * Realloc and clear the old space - * Return NULL if there is not enough memory. + * Realloc and clear the old space. XHINT tells the allocator that + * the caller used an xmalloc style call. Returns NULL if there is + * not enough memory. 
*/ void * -_gcry_private_realloc ( void *a, size_t n ) +_gcry_private_realloc (void *a, size_t n, int xhint) { if (use_m_guard) { @@ -172,7 +174,7 @@ _gcry_private_realloc ( void *a, size_t n ) if( len >= n ) /* We don't shrink for now. */ return a; if (p[-1] == MAGIC_SEC_BYTE) - b = _gcry_private_malloc_secure(n); + b = _gcry_private_malloc_secure (n, xhint); else b = _gcry_private_malloc(n); if (!b) @@ -184,7 +186,7 @@ _gcry_private_realloc ( void *a, size_t n ) } else if ( _gcry_private_is_secure(a) ) { - return _gcry_secmem_realloc( a, n ); + return _gcry_secmem_realloc (a, n, xhint); } else { diff --git a/src/stdmem.h b/src/stdmem.h index b476e7e..c52aab5 100644 --- a/src/stdmem.h +++ b/src/stdmem.h @@ -24,8 +24,8 @@ void _gcry_private_enable_m_guard(void); void *_gcry_private_malloc (size_t n) _GCRY_GCC_ATTR_MALLOC; -void *_gcry_private_malloc_secure (size_t n) _GCRY_GCC_ATTR_MALLOC; -void *_gcry_private_realloc (void *a, size_t n); +void *_gcry_private_malloc_secure (size_t n, int xhint) _GCRY_GCC_ATTR_MALLOC; +void *_gcry_private_realloc (void *a, size_t n, int xhint); void _gcry_private_check_heap (const void *a); void _gcry_private_free (void *a); commit e366c19b34922c770af82cd035fd815680b29dee Author: Werner Koch Date: Wed Dec 7 10:01:39 2016 +0100 tests: New test t-secmem. * src/secmem.c (_gcry_secmem_dump_stats): Add arg EXTENDED and adjust caller. * src/gcrypt-testapi.h (PRIV_CTL_DUMP_SECMEM_STATS): New. * src/global.c (_gcry_vcontrol): Implement that. * tests/t-secmem.c: New. * tests/Makefile.am (tests_bin): Add that test. -- This test does not much right now. 
    Signed-off-by: Werner Koch

diff --git a/src/gcrypt-testapi.h b/src/gcrypt-testapi.h
index 23d3800..0417754 100644
--- a/src/gcrypt-testapi.h
+++ b/src/gcrypt-testapi.h
@@ -31,6 +31,7 @@
 #define PRIV_CTL_RUN_EXTRNG_TEST    59
 #define PRIV_CTL_DEINIT_EXTRNG_TEST 60
 #define PRIV_CTL_EXTERNAL_LOCK_TEST 61
+#define PRIV_CTL_DUMP_SECMEM_STATS  62

 #define EXTERNAL_LOCK_TEST_INIT       30111
 #define EXTERNAL_LOCK_TEST_LOCK       30112
diff --git a/src/global.c b/src/global.c
index 8e54efe..be112b7 100644
--- a/src/global.c
+++ b/src/global.c
@@ -380,7 +380,7 @@ _gcry_vcontrol (enum gcry_ctl_cmds cmd, va_list arg_ptr)
       break;

     case GCRYCTL_DUMP_SECMEM_STATS:
-      _gcry_secmem_dump_stats ();
+      _gcry_secmem_dump_stats (0);
       break;

     case GCRYCTL_DROP_PRIVS:
@@ -613,7 +613,8 @@ _gcry_vcontrol (enum gcry_ctl_cmds cmd, va_list arg_ptr)
     case PRIV_CTL_EXTERNAL_LOCK_TEST:  /* Run external lock test */
       rc = external_lock_test (va_arg (arg_ptr, int));
       break;
-    case 62:  /* RFU */
+    case PRIV_CTL_DUMP_SECMEM_STATS:
+      _gcry_secmem_dump_stats (1);
       break;
 #if _GCRY_GCC_VERSION >= 40600
 # pragma GCC diagnostic pop
diff --git a/src/secmem.c b/src/secmem.c
index 1f92f17..54bbda1 100644
--- a/src/secmem.c
+++ b/src/secmem.c
@@ -751,33 +751,35 @@ _gcry_secmem_term ()
 }


+/* Print stats of the secmem allocator.  With EXTENDED passed as true
+ * a detailed listing is returned (used for testing).  */
 void
-_gcry_secmem_dump_stats ()
+_gcry_secmem_dump_stats (int extended)
 {
   pooldesc_t *pool;
-
-#if 1
-  SECMEM_LOCK;
-
-  pool = &mainpool;
-  if (pool->okay)
-    log_info ("secmem usage: %u/%lu bytes in %u blocks\n",
-              cur_alloced, (unsigned long)pool->size, cur_blocks);
-  SECMEM_UNLOCK;
-#else
   memblock_t *mb;
   int i;

   SECMEM_LOCK;

   pool = &mainpool;
-  for (i = 0, mb = (memblock_t *) pool->mem;
-       ptr_into_pool_p (pool, mb);
-       mb = mb_get_next (pool, mb), i++)
-    log_info ("SECMEM: [%s] block: %i; size: %i\n",
-              (mb->flags & MB_FLAG_ACTIVE) ?
"used" : "free", - i, - mb->size); + if (!extended) + { + if (pool->okay) + log_info ("secmem usage: %u/%lu bytes in %u blocks\n", + cur_alloced, (unsigned long)pool->size, cur_blocks); + } + else + { + for (i = 0, mb = (memblock_t *) pool->mem; + ptr_into_pool_p (pool, mb); + mb = mb_get_next (pool, mb), i++) + log_info ("SECMEM: pool %p %s block %i size %i\n", + pool, + (mb->flags & MB_FLAG_ACTIVE) ? "used" : "free", + i, + mb->size); + } + SECMEM_UNLOCK; -#endif } diff --git a/src/secmem.h b/src/secmem.h index 3577381..764bfeb 100644 --- a/src/secmem.h +++ b/src/secmem.h @@ -26,7 +26,7 @@ void _gcry_secmem_term (void); void *_gcry_secmem_malloc (size_t size) _GCRY_GCC_ATTR_MALLOC; void *_gcry_secmem_realloc (void *a, size_t newsize); void _gcry_secmem_free (void *a); -void _gcry_secmem_dump_stats (void); +void _gcry_secmem_dump_stats (int extended); void _gcry_secmem_set_flags (unsigned flags); unsigned _gcry_secmem_get_flags(void); int _gcry_private_is_secure (const void *p); diff --git a/tests/Makefile.am b/tests/Makefile.am index d462f30..374e72e 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -19,7 +19,7 @@ ## Process this file with automake to produce Makefile.in tests_bin = \ - version mpitests t-sexp t-convert \ + version t-secmem mpitests t-sexp t-convert \ t-mpi-bit t-mpi-point curves t-lock \ prime basic keygen pubkey hmac hashtest t-kdf keygrip \ fips186-dsa aeswrap pkcs1v2 random dsa-rfc6979 t-ed25519 t-cv25519 diff --git a/tests/t-secmem.c b/tests/t-secmem.c new file mode 100644 index 0000000..b464d02 --- /dev/null +++ b/tests/t-secmem.c @@ -0,0 +1,141 @@ +/* t-secmem.c - Test the secmem memory allocator + * Copyright (C) 2016 g10 Code GmbH + * + * This file is part of Libgcrypt. 
+ *
+ * Libgcrypt is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as
+ * published by the Free Software Foundation; either version 2.1 of
+ * the License, or (at your option) any later version.
+ *
+ * Libgcrypt is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this program; if not, see .
+ */
+
+#ifdef HAVE_CONFIG_H
+# include
+#endif
+#include
+#include
+#include
+#include
+#include
+
+#define PGMNAME "t-secmem"
+
+#include "t-common.h"
+#include "../src/gcrypt-testapi.h"
+
+
+static void
+test_secmem (void)
+{
+  void *a[28];
+  void *b;
+  int i;
+
+  memset (a, 0, sizeof a);
+
+  /* Allocating 28*512=14k should work in the default 16k pool even
+   * with extreme alignment requirements.  */
+  for (i=0; i < DIM(a); i++)
+    a[i] = gcry_xmalloc_secure (512);
+
+  /* Allocating another 2k should fail for the default 16k pool.  */
+  b = gcry_malloc_secure (2048);
+  if (b)
+    fail ("allocation did not fail as expected\n");
+
+  for (i=0; i < DIM(a); i++)
+    xfree (a[i]);
+  xfree (b);
+}
+
+
+/* This function is called when we ran out of core and there is no way
+ * to return that error to the caller (xmalloc or mpi allocation).  */
+static int
+outofcore_handler (void *opaque, size_t req_n, unsigned int flags)
+{
+  static int been_here;  /* Used to protect against recursive calls. */
+
+  (void)opaque;
+
+  /* Protect against a second call. */
+  if (been_here)
+    return 0; /* Let libgcrypt call its own fatal error handler.  */
+  been_here = 1;
+
+  info ("outofcore handler invoked");
+  gcry_control (PRIV_CTL_DUMP_SECMEM_STATS, 0 , 0);
+  fail ("out of core%s while allocating %lu bytes",
+        (flags & 1)?"
in secure memory":"", (unsigned long)req_n); + + die ("stopped"); + /*NOTREACHED*/ + return 0; +} + + +int +main (int argc, char **argv) +{ + int last_argc = -1; + + if (argc) + { argc--; argv++; } + + while (argc && last_argc != argc ) + { + last_argc = argc; + if (!strcmp (*argv, "--")) + { + argc--; argv++; + break; + } + else if (!strcmp (*argv, "--help")) + { + fputs ("usage: " PGMNAME " [options]\n" + "Options:\n" + " --verbose print timings etc.\n" + " --debug flyswatter\n" + , stdout); + exit (0); + } + else if (!strcmp (*argv, "--verbose")) + { + verbose++; + argc--; argv++; + } + else if (!strcmp (*argv, "--debug")) + { + verbose += 2; + debug++; + argc--; argv++; + } + else if (!strncmp (*argv, "--", 2)) + die ("unknown option '%s'", *argv); + } + + if (!gcry_check_version (GCRYPT_VERSION)) + die ("version mismatch; pgm=%s, library=%s\n", + GCRYPT_VERSION, gcry_check_version (NULL)); + if (debug) + gcry_control (GCRYCTL_SET_DEBUG_FLAGS, 1u , 0); + gcry_control (GCRYCTL_ENABLE_QUICK_RANDOM, 0); + gcry_control (GCRYCTL_INIT_SECMEM, 16384, 0); + gcry_set_outofcore_handler (outofcore_handler, NULL); + gcry_control (GCRYCTL_INITIALIZATION_FINISHED, 0); + + test_secmem (); + + if (verbose) + gcry_control (PRIV_CTL_DUMP_SECMEM_STATS, 0 , 0); + info ("All tests completed. 
Errors: %d\n", errorcount);
+  return !!errorcount;
+}

-----------------------------------------------------------------------

Summary of changes:
 doc/gcrypt.texi      |   12 ++-
 src/g10lib.h         |    1 +
 src/gcrypt-testapi.h |    1 +
 src/global.c         |   59 ++++++++-----
 src/secmem.c         |  227 ++++++++++++++++++++++++++++++++++++---------------
 src/secmem.h         |    8 +-
 src/stdmem.c         |   34 ++++----
 src/stdmem.h         |    4 +-
 tests/Makefile.am    |    2 +-
 tests/t-secmem.c     |  180 ++++++++++++++++++++++++++++++++++++++++
 10 files changed, 416 insertions(+), 112 deletions(-)
 create mode 100644 tests/t-secmem.c


hooks/post-receive
--
The GNU crypto library
http://git.gnupg.org


_______________________________________________
Gnupg-commits mailing list
Gnupg-commits at gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-commits

From cvs at cvs.gnupg.org Thu Dec 8 09:11:42 2016
From: cvs at cvs.gnupg.org (by Stephan Mueller)
Date: Thu, 08 Dec 2016 09:11:42 +0100
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-32-g656395b
Message-ID:

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The GNU crypto library".

The branch, master has been updated
       via  656395ba4cf34f42dda3a120bda3ed1220755a3d (commit)
       via  20886fdcb841b0bf89bb1d44303d42f1804e38cb (commit)
       via  227099f179df9dcf083d0ef6be9883c775df0874 (commit)
       via  df8634d8e2b595430dc1e6575a7452c242cffca1 (commit)
       via  677ddf5bbd9c172a72607c7d5d7006907071c2cf (commit)
      from  95bac312644ad45e486c94c2efd25d0748b9a20b (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 656395ba4cf34f42dda3a120bda3ed1220755a3d
Author: Stephan Mueller
Date:   Sat Dec 3 19:18:01 2016 +0100

    random: Eliminate unneeded memcpy invocations in the DRBG.
* random/random-drbg.c (drbg_hash): Remove arg 'outval' and return a pointer instead. (drbg_instantiate): Reduce size of scratchpad. (drbg_hmac_update): Avoid use of scratch buffers for the hash. (drbg_hmac_generate, drbg_hash_df): Ditto. (drbg_hash_process_addtl): Ditto. (drbg_hash_hashgen): Ditto. (drbg_hash_generate): Ditto. -- The gcry_md_read returns a pointer to the hash which can be directly used instead of copying it into a scratch buffer. This eliminates a number of memcpy invocations for HMAC and Hash DRBG and reduces the memory footprint of the Hash DRBG by the block size of the used hash. The performance increase is between 1 and 3 MB/s depending on the output buffer size. Signed-off-by: Stephan Mueller ChangeLog entries above written by -wk. diff --git a/random/random-drbg.c b/random/random-drbg.c index dc8e8f3..e2fe861 100644 --- a/random/random-drbg.c +++ b/random/random-drbg.c @@ -374,9 +374,7 @@ static gpg_err_code_t drbg_hmac_init (drbg_state_t drbg); static gpg_err_code_t drbg_hmac_setkey (drbg_state_t drbg, const unsigned char *key); static void drbg_hash_fini (drbg_state_t drbg); -static gpg_err_code_t drbg_hash (drbg_state_t drbg, - unsigned char *outval, - const drbg_string_t *buf); +static byte *drbg_hash (drbg_state_t drbg, const drbg_string_t *buf); static gpg_err_code_t drbg_sym_init (drbg_state_t drbg); static void drbg_sym_fini (drbg_state_t drbg); static gpg_err_code_t drbg_sym_setkey (drbg_state_t drbg, @@ -1042,24 +1040,21 @@ drbg_hmac_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) /* we execute two rounds of V/K massaging */ for (i = 2; 0 < i; i--) { + byte *retval; /* first round uses 0x0, second 0x1 */ unsigned char prefix = DRBG_PREFIX0; if (1 == i) prefix = DRBG_PREFIX1; /* 10.1.2.2 step 1 and 4 -- concatenation and HMAC for key */ seed2.buf = &prefix; - ret = drbg_hash (drbg, drbg->C, &seed1); - if (ret) - return ret; - - ret = drbg_hmac_setkey (drbg, drbg->C); + retval = drbg_hash (drbg, &seed1); + ret = 
drbg_hmac_setkey (drbg, retval); if (ret) return ret; /* 10.1.2.2 step 2 and 5 -- HMAC for V */ - ret = drbg_hash (drbg, drbg->V, &cipherin); - if (ret) - return ret; + retval = drbg_hash (drbg, &cipherin); + memcpy(drbg->V, retval, drbg_blocklen (drbg)); /* 10.1.2.2 step 3 */ if (!seed || 0 == seed->len) @@ -1091,9 +1086,8 @@ drbg_hmac_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, { unsigned int outlen = 0; /* 10.1.2.5 step 4.1 */ - ret = drbg_hash (drbg, drbg->V, &data); - if (ret) - return ret; + byte *retval = drbg_hash (drbg, &data); + memcpy(drbg->V, retval, drbg_blocklen (drbg)); outlen = (drbg_blocklen (drbg) < (buflen - len)) ? drbg_blocklen (drbg) : (buflen - len); @@ -1137,14 +1131,10 @@ drbg_hash_df (drbg_state_t drbg, unsigned char *outval, size_t outlen, drbg_string_t *entropy) { - gpg_err_code_t ret = 0; size_t len = 0; unsigned char input[5]; - unsigned char *tmp = drbg->scratchpad + drbg_statelen (drbg); drbg_string_t data1; - memset (tmp, 0, drbg_blocklen (drbg)); - /* 10.4.1 step 3 */ input[0] = 1; drbg_cpu_to_be32 ((outlen * 8), &input[1]); @@ -1158,20 +1148,16 @@ drbg_hash_df (drbg_state_t drbg, { short blocklen = 0; /* 10.4.1 step 4.1 */ - ret = drbg_hash (drbg, tmp, &data1); - if (ret) - goto out; + byte *retval = drbg_hash (drbg, &data1); /* 10.4.1 step 4.2 */ input[0]++; blocklen = (drbg_blocklen (drbg) < (outlen - len)) ? 
drbg_blocklen (drbg) : (outlen - len); - memcpy (outval + len, tmp, blocklen); + memcpy (outval + len, retval, blocklen); len += blocklen; } - out: - memset (tmp, 0, drbg_blocklen (drbg)); - return ret; + return 0; } /* update function for Hash DRBG as defined in 10.1.1.2 / 10.1.1.3 */ @@ -1227,13 +1213,10 @@ drbg_hash_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) static gpg_err_code_t drbg_hash_process_addtl (drbg_state_t drbg, drbg_string_t *addtl) { - gpg_err_code_t ret = 0; drbg_string_t data1, data2; drbg_string_t *data3; unsigned char prefix = DRBG_PREFIX2; - - /* this is value w as per documentation */ - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); + byte *retval; /* 10.1.1.4 step 2 */ if (!addtl || 0 == addtl->len) @@ -1247,37 +1230,25 @@ drbg_hash_process_addtl (drbg_state_t drbg, drbg_string_t *addtl) data2.next = data3; data3->next = NULL; /* 10.1.1.4 step 2a -- cipher invocation */ - ret = drbg_hash (drbg, drbg->scratchpad, &data1); - if (ret) - goto out; + retval = drbg_hash (drbg, &data1); /* 10.1.1.4 step 2b */ - drbg_add_buf (drbg->V, drbg_statelen (drbg), - drbg->scratchpad, drbg_blocklen (drbg)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), retval, drbg_blocklen (drbg)); - out: - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); - return ret; + return 0; } /* * Hashgen defined in 10.1.1.4 */ static gpg_err_code_t -drbg_hash_hashgen (drbg_state_t drbg, - unsigned char *buf, unsigned int buflen) +drbg_hash_hashgen (drbg_state_t drbg, unsigned char *buf, unsigned int buflen) { - gpg_err_code_t ret = 0; unsigned int len = 0; unsigned char *src = drbg->scratchpad; - unsigned char *dst = drbg->scratchpad + drbg_statelen (drbg); drbg_string_t data; unsigned char prefix = DRBG_PREFIX1; - /* use the scratchpad as a lookaside buffer */ - memset (src, 0, drbg_statelen (drbg)); - memset (dst, 0, drbg_blocklen (drbg)); - /* 10.1.1.4 step hashgen 2 */ memcpy (src, drbg->V, drbg_statelen (drbg)); @@ -1286,44 +1257,36 @@ drbg_hash_hashgen 
(drbg_state_t drbg, { unsigned int outlen = 0; /* 10.1.1.4 step hashgen 4.1 */ - ret = drbg_hash (drbg, dst, &data); - if (ret) - goto out; + byte *retval = drbg_hash (drbg, &data); outlen = (drbg_blocklen (drbg) < (buflen - len)) ? drbg_blocklen (drbg) : (buflen - len); /* 10.1.1.4 step hashgen 4.2 */ - memcpy (buf + len, dst, outlen); + memcpy (buf + len, retval, outlen); len += outlen; /* 10.1.1.4 hashgen step 4.3 */ if (len < buflen) drbg_add_buf (src, drbg_statelen (drbg), &prefix, 1); } - out: - memset (drbg->scratchpad, 0, - (drbg_statelen (drbg) + drbg_blocklen (drbg))); - return ret; + memset (drbg->scratchpad, 0, drbg_statelen (drbg)); + return 0; } /* Generate function for Hash DRBG as defined in 10.1.1.4 */ static gpg_err_code_t -drbg_hash_generate (drbg_state_t drbg, - unsigned char *buf, unsigned int buflen, - drbg_string_t *addtl) +drbg_hash_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, + drbg_string_t *addtl) { - gpg_err_code_t ret = 0; + gpg_err_code_t ret; unsigned char prefix = DRBG_PREFIX3; drbg_string_t data1, data2; + byte *retval; union { unsigned char req[8]; u64 req_int; } u; - /* - * scratchpad usage: drbg_hash_process_addtl uses the scratchpad, but - * fully completes before returning. 
Thus, we can reuse the scratchpad - */ /* 10.1.1.4 step 2 */ ret = drbg_hash_process_addtl (drbg, addtl); if (ret) @@ -1334,27 +1297,20 @@ drbg_hash_generate (drbg_state_t drbg, if (ret) return ret; - /* this is the value H as documented in 10.1.1.4 */ - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); /* 10.1.1.4 step 4 */ drbg_string_fill (&data1, &prefix, 1); drbg_string_fill (&data2, drbg->V, drbg_statelen (drbg)); data1.next = &data2; - ret = drbg_hash (drbg, drbg->scratchpad, &data1); - if (ret) - goto out; + + /* this is the value H as documented in 10.1.1.4 */ + retval = drbg_hash (drbg, &data1); /* 10.1.1.4 step 5 */ - drbg_add_buf (drbg->V, drbg_statelen (drbg), - drbg->scratchpad, drbg_blocklen (drbg)); - drbg_add_buf (drbg->V, drbg_statelen (drbg), drbg->C, - drbg_statelen (drbg)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), retval, drbg_blocklen (drbg)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), drbg->C, drbg_statelen (drbg)); u.req_int = be_bswap64 (drbg->reseed_ctr); - drbg_add_buf (drbg->V, drbg_statelen (drbg), u.req, - sizeof (u.req)); + drbg_add_buf (drbg->V, drbg_statelen (drbg), u.req, sizeof (u.req)); - out: - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); return ret; } @@ -1699,7 +1655,7 @@ drbg_instantiate (drbg_state_t drbg, drbg_blocklen (drbg) + /* iv */ drbg_statelen (drbg) + drbg_blocklen (drbg); /* temp */ else - sb_size = drbg_statelen (drbg) + drbg_blocklen (drbg); + sb_size = drbg_statelen (drbg); if (0 < sb_size) { @@ -2626,8 +2582,8 @@ drbg_hash_fini (drbg_state_t drbg) _gcry_md_close (hd); } -static gpg_err_code_t -drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) +static byte * +drbg_hash (drbg_state_t drbg, const drbg_string_t *buf) { gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; @@ -2635,9 +2591,7 @@ drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) for (; NULL != buf; buf = buf->next) _gcry_md_write (hd, buf->buf, buf->len); _gcry_md_final (hd); - 
memcpy (outval, _gcry_md_read (hd, drbg->core->backend_cipher),
-	  drbg_blocklen (drbg));
-  return 0;
+  return _gcry_md_read (hd, drbg->core->backend_cipher);
 }
 
 static void

commit 20886fdcb841b0bf89bb1d44303d42f1804e38cb
Author: Stephan Mueller
Date:   Thu Dec 1 17:15:10 2016 +0100

    random: Add performance improvements for the DRBG.

    * random/random-drbg.c (struct drbg_state_ops_s): New function
    pointers 'crypto_init' and 'crypto_fini'.
    (struct drbg_state_s): New fields 'priv_data', 'ctr_handle', and
    'ctr_null'.
    (drbg_hash_init, drbg_hash_fini): New.
    (drbg_hmac_init, drbg_hmac_setkey): New.
    (drbg_sym_fini, drbg_sym_init, drbg_sym_setkey): New.
    (drbg_sym_ctr): New.
    (drbg_ctr_bcc): Set the key.
    (drbg_ctr_df): Ditto.
    (drbg_hmac_update): Ditto.
    (drbg_hmac_generate): Replace drbg_hmac by drbg_hash.
    (drbg_hash_df): Ditto.
    (drbg_hash_process_addtl): Ditto.
    (drbg_hash_hashgen): Ditto.
    (drbg_ctr_update): Rework.
    (drbg_ctr_generate): Rework.
    (drbg_ctr_ops): Init new function pointers.
    (drbg_uninstantiate): Call fini function.
    (drbg_instantiate): Call init function.
    --

    The performance improvements can be categorized as follows:

    * Initialize the cipher handle of the backend ciphers once and re-use
      them for subsequent cipher invocations.

    * Limit the invocation of setkey to the cases when the key is newly
      created.

    * Use the AES CTR mode and rip out the counter maintenance in the DRBG
      code. This allows the use of accelerated CTR AES implementations.
      To use the CTR AES mode, a NULL buffer is created that is used as
      the "plaintext" to the CTR mode, because the DRBG CTR AES operation
      is the result of the encryption of the CTR (i.e. the NULL buffer
      makes the final XOR of the CTR AES mode a no-op).

    The following timing measurements were made. They do not use a
    precise timing operation and should serve only as a general hint at
    the performance improvements.
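The NULL-buffer idea described above can be illustrated with a small stand-alone sketch. It is not the libgcrypt code: `toy_block_encrypt` is a made-up, insecure stand-in for AES, and `BLK`, `ctr_increment` and `ctr_crypt` are names invented for this example. It only shows that running a generic CTR routine over an all-zero input yields the raw encrypted counter blocks, which is the output the CTR DRBG needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16  /* Block length in bytes (the AES block size). */

/* Toy stand-in for a block cipher.  NOT secure; it only gives the
   sketch a deterministic "encryption" to work with.  */
static void
toy_block_encrypt (const uint8_t key[BLK], const uint8_t in[BLK],
                   uint8_t out[BLK])
{
  size_t i;

  for (i = 0; i < BLK; i++)
    out[i] = (uint8_t) ((unsigned) (in[i] ^ key[i]) * 167u + 13u
                        + (unsigned) i);
}

/* Increment a big-endian counter block by one, with wrap-around.  */
static void
ctr_increment (uint8_t ctr[BLK])
{
  int i;

  for (i = BLK - 1; i >= 0; i--)
    if (++ctr[i] != 0)
      break;
}

/* Generic CTR mode: out = in XOR E(key, ctr), with the counter
   incremented after each block.  CTR is updated in place, so the
   caller can read the advanced counter back afterwards.  */
static void
ctr_crypt (const uint8_t key[BLK], uint8_t ctr[BLK],
           const uint8_t *in, uint8_t *out, size_t len)
{
  uint8_t ks[BLK];
  size_t i, n;

  assert (in && out);
  while (len)
    {
      n = len < BLK ? len : BLK;
      toy_block_encrypt (key, ctr, ks);
      for (i = 0; i < n; i++)
        out[i] = in[i] ^ ks[i];  /* With in[i] == 0 this is just ks[i]. */
      ctr_increment (ctr);
      in += n;
      out += n;
      len -= n;
    }
}
```

Calling `ctr_crypt` with a zero-filled input gives the same bytes as encrypting the successive counter values directly. Note that this sketch increments the counter after each block; the patch compensates for that ordering by bumping V once when it is set, since SP800-90A expects the increment before the cipher operation.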
    On a Broadwell i7 CPU:

    block size          4096      1024      128       32        16
    aes256 old          28MB/s    27MB/s    19MB/s    11MB/s    6MB/s
    aes128 old          29MB/s    32MB/s    23MB/s    15MB/s    9MB/s
    sha256 old          48MB/s    48MB/s    33MB/s    16MB/s    8MB/s
    hmac sha256 old     15MB/s    15MB/s    10MB/s    5MB/s     2MB/s
    aes256 new          180MB/s   169MB/s   93MB/s    37MB/s    20MB/s
    aes128 new          240MB/s   221MB/s   125MB/s   51MB/s    27MB/s
    sha256 new          75MB/s    69MB/s    48MB/s    23MB/s    11MB/s
    hmac sha256 new     37MB/s    34MB/s    21MB/s    8MB/s     4MB/s

    Signed-off-by: Stephan Mueller

    ChangeLog entries above written by -wk

diff --git a/random/random-drbg.c b/random/random-drbg.c
index 9676f0e..dc8e8f3 100644
--- a/random/random-drbg.c
+++ b/random/random-drbg.c
@@ -289,6 +289,8 @@ struct drbg_state_ops_s
   gpg_err_code_t (*generate) (drbg_state_t drbg,
 			      unsigned char *buf, unsigned int buflen,
 			      drbg_string_t *addtl);
+  gpg_err_code_t (*crypto_init) (drbg_state_t drbg);
+  void (*crypto_fini) (drbg_state_t drbg);
 };
 
 struct drbg_test_data_s
@@ -309,6 +311,10 @@ struct drbg_state_s
 				 * 10.1.1.1 1c) */
   unsigned char *scratchpad;	/* some memory the DRBG can use for its
 				 * operation -- allocated during init */
+  void *priv_data;		/* Cipher handle */
+  gcry_cipher_hd_t ctr_handle;	/* CTR mode cipher handle */
+#define DRBG_CTR_NULL_LEN 128
+  unsigned char *ctr_null;	/* CTR mode zero buffer */
   int seeded:1;			/* DRBG fully seeded? */
   int pr:1;			/* Prediction resistance enabled?
*/ /* Taken from libgcrypt ANSI X9.31 DRNG: We need to keep track of the @@ -363,14 +369,23 @@ static const struct drbg_core_s drbg_cores[] = { {DRBG_CTRAES | DRBG_SYM256, 48, 16, GCRY_CIPHER_AES256} }; -static gpg_err_code_t drbg_sym (drbg_state_t drbg, - const unsigned char *key, - unsigned char *outval, - const drbg_string_t *buf); -static gpg_err_code_t drbg_hmac (drbg_state_t drbg, - const unsigned char *key, +static gpg_err_code_t drbg_hash_init (drbg_state_t drbg); +static gpg_err_code_t drbg_hmac_init (drbg_state_t drbg); +static gpg_err_code_t drbg_hmac_setkey (drbg_state_t drbg, + const unsigned char *key); +static void drbg_hash_fini (drbg_state_t drbg); +static gpg_err_code_t drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf); +static gpg_err_code_t drbg_sym_init (drbg_state_t drbg); +static void drbg_sym_fini (drbg_state_t drbg); +static gpg_err_code_t drbg_sym_setkey (drbg_state_t drbg, + const unsigned char *key); +static gpg_err_code_t drbg_sym (drbg_state_t drbg, unsigned char *outval, + const drbg_string_t *buf); +static gpg_err_code_t drbg_sym_ctr (drbg_state_t drbg, + const unsigned char *inbuf, unsigned int inbuflen, + unsigned char *outbuf, unsigned int outbuflen); /****************************************************************** ****************************************************************** @@ -666,6 +681,10 @@ drbg_ctr_bcc (drbg_state_t drbg, /* 10.4.3 step 1 */ memset (out, 0, drbg_blocklen (drbg)); + ret = drbg_sym_setkey(drbg, key); + if (ret) + return ret; + /* 10.4.3 step 2 / 4 */ while (inpos) { @@ -698,7 +717,7 @@ drbg_ctr_bcc (drbg_state_t drbg, } } /* 10.4.3 step 4.2 */ - ret = drbg_sym (drbg, key, out, &data); + ret = drbg_sym (drbg, out, &data); if (ret) return ret; /* 10.4.3 step 2 */ @@ -839,6 +858,9 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* 10.4.2 step 12: overwriting of outval */ /* 10.4.2 step 13 */ + ret = drbg_sym_setkey(drbg, temp); + if (ret) + goto out; while 
(generated_len < bytes_to_return) { short blocklen = 0; @@ -846,11 +868,10 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* the truncation of the key length is implicit as the key * is only drbg_blocklen in size -- check for the implementation * of the cipher function callback */ - ret = drbg_sym (drbg, temp, X, &cipherin); + ret = drbg_sym (drbg, X, &cipherin); if (ret) goto out; - blocklen = (drbg_blocklen (drbg) < - (bytes_to_return - generated_len)) ? + blocklen = (drbg_blocklen (drbg) < (bytes_to_return - generated_len)) ? drbg_blocklen (drbg) : (bytes_to_return - generated_len); /* 10.4.2 step 13.2 and 14 */ memcpy (df_data + generated_len, X, blocklen); @@ -889,54 +910,51 @@ drbg_ctr_update (drbg_state_t drbg, drbg_string_t *addtl, int reseed) unsigned char *temp = drbg->scratchpad; unsigned char *df_data = drbg->scratchpad + drbg_statelen (drbg) + drbg_blocklen (drbg); - unsigned char *temp_p, *df_data_p; /* pointer to iterate over buffers */ - unsigned int len = 0; - drbg_string_t cipherin; unsigned char prefix = DRBG_PREFIX1; memset (temp, 0, drbg_statelen (drbg) + drbg_blocklen (drbg)); if (3 > reseed) memset (df_data, 0, drbg_statelen (drbg)); - /* 10.2.1.3.2 step 2 and 10.2.1.4.2 step 2 */ - if (addtl && 0 < addtl->len) + if (!reseed) { - ret = - drbg_ctr_df (drbg, df_data, drbg_statelen (drbg), addtl); + /* + * The DRBG uses the CTR mode of the underlying AES cipher. The + * CTR mode increments the counter value after the AES operation + * but SP800-90A requires that the counter is incremented before + * the AES operation. Hence, we increment it at the time we set + * it by one. 
+ */ + drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); + + ret = _gcry_cipher_setkey (drbg->ctr_handle, drbg->C, drbg_keylen (drbg)); if (ret) - goto out; + goto out; } - drbg_string_fill (&cipherin, drbg->V, drbg_blocklen (drbg)); - /* 10.2.1.3.2 step 2 and 3 -- are already covered as we memset(0) - * all memory during initialization */ - while (len < (drbg_statelen (drbg))) + /* 10.2.1.3.2 step 2 and 10.2.1.4.2 step 2 */ + if (addtl && 0 < addtl->len) { - /* 10.2.1.2 step 2.1 */ - drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); - /* 10.2.1.2 step 2.2 */ - /* using target of temp + len: 10.2.1.2 step 2.3 and 3 */ - ret = drbg_sym (drbg, drbg->C, temp + len, &cipherin); + ret = + drbg_ctr_df (drbg, df_data, drbg_statelen (drbg), addtl); if (ret) goto out; - /* 10.2.1.2 step 2.3 and 3 */ - len += drbg_blocklen (drbg); } - /* 10.2.1.2 step 4 */ - temp_p = temp; - df_data_p = df_data; - for (len = 0; len < drbg_statelen (drbg); len++) - { - *temp_p ^= *df_data_p; - df_data_p++; - temp_p++; - } + ret = drbg_sym_ctr (drbg, df_data, drbg_statelen(drbg), + temp, drbg_statelen(drbg)); + if (ret) + goto out; /* 10.2.1.2 step 5 */ - memcpy (drbg->C, temp, drbg_keylen (drbg)); + ret = _gcry_cipher_setkey (drbg->ctr_handle, temp, drbg_keylen (drbg)); + if (ret) + goto out; + /* 10.2.1.2 step 6 */ memcpy (drbg->V, temp + drbg_keylen (drbg), drbg_blocklen (drbg)); + /* See above: increment counter by one to compensate timing of CTR op */ + drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); ret = 0; out: @@ -957,9 +975,6 @@ drbg_ctr_generate (drbg_state_t drbg, drbg_string_t *addtl) { gpg_err_code_t ret = 0; - unsigned int len = 0; - drbg_string_t data; - unsigned char prefix = DRBG_PREFIX1; memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); @@ -973,24 +988,9 @@ drbg_ctr_generate (drbg_state_t drbg, } /* 10.2.1.5.2 step 4.1 */ - drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); - drbg_string_fill (&data, drbg->V, drbg_blocklen (drbg)); - while 
(len < buflen) - { - unsigned int outlen = 0; - /* 10.2.1.5.2 step 4.2 */ - ret = drbg_sym (drbg, drbg->C, drbg->scratchpad, &data); - if (ret) - goto out; - outlen = (drbg_blocklen (drbg) < (buflen - len)) ? - drbg_blocklen (drbg) : (buflen - len); - /* 10.2.1.5.2 step 4.3 */ - memcpy (buf + len, drbg->scratchpad, outlen); - len += outlen; - /* 10.2.1.5.2 step 6 */ - if (len < buflen) - drbg_add_buf (drbg->V, drbg_blocklen (drbg), &prefix, 1); - } + ret = drbg_sym_ctr (drbg, drbg->ctr_null, DRBG_CTR_NULL_LEN, buf, buflen); + if (ret) + goto out; /* 10.2.1.5.2 step 6 */ if (addtl) @@ -998,13 +998,14 @@ drbg_ctr_generate (drbg_state_t drbg, ret = drbg_ctr_update (drbg, addtl, 3); out: - memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); return ret; } static struct drbg_state_ops_s drbg_ctr_ops = { drbg_ctr_update, - drbg_ctr_generate + drbg_ctr_generate, + drbg_sym_init, + drbg_sym_fini, }; /****************************************************************** @@ -1023,6 +1024,9 @@ drbg_hmac_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) /* 10.1.2.3 step 2 already implicitly covered with * the initial memset(0) of drbg->C */ memset (drbg->V, 1, drbg_statelen (drbg)); + ret = drbg_hmac_setkey (drbg, drbg->C); + if (ret) + return ret; } /* build linked list which implements the concatenation and fill @@ -1044,12 +1048,16 @@ drbg_hmac_update (drbg_state_t drbg, drbg_string_t *seed, int reseed) prefix = DRBG_PREFIX1; /* 10.1.2.2 step 1 and 4 -- concatenation and HMAC for key */ seed2.buf = &prefix; - ret = drbg_hmac (drbg, drbg->C, drbg->C, &seed1); + ret = drbg_hash (drbg, drbg->C, &seed1); + if (ret) + return ret; + + ret = drbg_hmac_setkey (drbg, drbg->C); if (ret) return ret; /* 10.1.2.2 step 2 and 5 -- HMAC for V */ - ret = drbg_hmac (drbg, drbg->C, drbg->V, &cipherin); + ret = drbg_hash (drbg, drbg->V, &cipherin); if (ret) return ret; @@ -1083,7 +1091,7 @@ drbg_hmac_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, { unsigned int 
outlen = 0; /* 10.1.2.5 step 4.1 */ - ret = drbg_hmac (drbg, drbg->C, drbg->V, &data); + ret = drbg_hash (drbg, drbg->V, &data); if (ret) return ret; outlen = (drbg_blocklen (drbg) < (buflen - len)) ? @@ -1104,7 +1112,9 @@ drbg_hmac_generate (drbg_state_t drbg, unsigned char *buf, unsigned int buflen, static struct drbg_state_ops_s drbg_hmac_ops = { drbg_hmac_update, - drbg_hmac_generate + drbg_hmac_generate, + drbg_hmac_init, + drbg_hash_fini, }; /****************************************************************** @@ -1148,7 +1158,7 @@ drbg_hash_df (drbg_state_t drbg, { short blocklen = 0; /* 10.4.1 step 4.1 */ - ret = drbg_hmac (drbg, NULL, tmp, &data1); + ret = drbg_hash (drbg, tmp, &data1); if (ret) goto out; /* 10.4.1 step 4.2 */ @@ -1237,13 +1247,13 @@ drbg_hash_process_addtl (drbg_state_t drbg, drbg_string_t *addtl) data2.next = data3; data3->next = NULL; /* 10.1.1.4 step 2a -- cipher invocation */ - ret = drbg_hmac (drbg, NULL, drbg->scratchpad, &data1); + ret = drbg_hash (drbg, drbg->scratchpad, &data1); if (ret) goto out; /* 10.1.1.4 step 2b */ drbg_add_buf (drbg->V, drbg_statelen (drbg), - drbg->scratchpad, drbg_blocklen (drbg)); + drbg->scratchpad, drbg_blocklen (drbg)); out: memset (drbg->scratchpad, 0, drbg_blocklen (drbg)); @@ -1276,7 +1286,7 @@ drbg_hash_hashgen (drbg_state_t drbg, { unsigned int outlen = 0; /* 10.1.1.4 step hashgen 4.1 */ - ret = drbg_hmac (drbg, NULL, dst, &data); + ret = drbg_hash (drbg, dst, &data); if (ret) goto out; outlen = (drbg_blocklen (drbg) < (buflen - len)) ? 
@@ -1330,7 +1340,7 @@ drbg_hash_generate (drbg_state_t drbg, drbg_string_fill (&data1, &prefix, 1); drbg_string_fill (&data2, drbg->V, drbg_statelen (drbg)); data1.next = &data2; - ret = drbg_hmac (drbg, NULL, drbg->scratchpad, &data1); + ret = drbg_hash (drbg, drbg->scratchpad, &data1); if (ret) goto out; @@ -1354,7 +1364,9 @@ drbg_hash_generate (drbg_state_t drbg, */ static struct drbg_state_ops_s drbg_hash_ops = { drbg_hash_update, - drbg_hash_generate + drbg_hash_generate, + drbg_hash_init, + drbg_hash_fini, }; /****************************************************************** @@ -1599,6 +1611,7 @@ drbg_uninstantiate (drbg_state_t drbg) { if (!drbg) return GPG_ERR_INV_ARG; + drbg->d_ops->crypto_fini(drbg); xfree (drbg->V); drbg->V = NULL; xfree (drbg->C); @@ -1666,13 +1679,16 @@ drbg_instantiate (drbg_state_t drbg, /* 9.1 step 4 is implicit in drbg_sec_strength */ - /* no allocation of drbg as this is done by the kernel crypto API */ + ret = drbg->d_ops->crypto_init(drbg); + if (ret) + goto err; + drbg->V = xcalloc_secure (1, drbg_statelen (drbg)); if (!drbg->V) - goto err; + goto fini; drbg->C = xcalloc_secure (1, drbg_statelen (drbg)); if (!drbg->C) - goto err; + goto fini; /* scratchpad is only generated for CTR and Hash */ if (drbg->core->flags & DRBG_HMAC) sb_size = 0; @@ -1689,19 +1705,21 @@ drbg_instantiate (drbg_state_t drbg, { drbg->scratchpad = xcalloc_secure (1, sb_size); if (!drbg->scratchpad) - goto err; + goto fini; } dbg (("DRBG: state allocated with scratchpad size %u bytes\n", sb_size)); /* 9.1 step 6 through 11 */ ret = drbg_seed (drbg, pers, 0); if (ret) - goto err; + goto fini; dbg (("DRBG: core %d %s prediction resistance successfully initialized\n", coreref, pr ? 
"with" : "without")); return 0; + fini: + drbg->d_ops->crypto_fini(drbg); err: drbg_uninstantiate (drbg); return ret; @@ -2563,59 +2581,160 @@ _gcry_rngdrbg_selftest (selftest_report_func_t report) ***************************************************************/ static gpg_err_code_t -drbg_hmac (drbg_state_t drbg, const unsigned char *key, - unsigned char *outval, const drbg_string_t *buf) +drbg_hash_init (drbg_state_t drbg) { + gcry_md_hd_t hd; gpg_error_t err; + + err = _gcry_md_open (&hd, drbg->core->backend_cipher, 0); + if (err) + return err; + + drbg->priv_data = hd; + + return 0; +} + +static gpg_err_code_t +drbg_hmac_init (drbg_state_t drbg) +{ gcry_md_hd_t hd; + gpg_error_t err; - if (key) - { - err = - _gcry_md_open (&hd, drbg->core->backend_cipher, GCRY_MD_FLAG_HMAC); - if (err) - return err; - err = _gcry_md_setkey (hd, key, drbg_statelen (drbg)); - if (err) - return err; - } - else - { - err = _gcry_md_open (&hd, drbg->core->backend_cipher, 0); - if (err) - return err; - } + err = _gcry_md_open (&hd, drbg->core->backend_cipher, GCRY_MD_FLAG_HMAC); + if (err) + return err; + + drbg->priv_data = hd; + + return 0; +} + +static gpg_err_code_t +drbg_hmac_setkey (drbg_state_t drbg, const unsigned char *key) +{ + gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; + + return _gcry_md_setkey (hd, key, drbg_statelen (drbg)); +} + +static void +drbg_hash_fini (drbg_state_t drbg) +{ + gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; + + _gcry_md_close (hd); +} + +static gpg_err_code_t +drbg_hash (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) +{ + gcry_md_hd_t hd = (gcry_md_hd_t)drbg->priv_data; + + _gcry_md_reset(hd); for (; NULL != buf; buf = buf->next) _gcry_md_write (hd, buf->buf, buf->len); _gcry_md_final (hd); memcpy (outval, _gcry_md_read (hd, drbg->core->backend_cipher), drbg_blocklen (drbg)); - _gcry_md_close (hd); return 0; } +static void +drbg_sym_fini (drbg_state_t drbg) +{ + gcry_cipher_hd_t hd = (gcry_cipher_hd_t)drbg->priv_data; + 
+ if (hd) + _gcry_cipher_close (hd); + if (drbg->ctr_handle) + _gcry_cipher_close (drbg->ctr_handle); + if (drbg->ctr_null) + free(drbg->ctr_null); +} + static gpg_err_code_t -drbg_sym (drbg_state_t drbg, const unsigned char *key, - unsigned char *outval, const drbg_string_t *buf) +drbg_sym_init (drbg_state_t drbg) { - gpg_error_t err; gcry_cipher_hd_t hd; + gpg_error_t err; + + drbg->ctr_null = calloc(1, DRBG_CTR_NULL_LEN); + if (!drbg->ctr_null) + return GPG_ERR_ENOMEM; err = _gcry_cipher_open (&hd, drbg->core->backend_cipher, - GCRY_CIPHER_MODE_ECB, 0); + GCRY_CIPHER_MODE_ECB, 0); if (err) - return err; + { + drbg_sym_fini (drbg); + return err; + } + drbg->priv_data = hd; + + err = _gcry_cipher_open (&drbg->ctr_handle, drbg->core->backend_cipher, + GCRY_CIPHER_MODE_CTR, 0); + if (err) + { + drbg_sym_fini (drbg); + return err; + } + + if (drbg_blocklen (drbg) != _gcry_cipher_get_algo_blklen (drbg->core->backend_cipher)) - return -GPG_ERR_NO_ERROR; + { + drbg_sym_fini (drbg); + return -GPG_ERR_NO_ERROR; + } + + return 0; +} + +static gpg_err_code_t +drbg_sym_setkey (drbg_state_t drbg, const unsigned char *key) +{ + gcry_cipher_hd_t hd = (gcry_cipher_hd_t)drbg->priv_data; + + return _gcry_cipher_setkey (hd, key, drbg_keylen (drbg)); +} + +static gpg_err_code_t +drbg_sym (drbg_state_t drbg, unsigned char *outval, const drbg_string_t *buf) +{ + gcry_cipher_hd_t hd = (gcry_cipher_hd_t)drbg->priv_data; + + _gcry_cipher_reset(hd); if (drbg_blocklen (drbg) < buf->len) return -GPG_ERR_NO_ERROR; - err = _gcry_cipher_setkey (hd, key, drbg_keylen (drbg)); + /* in is only component */ + return _gcry_cipher_encrypt (hd, outval, drbg_blocklen (drbg), buf->buf, + buf->len); +} + +static gpg_err_code_t +drbg_sym_ctr (drbg_state_t drbg, + const unsigned char *inbuf, unsigned int inbuflen, + unsigned char *outbuf, unsigned int outbuflen) +{ + gpg_error_t err; + + _gcry_cipher_reset(drbg->ctr_handle); + err = _gcry_cipher_setctr(drbg->ctr_handle, drbg->V, drbg_blocklen (drbg)); if 
(err) return err; - /* in is only component */ - _gcry_cipher_encrypt (hd, outval, drbg_blocklen (drbg), buf->buf, - buf->len); - _gcry_cipher_close (hd); - return 0; + + while (outbuflen) + { + unsigned int cryptlen = (inbuflen > outbuflen) ? outbuflen : inbuflen; + + err = _gcry_cipher_encrypt (drbg->ctr_handle, outbuf, cryptlen, inbuf, + cryptlen); + if (err) + return err; + + outbuflen -= cryptlen; + outbuf += cryptlen; + } + return _gcry_cipher_getctr(drbg->ctr_handle, drbg->V, drbg_blocklen (drbg)); } commit 227099f179df9dcf083d0ef6be9883c775df0874 Author: Stephan Mueller Date: Thu Dec 1 17:11:42 2016 +0100 cipher: New function for reading the counter in CTR mode * cipher/cipher.c (gcry_cipher_getctr): New. -- The API call allows reading the current counter of the CTR mode. The API remains internal to libgcrypt and is not exported to external callers. Signed-off-by: Stephan Mueller ChangeLog entry above added by -wk diff --git a/cipher/cipher.c b/cipher/cipher.c index ff3340f..55853da 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -1117,6 +1117,16 @@ _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen) return 0; } +gpg_err_code_t +_gcry_cipher_getctr (gcry_cipher_hd_t hd, void *ctr, size_t ctrlen) +{ + if (ctr && ctrlen == hd->spec->blocksize) + memcpy (ctr, hd->u_ctr.ctr, hd->spec->blocksize); + else + return GPG_ERR_INV_ARG; + + return 0; +} gcry_err_code_t _gcry_cipher_authenticate (gcry_cipher_hd_t hd, const void *abuf, diff --git a/src/gcrypt-int.h b/src/gcrypt-int.h index 729f54a..ef5337b 100644 --- a/src/gcrypt-int.h +++ b/src/gcrypt-int.h @@ -77,6 +77,8 @@ gpg_err_code_t _gcry_cipher_checktag (gcry_cipher_hd_t hd, const void *intag, size_t taglen); gpg_err_code_t _gcry_cipher_setctr (gcry_cipher_hd_t hd, const void *ctr, size_t ctrlen); +gpg_err_code_t _gcry_cipher_getctr (gcry_cipher_hd_t hd, + void *ctr, size_t ctrlen); size_t _gcry_cipher_get_algo_keylen (int algo); size_t _gcry_cipher_get_algo_blklen (int algo); commit 
df8634d8e2b595430dc1e6575a7452c242cffca1 Author: Stephan Mueller Date: Sun Nov 27 10:14:21 2016 +0100 doc: Remove comment that is not applicable any more. -- Signed-off-by: Stephan Mueller diff --git a/random/random-drbg.c b/random/random-drbg.c index f9d11a3..9676f0e 100644 --- a/random/random-drbg.c +++ b/random/random-drbg.c @@ -899,8 +899,6 @@ drbg_ctr_update (drbg_state_t drbg, drbg_string_t *addtl, int reseed) memset (df_data, 0, drbg_statelen (drbg)); /* 10.2.1.3.2 step 2 and 10.2.1.4.2 step 2 */ - /* TODO use reseed variable to avoid re-doing DF operation */ - (void) reseed; if (addtl && 0 < addtl->len) { ret = commit 677ddf5bbd9c172a72607c7d5d7006907071c2cf Author: Werner Koch Date: Wed Dec 7 18:55:06 2016 +0100 doc: Update NEWS. -- diff --git a/NEWS b/NEWS index 0aaf863..722172a 100644 --- a/NEWS +++ b/NEWS @@ -15,6 +15,14 @@ Noteworthy changes in version 1.8.0 (unreleased) [C21/A1/R_] blocking read of /dev/random. This allows other nPth threads to run while Libgcrypt is gathering entropy. + - When secure memory is requested by the MPI functions or by + gcry_xmalloc_secure, they do not anymore lead to a fatal error if + the secure memory pool is used up. Instead new pools are + allocated as needed. These new pools are not protected against + being swapped out (mlock can't be used). However, these days + this is considered a minor issue and can easily be mitigated by + using encrypted swap space. 
+
  * Interface changes relative to the 1.6.0 release:
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-----------------------------------------------------------------------

Summary of changes:
 NEWS                 |   8 +
 cipher/cipher.c      |  10 ++
 random/random-drbg.c | 421 ++++++++++++++++++++++++++++++---------------------
 src/gcrypt-int.h     |   2 +
 4 files changed, 266 insertions(+), 175 deletions(-)


hooks/post-receive
--
The GNU crypto library
http://git.gnupg.org

_______________________________________________
Gnupg-commits mailing list
Gnupg-commits at gnupg.org
http://lists.gnupg.org/mailman/listinfo/gnupg-commits

From wk at gnupg.org Thu Dec 8 09:15:50 2016
From: wk at gnupg.org (Werner Koch)
Date: Thu, 08 Dec 2016 09:15:50 +0100
Subject: [PATCH] DRBG: eliminate unneeded memcpy invocations
In-Reply-To: <1876399.Mazi4n7g1N@positron.chronox.de> (Stephan Mueller's message of "Sat, 03 Dec 2016 19:18:01 +0100")
References: <1876399.Mazi4n7g1N@positron.chronox.de>
Message-ID: <87y3zqltu1.fsf@wheatstone.g10code.de>

On Sat, 3 Dec 2016 19:18, smueller at chronox.de said:

> This patch goes on top of the patch set I sent 2 days ago.

Thanks for the patches. I applied all of them.

However, I had to add the ChangeLog entries to the commit log. I did
this only exceptionally; so please for the next patches look at existing
commit messages to learn how we like to format the commit messages. See
gnupg/doc/HACKING or for a full description.

Salam-Shalom,

   Werner

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.

From smueller at chronox.de Thu Dec 8 09:23:45 2016
From: smueller at chronox.de (Stephan Müller)
Date: Thu, 08 Dec 2016 09:23:45 +0100
Subject: [PATCH] DRBG: eliminate unneeded memcpy invocations
In-Reply-To: <87y3zqltu1.fsf@wheatstone.g10code.de>
References: <1876399.Mazi4n7g1N@positron.chronox.de> <87y3zqltu1.fsf@wheatstone.g10code.de>
Message-ID: <64402321.EtlmADrHoR@tauon.atsec.com>

On Thursday, 8 December 2016, 09:15:50 CET, Werner Koch wrote:

Hi Werner,

> On Sat, 3 Dec 2016 19:18, smueller at chronox.de said:
> > This patch goes on top of the patch set I sent 2 days ago.
>
> Thanks for the patches. I applied all of them.
>
> However, I had to add the ChangeLog entries to the commit log. I did
> this only exceptionally; so please for the next patches look at existing
> commit messages to learn how we like to format the commit messages. See
> gnupg/doc/HACKING or for a full
> description.

Thank you for adding them and pointing that out. I will follow that guidance
next time.

Ciao
Stephan

From wk at gnupg.org Fri Dec 9 12:13:02 2016
From: wk at gnupg.org (Werner Koch)
Date: Fri, 09 Dec 2016 12:13:02 +0100
Subject: [PATCH 1/2] mlock incorrectly marked as broken on FreeBSD
In-Reply-To: <20160811140835.23510-1-ruben@rubenkerkhof.com> (Ruben Kerkhof's message of "Thu, 11 Aug 2016 16:08:34 +0200")
References: <20160811140835.23510-1-ruben@rubenkerkhof.com>
Message-ID: <871sxhice9.fsf@wheatstone.g10code.de>

On Thu, 11 Aug 2016 16:08, ruben at rubenkerkhof.com said:

> On FreeBSD, if there are not enough free pages, mlock() can return
> EAGAIN, as documented in mlock(2). That doesn't mean that mlock is
> broken. I suspect this same issue also exists on the other BSD's.

Actually the Linux man page says the same now. Thanks for the patch; I
applied both of them.

Salam-Shalom,

   Werner

--
Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz.
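For illustration only, the point made in this message (an EAGAIN from mlock() does not mean mlock is broken) can be sketched as a small classifier. This is not the code from acinclude.m4 or src/secmem.c; `mlock_error_is_benign` and `try_lock` are hypothetical names made up for this sketch. The set of tolerated codes mirrors the ones the secmem.c hunk in this thread does not log as errors.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper, not part of libgcrypt: return 1 if ERR (an
   errno value left by mlock) only says that the pages could not be
   locked right now or that locking is not permitted or available;
   return 0 if it points to a real problem.  */
static int
mlock_error_is_benign (int err)
{
  if (err == EPERM)
    return 1;
#ifdef EAGAIN  /* BSD and also Linux may return this.  */
  if (err == EAGAIN)
    return 1;
#endif
#ifdef ENOSYS  /* Function not implemented.  */
  if (err == ENOSYS)
    return 1;
#endif
#ifdef ENOMEM  /* Linux might return this.  */
  if (err == ENOMEM)
    return 1;
#endif
  return 0;
}

/* Hypothetical wrapper: try to lock N bytes at P.  Returns 0 on
   success, otherwise the errno value of the failed call.  errno is
   copied right after mlock so that later library calls cannot
   overwrite it before it is inspected.  */
static int
try_lock (void *p, size_t n)
{
  int err = 0;

  assert (p != NULL);
  if (mlock (p, n))
    err = errno;
  return err;
}
```

A caller would then do something like `err = try_lock (pool, size);` and only report a failure if `err && !mlock_error_is_benign (err)`, always testing the saved `err` rather than `errno` itself.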
From cvs at cvs.gnupg.org Fri Dec 9 15:26:01 2016
From: cvs at cvs.gnupg.org (by Werner Koch)
Date: Fri, 09 Dec 2016 15:26:01 +0100
Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-33-g618b897
Message-ID:

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The GNU crypto library".

The branch, master has been updated
       via  618b8978f46f4011c11512fd5f30c15e01652e2e (commit)
      from  656395ba4cf34f42dda3a120bda3ed1220755a3d (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 618b8978f46f4011c11512fd5f30c15e01652e2e
Author: Werner Koch
Date:   Fri Dec 9 12:10:54 2016 +0100

    Improve handling of mlock error codes.

    * acinclude.m4 (GNUPG_CHECK_MLOCK): Check also for EAGAIN which is a
    legitimate return code and does not indicate a broken mlock().
    * src/secmem.c (lock_pool_pages): Test ERR instead of ERRNO which
    could have been overwritten by cap_from_text et al.
    --

    On FreeBSD, if there are not enough free pages, mlock() can return
    EAGAIN, as documented in mlock(2). That doesn't mean that mlock is
    broken. I suspect this same issue also exists on the other BSD's.

    Suggested-by: Ruben Kerkhof

    This is (now) also true for Linux.
Signed-off-by: Werner Koch diff --git a/acinclude.m4 b/acinclude.m4 index 96be833..90b3cb9 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -242,7 +242,7 @@ int main() pool += (pgsize - ((long int)pool % pgsize)); err = mlock( pool, 4096 ); - if( !err || errno == EPERM ) + if( !err || errno == EPERM || errno == EGAIN) return 0; /* okay */ return 1; /* hmmm */ diff --git a/src/secmem.c b/src/secmem.c index 4fa267b..55424f2 100644 --- a/src/secmem.c +++ b/src/secmem.c @@ -282,15 +282,15 @@ lock_pool_pages (void *p, size_t n) if (err) { - if (errno != EPERM -#ifdef EAGAIN /* OpenBSD returns this */ - && errno != EAGAIN + if (err != EPERM +#ifdef EAGAIN /* BSD and also Linux may return EAGAIN */ + && err != EAGAIN #endif #ifdef ENOSYS /* Some SCOs return this (function not implemented) */ - && errno != ENOSYS + && err != ENOSYS #endif #ifdef ENOMEM /* Linux might return this. */ - && errno != ENOMEM + && err != ENOMEM #endif ) log_error ("can't lock memory: %s\n", strerror (err)); @@ -341,15 +341,15 @@ lock_pool_pages (void *p, size_t n) if (err) { - if (errno != EPERM -#ifdef EAGAIN /* OpenBSD returns this. */ - && errno != EAGAIN + if (err != EPERM +#ifdef EAGAIN /* BSD and also Linux may return this. */ + && err != EAGAIN #endif #ifdef ENOSYS /* Some SCOs return this (function not implemented). */ - && errno != ENOSYS + && err != ENOSYS #endif #ifdef ENOMEM /* Linux might return this. 
*/ - && errno != ENOMEM + && err != ENOMEM #endif ) log_error ("can't lock memory: %s\n", strerror (err)); ----------------------------------------------------------------------- Summary of changes: acinclude.m4 | 2 +- src/secmem.c | 20 ++++++++++---------- 2 files changed, 11 insertions(+), 11 deletions(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Fri Dec 9 16:00:19 2016 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Fri, 09 Dec 2016 16:00:19 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-34-ge232c12 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via e232c12efac539b2fd2a4472259693b4364a349e (commit) from 618b8978f46f4011c11512fd5f30c15e01652e2e (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit e232c12efac539b2fd2a4472259693b4364a349e Author: Werner Koch Date: Fri Dec 9 15:57:33 2016 +0100 Add release info from 1.7.4 -- diff --git a/NEWS b/NEWS index 722172a..146e208 100644 --- a/NEWS +++ b/NEWS @@ -28,6 +28,41 @@ Noteworthy changes in version 1.8.0 (unreleased) [C21/A1/R_] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ GCRYCTL_REINIT_SYSCALL_CLAMP NEW macro. + +Noteworthy changes in version 1.7.4 (2016-12-09) [C21/A1/R4] +------------------------------------------------ + + * Performance: + + - More ARMv8/AArch32 improvements for AES, GCM, SHA-256, and SHA-1. + + - Add ARMv8/AArch32 assembly implementation for Twofish and + Camellia. 
+ + - Add bulk processing implementation for ARMv8/AArch32. + + - Add Stribog OIDs. + + - Improve the DRBG performance and sync the code with the Linux + version. + + * Internal changes: + + - When secure memory is requested by the MPI functions or by + gcry_xmalloc_secure, they do not anymore lead to a fatal error if + the secure memory pool is used up. Instead new pools are + allocated as needed. These new pools are not protected against + being swapped out (mlock can't be used). However, these days + this is considered a minor issue and can easily be mitigated by + using encrypted swap space. + + * Bug fixes: + + - Fix GOST 28147 CryptoPro-B S-box. + + - Fix error code handling of mlock calls. + + Noteworthy changes in version 1.7.3 (2016-08-17) [C21/A1/R3] ------------------------------------------------ ----------------------------------------------------------------------- Summary of changes: NEWS | 35 +++++++++++++++++++++++++++++++++++ 1 file changed, 35 insertions(+) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Sat Dec 10 12:13:48 2016 From: cvs at cvs.gnupg.org (by Jussi Kivilinna) Date: Sat, 10 Dec 2016 12:13:48 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-40-gc83d0d2 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". 
The branch, master has been updated via c83d0d2a26059cf471d09f5cb8e7fc5d76c4907b (commit) via 2b7b227b8a0bd5ff286258bc187782efac180a7e (commit) via 5c418e597f0f20a546d953161695e6caf1f57689 (commit) via 2d2e5286d53e1f62fe040dff4c6e01961f00afe2 (commit) via 161d339f48c03be7fd0f4249d730f7f1767ef8e4 (commit) via 0b03b658bebc69a84d87ef13f9b60a27b0c42305 (commit) from e232c12efac539b2fd2a4472259693b4364a349e (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit c83d0d2a26059cf471d09f5cb8e7fc5d76c4907b Author: Jussi Kivilinna Date: Sat Dec 10 12:29:12 2016 +0200 hwfeatures: add 'all' for disabling all hardware features * .gitignore: Add 'tests/basic-disable-all-hwf'. * configure.ac: Ditto. * tests/Makefile.am: Ditto. * src/hwfeatures.c (_gcry_disable_hw_feature): Match 'all' for masking all HW features off. (parse_hwf_deny_file): Use '_gcry_disable_hw_feature' for matching. * tests/basic-disable-all-hwf.in: New. -- Also add new test to run 'basic' with all HWF disabled. With current assembly implementations and build servers using new CPUs, generic implementations are not being tested enough anymore and compiler problems might go unnoticed.
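The name matching described in the changelog above can be sketched as follows. This is a simplified illustration and not the code from src/hwfeatures.c: the table contents, the names hwf_table, disabled_mask and disable_hw_feature, and the -1 error value are assumptions made for the example (the real function returns a gpg error code).

```c
#include <string.h>

/* Hypothetical feature table; the real list lives in src/hwfeatures.c.  */
struct hwf_entry
{
  unsigned int flag;
  const char *desc;
};

static const struct hwf_entry hwf_table[] =
  {
    { 1u << 0, "feature-a" },
    { 1u << 1, "feature-b" },
    { 1u << 2, "feature-c" }
  };

/* Bit mask of features that must not be used.  */
unsigned int disabled_mask;

/* Disable the feature NAME.  The special name "all" masks every
   feature off.  Returns 0 on success and -1 for an unknown name.  */
int
disable_hw_feature (const char *name)
{
  size_t i;

  if (!strcmp (name, "all"))
    {
      disabled_mask = ~0u;
      return 0;
    }

  for (i = 0; i < sizeof hwf_table / sizeof hwf_table[0]; i++)
    if (!strcmp (hwf_table[i].desc, name))
      {
        disabled_mask |= hwf_table[i].flag;
        return 0;
      }

  return -1;
}
```

Routing both the command line option and the deny file through one function, as the commit does, means the 'all' shortcut only has to be implemented once.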
Signed-off-by: Jussi Kivilinna diff --git a/.gitignore b/.gitignore index 3cd83a2..5d481aa 100644 --- a/.gitignore +++ b/.gitignore @@ -73,6 +73,7 @@ tests/ac-data tests/ac-schemes tests/aeswrap tests/basic +tests/basic-disable-all-hwf tests/bench-slope tests/benchmark tests/curves diff --git a/configure.ac b/configure.ac index 17ff407..91562a9 100644 --- a/configure.ac +++ b/configure.ac @@ -2555,6 +2555,7 @@ src/versioninfo.rc tests/Makefile ]) AC_CONFIG_FILES([tests/hashtest-256g], [chmod +x tests/hashtest-256g]) +AC_CONFIG_FILES([tests/basic-disable-all-hwf], [chmod +x tests/basic-disable-all-hwf]) AC_OUTPUT diff --git a/src/hwfeatures.c b/src/hwfeatures.c index 07221e8..99aba34 100644 --- a/src/hwfeatures.c +++ b/src/hwfeatures.c @@ -83,6 +83,12 @@ _gcry_disable_hw_feature (const char *name) { int i; + if (!strcmp(name, "all")) + { + disabled_hw_features = ~0; + return 0; + } + for (i=0; i < DIM (hwflist); i++) if (!strcmp (hwflist[i].desc, name)) { @@ -159,15 +165,7 @@ parse_hwf_deny_file (void) if (!*p || *p == '#') continue; - for (i=0; i < DIM (hwflist); i++) - { - if (!strcmp (hwflist[i].desc, p)) - { - disabled_hw_features |= hwflist[i].flag; - break; - } - } - if (i == DIM (hwflist)) + if (_gcry_disable_hw_feature (p) == GPG_ERR_INV_NAME) { #ifdef HAVE_SYSLOG syslog (LOG_USER|LOG_WARNING, diff --git a/tests/Makefile.am b/tests/Makefile.am index 374e72e..db51cbd 100644 --- a/tests/Makefile.am +++ b/tests/Makefile.am @@ -26,7 +26,7 @@ tests_bin = \ tests_bin_last = benchmark bench-slope -tests_sh = +tests_sh = basic-disable-all-hwf tests_sh_last = hashtest-256g @@ -58,7 +58,8 @@ noinst_HEADERS = t-common.h EXTRA_DIST = README rsa-16k.key cavs_tests.sh cavs_driver.pl \ pkcs1v2-oaep.h pkcs1v2-pss.h pkcs1v2-v15c.h pkcs1v2-v15s.h \ t-ed25519.inp stopwatch.h hashtest-256g.in \ - sha3-224.h sha3-256.h sha3-384.h sha3-512.h + sha3-224.h sha3-256.h sha3-384.h sha3-512.h \ + basic-disable-all-hwf.in LDADD = $(standard_ldadd) $(GPG_ERROR_LIBS) t_lock_LDADD = 
$(standard_ldadd) $(GPG_ERROR_MT_LIBS) diff --git a/tests/basic-disable-all-hwf.in b/tests/basic-disable-all-hwf.in new file mode 100644 index 0000000..1f0a4de --- /dev/null +++ b/tests/basic-disable-all-hwf.in @@ -0,0 +1,4 @@ +#!/bin/sh + +echo " now running 'basic' test with all hardware features disabled." +exec ./basic@EXEEXT@ --disable-hwf all commit 2b7b227b8a0bd5ff286258bc187782efac180a7e Author: Jussi Kivilinna Date: Sat Dec 10 12:29:12 2016 +0200 tests/hashtest-256g: add missing executable extension for Win32 * tests/hashtest-256g.in: Add @EXEEXT@. -- Signed-off-by: Jussi Kivilinna diff --git a/tests/hashtest-256g.in b/tests/hashtest-256g.in index e897c54..92b1c1b 100755 --- a/tests/hashtest-256g.in +++ b/tests/hashtest-256g.in @@ -4,4 +4,4 @@ algos="SHA1 SHA256 SHA512" test "@RUN_LARGE_DATA_TESTS@" = yes || exit 77 echo " now running 256 GiB tests for $algos - this takes looong" -exec ./hashtest --gigs 256 $algos +exec ./hashtest@EXEEXT@ --gigs 256 $algos commit 5c418e597f0f20a546d953161695e6caf1f57689 Author: Jussi Kivilinna Date: Sat Dec 10 12:29:12 2016 +0200 OCB ARM CE: Move ocb_get_l handling to assembly part * cipher/rijndael-armv8-aarch32-ce.S: Add OCB 'L_{ntz(i)}' calculation. * cipher/rijndael-armv8-aarch64-ce.S: Ditto. * cipher/rijndael-armv8-ce.c (_gcry_aes_ocb_enc_armv8_ce) (_gcry_aes_ocb_dec_armv8_ce, _gcry_aes_ocb_auth_armv8_ce) (ocb_crypt_fn_t): Updated arguments. (_gcry_aes_armv8_ce_ocb_crypt, _gcry_aes_armv8_ce_ocb_auth): Remove 'ocb_get_l' handling and splitting input to 32 block chunks, instead pass full buffers to assembly.
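The 'L_{ntz(i)}' calculation that moves into assembly relies on the identity ntz(i) = clz(rbit(i)): reversing the bits of i turns its trailing zeros into leading zeros. A portable C sketch of that idea follows; the function names are hypothetical, and the 16-byte entry size is an assumption taken from the 'lsl #4' in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit word, like the ARM 'rbit' instruction.  */
uint32_t
rbit32 (uint32_t x)
{
  x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
  x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
  x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);
  x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);
  return (x >> 16) | (x << 16);
}

/* Count leading zero bits, like the ARM 'clz' instruction.
   Returns 32 for X == 0.  */
unsigned int
clz32 (uint32_t x)
{
  unsigned int n = 0;

  if (!x)
    return 32;
  while (!(x & 0x80000000u))
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Number of trailing zero bits of the block index I (I != 0).  */
unsigned int
ntz32 (uint32_t i)
{
  return clz32 (rbit32 (i));
}

/* Byte offset of L_{ntz(i)} inside a table of 16-byte entries; this
   mirrors the 'add r8, r5, r8, lsl #4' step of the patch.  */
size_t
l_table_offset (uint32_t i)
{
  return (size_t) ntz32 (i) << 4;
}
```

In the patch the counter register is incremented once at entry, so the first index fed into rbit is blkn + 1, which matches the 1-based block numbering used by OCB.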
-- Performance on Cortex-A53 (AArch32): Before: AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.63 ns/B 583.8 MiB/s 1.88 c/B OCB dec | 1.67 ns/B 572.1 MiB/s 1.92 c/B OCB auth | 1.33 ns/B 717.1 MiB/s 1.53 c/B After (~12% faster): AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.47 ns/B 650.2 MiB/s 1.69 c/B OCB dec | 1.48 ns/B 644.5 MiB/s 1.70 c/B OCB auth | 1.19 ns/B 798.2 MiB/s 1.38 c/B Performance on Cortex-A53 (AArch64): Before: AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.29 ns/B 738.5 MiB/s 1.49 c/B OCB dec | 1.32 ns/B 723.5 MiB/s 1.52 c/B OCB auth | 1.15 ns/B 827.0 MiB/s 1.33 c/B After (~8% faster): AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 1.21 ns/B 789.1 MiB/s 1.39 c/B OCB dec | 1.21 ns/B 789.2 MiB/s 1.39 c/B OCB auth | 1.10 ns/B 867.0 MiB/s 1.27 c/B Signed-off-by: Jussi Kivilinna diff --git a/cipher/rijndael-armv8-aarch32-ce.S b/cipher/rijndael-armv8-aarch32-ce.S index bf68f20..f375f67 100644 --- a/cipher/rijndael-armv8-aarch32-ce.S +++ b/cipher/rijndael-armv8-aarch32-ce.S @@ -1021,9 +1021,10 @@ _gcry_aes_ctr_enc_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1039,6 +1040,7 @@ _gcry_aes_ocb_enc_armv8_ce: * %st+4: Ls => r5 * %st+8: nblocks => r6 (0 < nblocks <= 32) * %st+12: nrounds => r7 + * %st+16: blkn => lr */ vpush {q4-q7} @@ -1047,6 +1049,7 @@ _gcry_aes_ocb_enc_armv8_ce: ldr r4, [sp, #(104+0)] ldr r5, [sp, #(104+4)] ldr r6, [sp, #(104+8)] + ldr lr, [sp, #(104+16)] cmp r7, #12 vld1.8 {q0}, [r3] /* load offset */ @@ -1059,6 +1062,7 @@ _gcry_aes_ocb_enc_armv8_ce: #define OCB_ENC(bits, ...) 
\ .Locb_enc_entry_##bits: \ cmp r6, #4; \ + add lr, #1; \ blo .Locb_enc_loop_##bits; \ \ .Locb_enc_loop4_##bits: \ @@ -1067,7 +1071,23 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldm r5!, {r8, r9, r10, r11}; \ + add r9, lr, #1; \ + add r10, lr, #2; \ + add r11, lr, #3; \ + rbit r8, lr; \ + add lr, lr, #4; \ + rbit r9, r9; \ + rbit r10, r10; \ + rbit r11, r11; \ + clz r8, r8; /* ntz(i+0) */ \ + clz r9, r9; /* ntz(i+1) */ \ + clz r10, r10; /* ntz(i+2) */ \ + clz r11, r11; /* ntz(i+3) */ \ + add r8, r5, r8, lsl #4; \ + add r9, r5, r9, lsl #4; \ + add r10, r5, r10, lsl #4; \ + add r11, r5, r11, lsl #4; \ + \ sub r6, #4; \ \ vld1.8 {q9}, [r8]; /* load L_{ntz(i+0)} */ \ @@ -1120,7 +1140,11 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldr r8, [r5], #4; \ + rbit r8, lr; \ + add lr, #1; \ + clz r8, r8; /* ntz(i) */ \ + add r8, r5, r8, lsl #4; \ + \ vld1.8 {q1}, [r2]!; /* load plaintext */ \ vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ vld1.8 {q3}, [r4]; /* load checksum */ \ @@ -1171,9 +1195,10 @@ _gcry_aes_ocb_enc_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1189,6 +1214,7 @@ _gcry_aes_ocb_dec_armv8_ce: * %st+4: Ls => r5 * %st+8: nblocks => r6 (0 < nblocks <= 32) * %st+12: nrounds => r7 + * %st+16: blkn => lr */ vpush {q4-q7} @@ -1197,6 +1223,7 @@ _gcry_aes_ocb_dec_armv8_ce: ldr r4, [sp, #(104+0)] ldr r5, [sp, #(104+4)] ldr r6, [sp, #(104+8)] + ldr lr, [sp, #(104+16)] cmp r7, #12 vld1.8 {q0}, [r3] /* load offset */ @@ -1209,6 +1236,7 @@ _gcry_aes_ocb_dec_armv8_ce: #define OCB_DEC(bits, ...) 
\ .Locb_dec_entry_##bits: \ cmp r6, #4; \ + add lr, #1; \ blo .Locb_dec_loop_##bits; \ \ .Locb_dec_loop4_##bits: \ @@ -1217,7 +1245,23 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldm r5!, {r8, r9, r10, r11}; \ + add r9, lr, #1; \ + add r10, lr, #2; \ + add r11, lr, #3; \ + rbit r8, lr; \ + add lr, lr, #4; \ + rbit r9, r9; \ + rbit r10, r10; \ + rbit r11, r11; \ + clz r8, r8; /* ntz(i+0) */ \ + clz r9, r9; /* ntz(i+1) */ \ + clz r10, r10; /* ntz(i+2) */ \ + clz r11, r11; /* ntz(i+3) */ \ + add r8, r5, r8, lsl #4; \ + add r9, r5, r9, lsl #4; \ + add r10, r5, r10, lsl #4; \ + add r11, r5, r11, lsl #4; \ + \ sub r6, #4; \ \ vld1.8 {q9}, [r8]; /* load L_{ntz(i+0)} */ \ @@ -1270,7 +1314,11 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldr r8, [r5], #4; \ + rbit r8, lr; \ + add lr, #1; \ + clz r8, r8; /* ntz(i) */ \ + add r8, r5, r8, lsl #4; \ + \ vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ vld1.8 {q1}, [r2]!; /* load ciphertext */ \ subs r6, #1; \ @@ -1320,9 +1368,10 @@ _gcry_aes_ocb_dec_armv8_ce: * const unsigned char *abuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1337,6 +1386,7 @@ _gcry_aes_ocb_auth_armv8_ce: * %st+0: Ls => r5 * %st+4: nblocks => r6 (0 < nblocks <= 32) * %st+8: nrounds => r7 + * %st+12: blkn => lr */ vpush {q4-q7} @@ -1344,6 +1394,7 @@ _gcry_aes_ocb_auth_armv8_ce: ldr r7, [sp, #(104+8)] ldr r5, [sp, #(104+0)] ldr r6, [sp, #(104+4)] + ldr lr, [sp, #(104+12)] cmp r7, #12 vld1.8 {q0}, [r2] /* load offset */ @@ -1356,6 +1407,7 @@ _gcry_aes_ocb_auth_armv8_ce: #define OCB_AUTH(bits, ...) 
\ .Locb_auth_entry_##bits: \ cmp r6, #4; \ + add lr, #1; \ blo .Locb_auth_loop_##bits; \ \ .Locb_auth_loop4_##bits: \ @@ -1363,7 +1415,23 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldm r5!, {r8, r9, r10, r11}; \ + add r9, lr, #1; \ + add r10, lr, #2; \ + add r11, lr, #3; \ + rbit r8, lr; \ + add lr, lr, #4; \ + rbit r9, r9; \ + rbit r10, r10; \ + rbit r11, r11; \ + clz r8, r8; /* ntz(i+0) */ \ + clz r9, r9; /* ntz(i+1) */ \ + clz r10, r10; /* ntz(i+2) */ \ + clz r11, r11; /* ntz(i+3) */ \ + add r8, r5, r8, lsl #4; \ + add r9, r5, r9, lsl #4; \ + add r10, r5, r10, lsl #4; \ + add r11, r5, r11, lsl #4; \ + \ sub r6, #4; \ \ vld1.8 {q9}, [r8]; /* load L_{ntz(i+0)} */ \ @@ -1401,8 +1469,12 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldr r8, [r5], #4; \ - vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ + rbit r8, lr; \ + add lr, #1; \ + clz r8, r8; /* ntz(i) */ \ + add r8, r5, r8, lsl #4; \ + \ + vld1.8 {q2}, [r8]; /* load L_{ntz(i)} */ \ vld1.8 {q1}, [r1]!; /* load aadtext */ \ subs r6, #1; \ veor q0, q0, q2; \ diff --git a/cipher/rijndael-armv8-aarch64-ce.S b/cipher/rijndael-armv8-aarch64-ce.S index 21d0aec..1ebb363 100644 --- a/cipher/rijndael-armv8-aarch64-ce.S +++ b/cipher/rijndael-armv8-aarch64-ce.S @@ -28,23 +28,6 @@ .text -#if (SIZEOF_VOID_P == 4) - #define ptr8 w8 - #define ptr9 w9 - #define ptr10 w10 - #define ptr11 w11 - #define ptr_sz 4 -#elif (SIZEOF_VOID_P == 8) - #define ptr8 x8 - #define ptr9 x9 - #define ptr10 x10 - #define ptr11 x11 - #define ptr_sz 8 -#else - #error "missing SIZEOF_VOID_P" -#endif - - #define GET_DATA_POINTER(reg, name) \ adrp reg, :got:name ; \ ldr reg, [reg, #:got_lo12:name] ; @@ -855,9 +838,10 @@ _gcry_aes_cfb_dec_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char 
*L_table, * size_t nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -870,11 +854,13 @@ _gcry_aes_ocb_enc_armv8_ce: * x2: inbuf * x3: offset * x4: checksum - * x5: Ls + * x5: Ltable * x6: nblocks (0 < nblocks <= 32) * w7: nrounds + * %st+0: blkn => w12 */ + ldr w12, [sp] ld1 {v0.16b}, [x3] /* load offset */ ld1 {v16.16b}, [x4] /* load checksum */ @@ -886,6 +872,7 @@ _gcry_aes_ocb_enc_armv8_ce: #define OCB_ENC(bits, ...) \ .Locb_enc_entry_##bits: \ cmp x6, #4; \ + add x12, x12, #1; \ b.lo .Locb_enc_loop_##bits; \ \ .Locb_enc_loop4_##bits: \ @@ -894,10 +881,24 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldp ptr8, ptr9, [x5], #(ptr_sz*2); \ + add w9, w12, #1; \ + add w10, w12, #2; \ + add w11, w12, #3; \ + rbit w8, w12; \ + add w12, w12, #4; \ + rbit w9, w9; \ + rbit w10, w10; \ + rbit w11, w11; \ + clz w8, w8; /* ntz(i+0) */ \ + clz w9, w9; /* ntz(i+1) */ \ + clz w10, w10; /* ntz(i+2) */ \ + clz w11, w11; /* ntz(i+3) */ \ + add x8, x5, x8, lsl #4; \ + ld1 {v1.16b-v4.16b}, [x2], #64; /* load P_i+<0-3> */ \ + add x9, x5, x9, lsl #4; \ + add x10, x5, x10, lsl #4; \ + add x11, x5, x11, lsl #4; \ \ - ld1 {v1.16b-v4.16b}, [x2], #64; /* load P_i+<0-3> */ \ - ldp ptr10, ptr11, [x5], #(ptr_sz*2); \ sub x6, x6, #4; \ \ ld1 {v5.16b}, [x8]; /* load L_{ntz(i+0)} */ \ @@ -940,7 +941,11 @@ _gcry_aes_ocb_enc_armv8_ce: /* Checksum_i = Checksum_{i-1} xor P_i */ \ /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ \ \ - ldr ptr8, [x5], #(ptr_sz); \ + rbit x8, x12; \ + add x12, x12, #1; \ + clz x8, x8; /* ntz(i) */ \ + add x8, x5, x8, lsl #4; \ + \ ld1 {v1.16b}, [x2], #16; /* load plaintext */ \ ld1 {v2.16b}, [x8]; /* load L_{ntz(i)} */ \ sub x6, x6, #1; \ @@ -983,9 +988,10 @@ _gcry_aes_ocb_enc_armv8_ce: * const unsigned char *inbuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t 
nblocks, - * unsigned int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -998,11 +1004,13 @@ _gcry_aes_ocb_dec_armv8_ce: * x2: inbuf * x3: offset * x4: checksum - * x5: Ls + * x5: Ltable * x6: nblocks (0 < nblocks <= 32) * w7: nrounds + * %st+0: blkn => w12 */ + ldr w12, [sp] ld1 {v0.16b}, [x3] /* load offset */ ld1 {v16.16b}, [x4] /* load checksum */ @@ -1014,6 +1022,7 @@ _gcry_aes_ocb_dec_armv8_ce: #define OCB_DEC(bits) \ .Locb_dec_entry_##bits: \ cmp x6, #4; \ + add w12, w12, #1; \ b.lo .Locb_dec_loop_##bits; \ \ .Locb_dec_loop4_##bits: \ @@ -1022,10 +1031,24 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldp ptr8, ptr9, [x5], #(ptr_sz*2); \ + add w9, w12, #1; \ + add w10, w12, #2; \ + add w11, w12, #3; \ + rbit w8, w12; \ + add w12, w12, #4; \ + rbit w9, w9; \ + rbit w10, w10; \ + rbit w11, w11; \ + clz w8, w8; /* ntz(i+0) */ \ + clz w9, w9; /* ntz(i+1) */ \ + clz w10, w10; /* ntz(i+2) */ \ + clz w11, w11; /* ntz(i+3) */ \ + add x8, x5, x8, lsl #4; \ + ld1 {v1.16b-v4.16b}, [x2], #64; /* load C_i+<0-3> */ \ + add x9, x5, x9, lsl #4; \ + add x10, x5, x10, lsl #4; \ + add x11, x5, x11, lsl #4; \ \ - ld1 {v1.16b-v4.16b}, [x2], #64; /* load C_i+<0-3> */ \ - ldp ptr10, ptr11, [x5], #(ptr_sz*2); \ sub x6, x6, #4; \ \ ld1 {v5.16b}, [x8]; /* load L_{ntz(i+0)} */ \ @@ -1068,7 +1091,11 @@ _gcry_aes_ocb_dec_armv8_ce: /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ \ /* Checksum_i = Checksum_{i-1} xor P_i */ \ \ - ldr ptr8, [x5], #(ptr_sz); \ + rbit w8, w12; \ + add w12, w12, #1; \ + clz w8, w8; /* ntz(i) */ \ + add x8, x5, x8, lsl #4; \ + \ ld1 {v1.16b}, [x2], #16; /* load ciphertext */ \ ld1 {v2.16b}, [x8]; /* load L_{ntz(i)} */ \ sub x6, x6, #1; \ @@ -1110,9 +1137,10 @@ _gcry_aes_ocb_dec_armv8_ce: * const unsigned char *abuf, * unsigned char *offset, * unsigned char *checksum, - * void **Ls, + * unsigned char *L_table, * size_t nblocks, - * unsigned 
int nrounds); + * unsigned int nrounds, + * unsigned int blkn); */ .align 3 @@ -1124,10 +1152,12 @@ _gcry_aes_ocb_auth_armv8_ce: * x1: abuf * x2: offset => x3 * x3: checksum => x4 - * x4: Ls => x5 + * x4: Ltable => x5 * x5: nblocks => x6 (0 < nblocks <= 32) * w6: nrounds => w7 + * w7: blkn => w12 */ + mov x12, x7 mov x7, x6 mov x6, x5 mov x5, x4 @@ -1145,6 +1175,7 @@ _gcry_aes_ocb_auth_armv8_ce: #define OCB_AUTH(bits) \ .Locb_auth_entry_##bits: \ cmp x6, #4; \ + add w12, w12, #1; \ b.lo .Locb_auth_loop_##bits; \ \ .Locb_auth_loop4_##bits: \ @@ -1152,10 +1183,24 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldp ptr8, ptr9, [x5], #(ptr_sz*2); \ + add w9, w12, #1; \ + add w10, w12, #2; \ + add w11, w12, #3; \ + rbit w8, w12; \ + add w12, w12, #4; \ + rbit w9, w9; \ + rbit w10, w10; \ + rbit w11, w11; \ + clz w8, w8; /* ntz(i+0) */ \ + clz w9, w9; /* ntz(i+1) */ \ + clz w10, w10; /* ntz(i+2) */ \ + clz w11, w11; /* ntz(i+3) */ \ + add x8, x5, x8, lsl #4; \ + ld1 {v1.16b-v4.16b}, [x1], #64; /* load A_i+<0-3> */ \ + add x9, x5, x9, lsl #4; \ + add x10, x5, x10, lsl #4; \ + add x11, x5, x11, lsl #4; \ \ - ld1 {v1.16b-v4.16b}, [x1], #64; /* load A_i+<0-3> */ \ - ldp ptr10, ptr11, [x5], #(ptr_sz*2); \ sub x6, x6, #4; \ \ ld1 {v5.16b}, [x8]; /* load L_{ntz(i+0)} */ \ @@ -1192,7 +1237,11 @@ _gcry_aes_ocb_auth_armv8_ce: /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ \ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ \ \ - ldr ptr8, [x5], #(ptr_sz); \ + rbit w8, w12; \ + add w12, w12, #1; \ + clz w8, w8; /* ntz(i) */ \ + add x8, x5, x8, lsl #4; \ + \ ld1 {v1.16b}, [x1], #16; /* load aadtext */ \ ld1 {v2.16b}, [x8]; /* load L_{ntz(i)} */ \ sub x6, x6, #1; \ diff --git a/cipher/rijndael-armv8-ce.c b/cipher/rijndael-armv8-ce.c index 1bf74da..334cf68 100644 --- a/cipher/rijndael-armv8-ce.c +++ b/cipher/rijndael-armv8-ce.c @@ -80,30 +80,33 @@ extern void _gcry_aes_ocb_enc_armv8_ce 
(const void *keysched, const unsigned char *inbuf, unsigned char *offset, unsigned char *checksum, - void **Ls, + unsigned char *L_table, size_t nblocks, - unsigned int nrounds); + unsigned int nrounds, + unsigned int blkn); extern void _gcry_aes_ocb_dec_armv8_ce (const void *keysched, unsigned char *outbuf, const unsigned char *inbuf, unsigned char *offset, unsigned char *checksum, - void **Ls, + unsigned char *L_table, size_t nblocks, - unsigned int nrounds); + unsigned int nrounds, + unsigned int blkn); extern void _gcry_aes_ocb_auth_armv8_ce (const void *keysched, const unsigned char *abuf, unsigned char *offset, unsigned char *checksum, - void **Ls, + unsigned char *L_table, size_t nblocks, - unsigned int nrounds); + unsigned int nrounds, + unsigned int blkn); typedef void (*ocb_crypt_fn_t) (const void *keysched, unsigned char *outbuf, const unsigned char *inbuf, unsigned char *offset, unsigned char *checksum, - void **Ls, size_t nblocks, - unsigned int nrounds); + unsigned char *L_table, size_t nblocks, + unsigned int nrounds, unsigned int blkn); void _gcry_aes_armv8_ce_setkey (RIJNDAEL_context *ctx, const byte *key) @@ -334,62 +337,11 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, const unsigned char *inbuf = inbuf_arg; unsigned int nrounds = ctx->rounds; u64 blkn = c->u_mode.ocb.data_nblocks; - u64 blkn_offs = blkn - blkn % 32; - unsigned int n = 32 - blkn % 32; - void *Ls[32]; - void **l; - size_t i; c->u_mode.ocb.data_nblocks = blkn + nblocks; - if (nblocks >= 32) - { - for (i = 0; i < 32; i += 8) - { - Ls[(i + 0 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 1 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 2 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 3 + n) % 32] = (void *)c->u_mode.ocb.L[2]; - Ls[(i + 4 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 5 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 6 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - } - - Ls[(7 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - Ls[(15 + 
n) % 32] = (void *)c->u_mode.ocb.L[4]; - Ls[(23 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - l = &Ls[(31 + n) % 32]; - - /* Process data in 32 block chunks. */ - while (nblocks >= 32) - { - blkn_offs += 32; - *l = (void *)ocb_get_l(c, blkn_offs); - - crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, 32, - nrounds); - - nblocks -= 32; - outbuf += 32 * 16; - inbuf += 32 * 16; - } - - if (nblocks && l < &Ls[nblocks]) - { - *l = (void *)ocb_get_l(c, 32 + blkn_offs); - } - } - else - { - for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, ++blkn); - } - - if (nblocks) - { - crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, nblocks, - nrounds); - } + crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, + c->u_mode.ocb.L[0], nblocks, nrounds, (unsigned int)blkn); } void @@ -401,61 +353,12 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, const unsigned char *abuf = abuf_arg; unsigned int nrounds = ctx->rounds; u64 blkn = c->u_mode.ocb.aad_nblocks; - u64 blkn_offs = blkn - blkn % 32; - unsigned int n = 32 - blkn % 32; - void *Ls[32]; - void **l; - size_t i; c->u_mode.ocb.aad_nblocks = blkn + nblocks; - if (nblocks >= 32) - { - for (i = 0; i < 32; i += 8) - { - Ls[(i + 0 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 1 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 2 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 3 + n) % 32] = (void *)c->u_mode.ocb.L[2]; - Ls[(i + 4 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - Ls[(i + 5 + n) % 32] = (void *)c->u_mode.ocb.L[1]; - Ls[(i + 6 + n) % 32] = (void *)c->u_mode.ocb.L[0]; - } - - Ls[(7 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - Ls[(15 + n) % 32] = (void *)c->u_mode.ocb.L[4]; - Ls[(23 + n) % 32] = (void *)c->u_mode.ocb.L[3]; - l = &Ls[(31 + n) % 32]; - - /* Process data in 32 block chunks. 
*/ - while (nblocks >= 32) - { - blkn_offs += 32; - *l = (void *)ocb_get_l(c, blkn_offs); - - _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum, Ls, 32, nrounds); - - nblocks -= 32; - abuf += 32 * 16; - } - - if (nblocks && l < &Ls[nblocks]) - { - *l = (void *)ocb_get_l(c, 32 + blkn_offs); - } - } - else - { - for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, ++blkn); - } - - if (nblocks) - { - _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum, Ls, nblocks, nrounds); - } + _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, + c->u_mode.ocb.aad_sum, c->u_mode.ocb.L[0], + nblocks, nrounds, (unsigned int)blkn); } #endif /* USE_ARM_CE */ commit 2d2e5286d53e1f62fe040dff4c6e01961f00afe2 Author: Jussi Kivilinna Date: Sat Dec 10 12:29:12 2016 +0200 OCB: Move large L handling from bottom to upper level * cipher/cipher-ocb.c (_gcry_cipher_ocb_get_l): Remove. (ocb_get_L_big): New. (_gcry_cipher_ocb_authenticate): L-big handling done in upper processing loop, so that lower level never sees the case where 'aad_nblocks % 65536 == 0'; Add missing stack burn. (ocb_aad_finalize): Add missing stack burn. (ocb_crypt): L-big handling done in upper processing loop, so that lower level never sees the case where 'data_nblocks % 65536 == 0'. * cipher/cipher-internal.h (_gcry_cipher_ocb_get_l): Remove. (ocb_get_l): Remove 'l_tmp' usage and simplify since input is more limited now, 'N is not multiple of 65536'. * cipher/rijndael-aesni.c (get_l): Remove. (aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Remove l_tmp; Use 'ocb_get_l'. * cipher/rijndael-ssse3-amd64.c (get_l): Remove. (ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_auth): Remove l_tmp; Use 'ocb_get_l'. * cipher/camellia-glue.c: Remove OCB l_tmp usage. * cipher/rijndael-armv8-ce.c: Ditto. * cipher/rijndael.c: Ditto. * cipher/serpent.c: Ditto. * cipher/twofish.c: Ditto. 
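The L-big case named in the changelog above only occurs when the block number is a multiple of 65536, so the needed value lies beyond the precomputed table and has to be derived by repeated doubling. A minimal sketch of that derivation, assuming the doubling rule from the OCB specification (RFC 7253) and using hypothetical names (double_block here is a stand-alone re-implementation, get_l_big is not the libgcrypt function):

```c
#include <string.h>

#define BLOCK_LEN     16  /* OCB block size in bytes.  */
#define L_TABLE_SIZE  16  /* Table holds L_0 .. L_15.  */

/* Double a 128-bit big-endian block in GF(2^128): shift left by one
   bit and, if a bit fell off the top, xor 0x87 into the last byte.  */
void
double_block (unsigned char *b)
{
  unsigned char carry = b[0] >> 7;
  int i;

  for (i = 0; i < BLOCK_LEN - 1; i++)
    b[i] = (unsigned char) ((b[i] << 1) | (b[i + 1] >> 7));
  b[BLOCK_LEN - 1] = (unsigned char) ((b[BLOCK_LEN - 1] << 1)
                                      ^ (carry ? 0x87 : 0));
}

/* Hypothetical helper: compute L_{ntz} for NTZ >= L_TABLE_SIZE into
   OUT, starting from L_LAST, the last table entry L_{L_TABLE_SIZE-1}.  */
void
get_l_big (unsigned char *out, const unsigned char *l_last, unsigned int ntz)
{
  memcpy (out, l_last, BLOCK_LEN);
  double_block (out);                   /* Now L_{L_TABLE_SIZE}.  */
  for (ntz -= L_TABLE_SIZE; ntz; ntz--)
    double_block (out);                 /* One more doubling per step.  */
}
```

Keeping this slow path in the top-level loop is what lets the per-block lookup become a plain table index, since the lower levels then never see an index outside the table.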
-- Move large L value generation to up-most level to simplify lower level ocb_get_l for greater performance and simpler implementation. This helps implementing OCB in assembly as 'ocb_get_l' no longer has function call on slow-path. Signed-off-by: Jussi Kivilinna diff --git a/cipher/camellia-glue.c b/cipher/camellia-glue.c index 1be35c9..7687094 100644 --- a/cipher/camellia-glue.c +++ b/cipher/camellia-glue.c @@ -619,7 +619,6 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, CAMELLIA_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - unsigned char l_tmp[CAMELLIA_BLOCK_SIZE]; int burn_stack_depth; u64 blkn = c->u_mode.ocb.data_nblocks; @@ -664,9 +663,8 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. */ blkn += 32; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 32); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 32); if (encrypt) _gcry_camellia_aesni_avx2_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -725,9 +723,8 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); if (encrypt) _gcry_camellia_aesni_avx_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -759,8 +756,6 @@ _gcry_camellia_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #if defined(USE_AESNI_AVX) || defined(USE_AESNI_AVX2) c->u_mode.ocb.data_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif @@ -776,7 +771,6 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AESNI_AVX) || defined(USE_AESNI_AVX2) CAMELLIA_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; - unsigned char l_tmp[CAMELLIA_BLOCK_SIZE]; int burn_stack_depth; u64 blkn = c->u_mode.ocb.aad_nblocks; @@ -818,9 +812,8 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. */ blkn += 32; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 32); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 32); _gcry_camellia_aesni_avx2_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, @@ -875,9 +868,8 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); _gcry_camellia_aesni_avx_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, @@ -905,8 +897,6 @@ _gcry_camellia_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AESNI_AVX) || defined(USE_AESNI_AVX2) c->u_mode.ocb.aad_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif diff --git a/cipher/cipher-internal.h b/cipher/cipher-internal.h index 01352f3..7204d48 100644 --- a/cipher/cipher-internal.h +++ b/cipher/cipher-internal.h @@ -459,28 +459,28 @@ gcry_err_code_t _gcry_cipher_ocb_get_tag gcry_err_code_t _gcry_cipher_ocb_check_tag /* */ (gcry_cipher_hd_t c, const unsigned char *intag, size_t taglen); -const unsigned char *_gcry_cipher_ocb_get_l -/* */ (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 n); -/* Inline version of _gcry_cipher_ocb_get_l, with hard-coded fast paths for - most common cases. */ +/* Return the L-value for block N. Note: 'cipher_ocb.c' ensures that N + * will never be multiple of 65536 (1 << OCB_L_TABLE_SIZE), thus N can + * be directly passed to _gcry_ctz() function and resulting index will + * never overflow the table. */ static inline const unsigned char * -ocb_get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 n) +ocb_get_l (gcry_cipher_hd_t c, u64 n) { - if (n & 1) - return c->u_mode.ocb.L[0]; - else if (n & 2) - return c->u_mode.ocb.L[1]; - else - { - unsigned int ntz = _gcry_ctz64 (n); - - if (ntz < OCB_L_TABLE_SIZE) - return c->u_mode.ocb.L[ntz]; - else - return _gcry_cipher_ocb_get_l (c, l_tmp, n); - } + unsigned long ntz; + +#if ((defined(__i386__) || defined(__x86_64__)) && __GNUC__ >= 4) + /* Assumes that N != 0. 
*/
+  asm ("rep;bsfl %k[low], %k[ntz]\n\t"
+       : [ntz] "=r" (ntz)
+       : [low] "r" ((unsigned long)n)
+       : "cc");
+#else
+  ntz = _gcry_ctz (n);
+#endif
+
+  return c->u_mode.ocb.L[ntz];
 }

 #endif /*G10_CIPHER_INTERNAL_H*/
diff --git a/cipher/cipher-ocb.c b/cipher/cipher-ocb.c
index d1f01d5..db42aaf 100644
--- a/cipher/cipher-ocb.c
+++ b/cipher/cipher-ocb.c
@@ -109,25 +109,17 @@ bit_copy (unsigned char *d, const unsigned char *s,
 }


-/* Return the L-value for block N. In most cases we use the table;
-   only if the lower OCB_L_TABLE_SIZE bits of N are zero we need to
-   compute it. With a table size of 16 we need to this this only
-   every 65536-th block. L_TMP is a helper buffer of size
-   OCB_BLOCK_LEN which is used to hold the computation if not taken
-   from the table. */
-const unsigned char *
-_gcry_cipher_ocb_get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 n)
+/* Get L_big value for block N, where N is multiple of 65536. */
+static void
+ocb_get_L_big (gcry_cipher_hd_t c, u64 n, unsigned char *l_buf)
 {
   int ntz = _gcry_ctz64 (n);

-  if (ntz < OCB_L_TABLE_SIZE)
-    return c->u_mode.ocb.L[ntz];
+  gcry_assert(ntz >= OCB_L_TABLE_SIZE);

-  double_block_cpy (l_tmp, c->u_mode.ocb.L[OCB_L_TABLE_SIZE - 1]);
+  double_block_cpy (l_buf, c->u_mode.ocb.L[OCB_L_TABLE_SIZE - 1]);
   for (ntz -= OCB_L_TABLE_SIZE; ntz; ntz--)
-    double_block (l_tmp);
-
-  return l_tmp;
+    double_block (l_buf);
 }
@@ -241,7 +233,11 @@ gcry_err_code_t
 _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf,
                                size_t abuflen)
 {
+  const size_t table_maxblks = 1 << OCB_L_TABLE_SIZE;
+  const u32 table_size_mask = ((1 << OCB_L_TABLE_SIZE) - 1);
   unsigned char l_tmp[OCB_BLOCK_LEN];
+  unsigned int burn = 0;
+  unsigned int nburn;

   /* Check that a nonce and thus a key has been set and that we have
      not yet computed the tag.
We also return an error if the aad has @@ -264,14 +260,24 @@ _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, { c->u_mode.ocb.aad_nblocks++; + if ((c->u_mode.ocb.aad_nblocks % table_maxblks) == 0) + { + /* Table overflow, L needs to be generated. */ + ocb_get_L_big(c, c->u_mode.ocb.aad_nblocks + 1, l_tmp); + } + else + { + buf_cpy (l_tmp, ocb_get_l (c, c->u_mode.ocb.aad_nblocks), + OCB_BLOCK_LEN); + } + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ - buf_xor_1 (c->u_mode.ocb.aad_offset, - ocb_get_l (c, l_tmp, c->u_mode.ocb.aad_nblocks), - OCB_BLOCK_LEN); + buf_xor_1 (c->u_mode.ocb.aad_offset, l_tmp, OCB_BLOCK_LEN); /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ buf_xor (l_tmp, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_leftover, OCB_BLOCK_LEN); - c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); c->u_mode.ocb.aad_nleftover = 0; @@ -279,40 +285,83 @@ _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, } if (!abuflen) - return 0; - - /* Use a bulk method if available. */ - if (abuflen >= OCB_BLOCK_LEN && c->bulk.ocb_auth) { - size_t nblks; - size_t nleft; - size_t ndone; + if (burn > 0) + _gcry_burn_stack (burn + 4*sizeof(void*)); - nblks = abuflen / OCB_BLOCK_LEN; - nleft = c->bulk.ocb_auth (c, abuf, nblks); - ndone = nblks - nleft; - - abuf += ndone * OCB_BLOCK_LEN; - abuflen -= ndone * OCB_BLOCK_LEN; - nblks = nleft; + return 0; } - /* Hash all full blocks. */ + /* Full blocks handling. 
*/ while (abuflen >= OCB_BLOCK_LEN) { - c->u_mode.ocb.aad_nblocks++; + size_t nblks = abuflen / OCB_BLOCK_LEN; + size_t nmaxblks; - /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ - buf_xor_1 (c->u_mode.ocb.aad_offset, - ocb_get_l (c, l_tmp, c->u_mode.ocb.aad_nblocks), - OCB_BLOCK_LEN); - /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ - buf_xor (l_tmp, c->u_mode.ocb.aad_offset, abuf, OCB_BLOCK_LEN); - c->spec->encrypt (&c->context.c, l_tmp, l_tmp); - buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); + /* Check how many blocks to process till table overflow. */ + nmaxblks = (c->u_mode.ocb.aad_nblocks + 1) % table_maxblks; + nmaxblks = (table_maxblks - nmaxblks) % table_maxblks; + + if (nmaxblks == 0) + { + /* Table overflow, generate L and process one block. */ + c->u_mode.ocb.aad_nblocks++; + ocb_get_L_big(c, c->u_mode.ocb.aad_nblocks, l_tmp); + + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ + buf_xor_1 (c->u_mode.ocb.aad_offset, l_tmp, OCB_BLOCK_LEN); + /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ + buf_xor (l_tmp, c->u_mode.ocb.aad_offset, abuf, OCB_BLOCK_LEN); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; + buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); + + abuf += OCB_BLOCK_LEN; + abuflen -= OCB_BLOCK_LEN; + nblks--; + + /* With overflow handled, retry loop again. Next overflow will + * happen after 65535 blocks. */ + continue; + } + + nblks = nblks < nmaxblks ? nblks : nmaxblks; + + /* Use a bulk method if available. */ + if (nblks && c->bulk.ocb_auth) + { + size_t nleft; + size_t ndone; + + nleft = c->bulk.ocb_auth (c, abuf, nblks); + ndone = nblks - nleft; + + abuf += ndone * OCB_BLOCK_LEN; + abuflen -= ndone * OCB_BLOCK_LEN; + nblks = nleft; + } + + /* Hash all full blocks. 
*/ + while (nblks) + { + c->u_mode.ocb.aad_nblocks++; + + gcry_assert(c->u_mode.ocb.aad_nblocks & table_size_mask); + + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ + buf_xor_1 (c->u_mode.ocb.aad_offset, + ocb_get_l (c, c->u_mode.ocb.aad_nblocks), + OCB_BLOCK_LEN); + /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ + buf_xor (l_tmp, c->u_mode.ocb.aad_offset, abuf, OCB_BLOCK_LEN); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; + buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); - abuf += OCB_BLOCK_LEN; - abuflen -= OCB_BLOCK_LEN; + abuf += OCB_BLOCK_LEN; + abuflen -= OCB_BLOCK_LEN; + nblks--; + } } /* Store away the remaining data. */ @@ -321,6 +370,9 @@ _gcry_cipher_ocb_authenticate (gcry_cipher_hd_t c, const unsigned char *abuf, c->u_mode.ocb.aad_leftover[c->u_mode.ocb.aad_nleftover++] = *abuf; gcry_assert (!abuflen); + if (burn > 0) + _gcry_burn_stack (burn + 4*sizeof(void*)); + return 0; } @@ -330,6 +382,8 @@ static void ocb_aad_finalize (gcry_cipher_hd_t c) { unsigned char l_tmp[OCB_BLOCK_LEN]; + unsigned int burn = 0; + unsigned int nburn; /* Check that a nonce and thus a key has been set and that we have not yet computed the tag. We also skip this if the aad has been @@ -352,7 +406,8 @@ ocb_aad_finalize (gcry_cipher_hd_t c) l_tmp[c->u_mode.ocb.aad_nleftover] = 0x80; buf_xor_1 (l_tmp, c->u_mode.ocb.aad_offset, OCB_BLOCK_LEN); /* Sum = Sum_m xor ENCIPHER(K, CipherInput) */ - c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + nburn = c->spec->encrypt (&c->context.c, l_tmp, l_tmp); + burn = nburn > burn ? nburn : burn; buf_xor_1 (c->u_mode.ocb.aad_sum, l_tmp, OCB_BLOCK_LEN); c->u_mode.ocb.aad_nleftover = 0; @@ -361,6 +416,9 @@ ocb_aad_finalize (gcry_cipher_hd_t c) /* Mark AAD as finalized so that gcry_cipher_ocb_authenticate can * return an erro when called again. 
*/ c->u_mode.ocb.aad_finalized = 1; + + if (burn > 0) + _gcry_burn_stack (burn + 4*sizeof(void*)); } @@ -387,10 +445,13 @@ ocb_crypt (gcry_cipher_hd_t c, int encrypt, unsigned char *outbuf, size_t outbuflen, const unsigned char *inbuf, size_t inbuflen) { + const size_t table_maxblks = 1 << OCB_L_TABLE_SIZE; + const u32 table_size_mask = ((1 << OCB_L_TABLE_SIZE) - 1); unsigned char l_tmp[OCB_BLOCK_LEN]; unsigned int burn = 0; unsigned int nburn; - size_t nblks = inbuflen / OCB_BLOCK_LEN; + gcry_cipher_encrypt_t crypt_fn = + encrypt ? c->spec->encrypt : c->spec->decrypt; /* Check that a nonce and thus a key has been set and that we are not yet in end of data state. */ @@ -407,58 +468,112 @@ ocb_crypt (gcry_cipher_hd_t c, int encrypt, else if ((inbuflen % OCB_BLOCK_LEN)) return GPG_ERR_INV_LENGTH; /* We support only full blocks for now. */ - /* Use a bulk method if available. */ - if (nblks && c->bulk.ocb_crypt) - { - size_t nleft; - size_t ndone; - - nleft = c->bulk.ocb_crypt (c, outbuf, inbuf, nblks, encrypt); - ndone = nblks - nleft; - - inbuf += ndone * OCB_BLOCK_LEN; - outbuf += ndone * OCB_BLOCK_LEN; - inbuflen -= ndone * OCB_BLOCK_LEN; - outbuflen -= ndone * OCB_BLOCK_LEN; - nblks = nleft; - } - - if (nblks) + /* Full blocks handling. */ + while (inbuflen >= OCB_BLOCK_LEN) { - gcry_cipher_encrypt_t crypt_fn = - encrypt ? c->spec->encrypt : c->spec->decrypt; + size_t nblks = inbuflen / OCB_BLOCK_LEN; + size_t nmaxblks; - if (encrypt) - { - /* Checksum_i = Checksum_{i-1} xor P_i */ - ocb_checksum (c->u_ctr.ctr, inbuf, nblks); - } + /* Check how many blocks to process till table overflow. */ + nmaxblks = (c->u_mode.ocb.data_nblocks + 1) % table_maxblks; + nmaxblks = (table_maxblks - nmaxblks) % table_maxblks; - /* Encrypt all full blocks. */ - while (inbuflen >= OCB_BLOCK_LEN) + if (nmaxblks == 0) { + /* Table overflow, generate L and process one block. 
*/ c->u_mode.ocb.data_nblocks++; + ocb_get_L_big(c, c->u_mode.ocb.data_nblocks, l_tmp); + + if (encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, inbuf, 1); + } /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ - buf_xor_1 (c->u_iv.iv, - ocb_get_l (c, l_tmp, c->u_mode.ocb.data_nblocks), - OCB_BLOCK_LEN); + buf_xor_1 (c->u_iv.iv, l_tmp, OCB_BLOCK_LEN); /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ buf_xor (outbuf, c->u_iv.iv, inbuf, OCB_BLOCK_LEN); nburn = crypt_fn (&c->context.c, outbuf, outbuf); burn = nburn > burn ? nburn : burn; buf_xor_1 (outbuf, c->u_iv.iv, OCB_BLOCK_LEN); + if (!encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, outbuf, 1); + } + inbuf += OCB_BLOCK_LEN; inbuflen -= OCB_BLOCK_LEN; outbuf += OCB_BLOCK_LEN; outbuflen =- OCB_BLOCK_LEN; + nblks--; + + /* With overflow handled, retry loop again. Next overflow will + * happen after 65535 blocks. */ + continue; + } + + nblks = nblks < nmaxblks ? nblks : nmaxblks; + + /* Use a bulk method if available. */ + if (nblks && c->bulk.ocb_crypt) + { + size_t nleft; + size_t ndone; + + nleft = c->bulk.ocb_crypt (c, outbuf, inbuf, nblks, encrypt); + ndone = nblks - nleft; + + inbuf += ndone * OCB_BLOCK_LEN; + outbuf += ndone * OCB_BLOCK_LEN; + inbuflen -= ndone * OCB_BLOCK_LEN; + outbuflen -= ndone * OCB_BLOCK_LEN; + nblks = nleft; } - if (!encrypt) + if (nblks) { - /* Checksum_i = Checksum_{i-1} xor P_i */ - ocb_checksum (c->u_ctr.ctr, outbuf - nblks * OCB_BLOCK_LEN, nblks); + size_t nblks_chksum = nblks; + + if (encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, inbuf, nblks_chksum); + } + + /* Encrypt all full blocks. 
*/ + while (nblks) + { + c->u_mode.ocb.data_nblocks++; + + gcry_assert(c->u_mode.ocb.data_nblocks & table_size_mask); + + /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ + buf_xor_1 (c->u_iv.iv, + ocb_get_l (c, c->u_mode.ocb.data_nblocks), + OCB_BLOCK_LEN); + /* C_i = Offset_i xor ENCIPHER(K, P_i xor Offset_i) */ + buf_xor (outbuf, c->u_iv.iv, inbuf, OCB_BLOCK_LEN); + nburn = crypt_fn (&c->context.c, outbuf, outbuf); + burn = nburn > burn ? nburn : burn; + buf_xor_1 (outbuf, c->u_iv.iv, OCB_BLOCK_LEN); + + inbuf += OCB_BLOCK_LEN; + inbuflen -= OCB_BLOCK_LEN; + outbuf += OCB_BLOCK_LEN; + outbuflen =- OCB_BLOCK_LEN; + nblks--; + } + + if (!encrypt) + { + /* Checksum_i = Checksum_{i-1} xor P_i */ + ocb_checksum (c->u_ctr.ctr, + outbuf - nblks_chksum * OCB_BLOCK_LEN, + nblks_chksum); + } } } diff --git a/cipher/rijndael-aesni.c b/cipher/rijndael-aesni.c index 8b28b3a..7852e19 100644 --- a/cipher/rijndael-aesni.c +++ b/cipher/rijndael-aesni.c @@ -1331,74 +1331,10 @@ _gcry_aes_aesni_cbc_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, } -static inline const unsigned char * -get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 i, unsigned char *iv, - unsigned char *ctr) -{ - const unsigned char *l; - unsigned int ntz; - - if (i & 0xffffffffU) - { - asm ("rep;bsf %k[low], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [low] "r" (i & 0xffffffffU) - : "cc"); - } - else - { - if (OCB_L_TABLE_SIZE < 32) - { - ntz = 32; - } - else if (i) - { - asm ("rep;bsf %k[high], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [high] "r" (i >> 32) - : "cc"); - ntz += 32; - } - else - { - ntz = 64; - } - } - - if (ntz < OCB_L_TABLE_SIZE) - { - l = c->u_mode.ocb.L[ntz]; - } - else - { - /* Store Offset & Checksum before calling external function */ - asm volatile ("movdqu %%xmm5, %[iv]\n\t" - "movdqu %%xmm6, %[ctr]\n\t" - : [iv] "=m" (*iv), - [ctr] "=m" (*ctr) - : - : "memory" ); - - l = _gcry_cipher_ocb_get_l (c, l_tmp, i); - - /* Restore Offset & Checksum */ - asm volatile ("movdqu %[iv], %%xmm5\n\t" - "movdqu 
%[ctr], %%xmm6\n\t" - : /* No output */ - : [iv] "m" (*iv), - [ctr] "m" (*ctr) - : "memory" ); - } - - return l; -} - - static void aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -1420,7 +1356,7 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks && n % 4; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -1449,9 +1385,8 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks > 3 ; nblocks -= 4 ) { - /* l_tmp will be used only every 65536-th block. */ n += 4; - l = get_l(c, l_tmp.x1, n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -1522,7 +1457,7 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -1559,8 +1494,6 @@ aesni_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, aesni_cleanup (); aesni_cleanup_2_6 (); - - wipememory(&l_tmp, sizeof(l_tmp)); } @@ -1568,7 +1501,6 @@ static void aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -1589,7 +1521,7 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks && n % 4; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i 
= Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -1618,9 +1550,8 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks > 3 ; nblocks -= 4 ) { - /* l_tmp will be used only every 65536-th block. */ n += 4; - l = get_l(c, l_tmp.x1, n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -1691,7 +1622,7 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -1728,8 +1659,6 @@ aesni_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, aesni_cleanup (); aesni_cleanup_2_6 (); - - wipememory(&l_tmp, sizeof(l_tmp)); } @@ -1748,7 +1677,6 @@ void _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; u64 n = c->u_mode.ocb.aad_nblocks; @@ -1768,8 +1696,7 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, for ( ;nblocks && n % 4; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -1794,10 +1721,8 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, for ( ;nblocks > 3 ; nblocks -= 4 ) { - /* l_tmp will be used only every 65536-th block. 
*/ n += 4; - l = get_l(c, l_tmp.x1, n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum); + l = ocb_get_l(c, n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -1849,8 +1774,7 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, for ( ;nblocks; nblocks-- ) { - l = get_l(c, l_tmp.x1, ++n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -1883,8 +1807,6 @@ _gcry_aes_aesni_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, aesni_cleanup (); aesni_cleanup_2_6 (); - - wipememory(&l_tmp, sizeof(l_tmp)); } diff --git a/cipher/rijndael-armv8-ce.c b/cipher/rijndael-armv8-ce.c index bed4066..1bf74da 100644 --- a/cipher/rijndael-armv8-ce.c +++ b/cipher/rijndael-armv8-ce.c @@ -336,7 +336,6 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, u64 blkn = c->u_mode.ocb.data_nblocks; u64 blkn_offs = blkn - blkn % 32; unsigned int n = 32 - blkn % 32; - unsigned char l_tmp[16]; void *Ls[32]; void **l; size_t i; @@ -364,9 +363,8 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn_offs += 32; - *l = (void *)ocb_get_l(c, l_tmp, blkn_offs); + *l = (void *)ocb_get_l(c, blkn_offs); crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, 32, nrounds); @@ -378,13 +376,13 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, if (nblocks && l < &Ls[nblocks]) { - *l = (void *)ocb_get_l(c, l_tmp, 32 + blkn_offs); + *l = (void *)ocb_get_l(c, 32 + blkn_offs); } } else { for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, l_tmp, ++blkn); + Ls[i] = (void *)ocb_get_l(c, ++blkn); } if (nblocks) @@ -392,8 +390,6 @@ _gcry_aes_armv8_ce_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, crypt_fn(keysched, outbuf, inbuf, c->u_iv.iv, c->u_ctr.ctr, Ls, nblocks, nrounds); } - - wipememory(&l_tmp, sizeof(l_tmp)); } void @@ -407,7 +403,6 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, u64 blkn = c->u_mode.ocb.aad_nblocks; u64 blkn_offs = blkn - blkn % 32; unsigned int n = 32 - blkn % 32; - unsigned char l_tmp[16]; void *Ls[32]; void **l; size_t i; @@ -435,9 +430,8 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, /* Process data in 32 block chunks. */ while (nblocks >= 32) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn_offs += 32; - *l = (void *)ocb_get_l(c, l_tmp, blkn_offs); + *l = (void *)ocb_get_l(c, blkn_offs); _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls, 32, nrounds); @@ -448,13 +442,13 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, if (nblocks && l < &Ls[nblocks]) { - *l = (void *)ocb_get_l(c, l_tmp, 32 + blkn_offs); + *l = (void *)ocb_get_l(c, 32 + blkn_offs); } } else { for (i = 0; i < nblocks; i++) - Ls[i] = (void *)ocb_get_l(c, l_tmp, ++blkn); + Ls[i] = (void *)ocb_get_l(c, ++blkn); } if (nblocks) @@ -462,8 +456,6 @@ _gcry_aes_armv8_ce_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, _gcry_aes_ocb_auth_armv8_ce(keysched, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls, nblocks, nrounds); } - - wipememory(&l_tmp, sizeof(l_tmp)); } #endif /* USE_ARM_CE */ diff --git a/cipher/rijndael-ssse3-amd64.c b/cipher/rijndael-ssse3-amd64.c index 937d868..a8e89d4 100644 --- a/cipher/rijndael-ssse3-amd64.c +++ b/cipher/rijndael-ssse3-amd64.c @@ -527,92 +527,10 @@ _gcry_aes_ssse3_cbc_dec (RIJNDAEL_context *ctx, unsigned char *outbuf, } -static inline const unsigned char * -get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 i, unsigned char *iv, - unsigned char *ctr, const void **aes_const_ptr, - byte ssse3_state[SSSE3_STATE_SIZE], int encrypt) -{ - const unsigned char *l; - unsigned int ntz; - - if (i & 1) - return c->u_mode.ocb.L[0]; - else if (i & 2) - return c->u_mode.ocb.L[1]; - else if (i & 0xffffffffU) - { - asm ("rep;bsf %k[low], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [low] "r" (i & 0xffffffffU) - : "cc"); - } - else - { - if (OCB_L_TABLE_SIZE < 32) - { - ntz = 32; - } - else if (i) - { - asm ("rep;bsf %k[high], %k[ntz]\n\t" - : [ntz] "=r" (ntz) - : [high] "r" (i >> 32) - : "cc"); - ntz += 32; - } - else - { - ntz = 64; - } - } - - if (ntz < OCB_L_TABLE_SIZE) - { - l = c->u_mode.ocb.L[ntz]; - } - else - { - /* Store Offset & Checksum before calling external function */ - asm volatile 
("movdqu %%xmm7, %[iv]\n\t" - "movdqu %%xmm6, %[ctr]\n\t" - : [iv] "=m" (*iv), - [ctr] "=m" (*ctr) - : - : "memory" ); - - /* Restore SSSE3 state. */ - vpaes_ssse3_cleanup(); - - l = _gcry_cipher_ocb_get_l (c, l_tmp, i); - - /* Save SSSE3 state. */ - if (encrypt) - { - vpaes_ssse3_prepare_enc (*aes_const_ptr); - } - else - { - vpaes_ssse3_prepare_dec (*aes_const_ptr); - } - - /* Restore Offset & Checksum */ - asm volatile ("movdqu %[iv], %%xmm7\n\t" - "movdqu %[ctr], %%xmm6\n\t" - : /* No output */ - : [iv] "m" (*iv), - [ctr] "m" (*ctr) - : "memory" ); - } - - return l; -} - - static void ssse3_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -635,8 +553,7 @@ ssse3_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, { const unsigned char *l; - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr, &aes_const_ptr, - ssse3_state, 1); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Checksum_i = Checksum_{i-1} xor P_i */ @@ -671,7 +588,6 @@ ssse3_ocb_enc (gcry_cipher_hd_t c, void *outbuf_arg, : : "memory" ); - wipememory(&l_tmp, sizeof(l_tmp)); vpaes_ssse3_cleanup (); } @@ -679,7 +595,6 @@ static void ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; @@ -702,8 +617,7 @@ ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, { const unsigned char *l; - l = get_l(c, l_tmp.x1, ++n, c->u_iv.iv, c->u_ctr.ctr, &aes_const_ptr, - ssse3_state, 0); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* P_i = Offset_i xor DECIPHER(K, C_i xor Offset_i) */ @@ -738,7 +652,6 @@ 
ssse3_ocb_dec (gcry_cipher_hd_t c, void *outbuf_arg, : : "memory" ); - wipememory(&l_tmp, sizeof(l_tmp)); vpaes_ssse3_cleanup (); } @@ -758,7 +671,6 @@ void _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) { - union { unsigned char x1[16] ATTR_ALIGNED_16; u32 x32[4]; } l_tmp; RIJNDAEL_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; u64 n = c->u_mode.ocb.aad_nblocks; @@ -780,8 +692,7 @@ _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, { const unsigned char *l; - l = get_l(c, l_tmp.x1, ++n, c->u_mode.ocb.aad_offset, - c->u_mode.ocb.aad_sum, &aes_const_ptr, ssse3_state, 1); + l = ocb_get_l(c, ++n); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ /* Sum_i = Sum_{i-1} xor ENCIPHER(K, A_i xor Offset_i) */ @@ -812,7 +723,6 @@ _gcry_aes_ssse3_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, : : "memory" ); - wipememory(&l_tmp, sizeof(l_tmp)); vpaes_ssse3_cleanup (); } diff --git a/cipher/rijndael.c b/cipher/rijndael.c index cc6a722..66ea0f3 100644 --- a/cipher/rijndael.c +++ b/cipher/rijndael.c @@ -1353,7 +1353,7 @@ _gcry_aes_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.data_nblocks; - const unsigned char *l = ocb_get_l(c, l_tmp.x1, i); + const unsigned char *l = ocb_get_l(c, i); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ buf_xor_1 (c->u_iv.iv, l, BLOCKSIZE); @@ -1378,7 +1378,7 @@ _gcry_aes_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.data_nblocks; - const unsigned char *l = ocb_get_l(c, l_tmp.x1, i); + const unsigned char *l = ocb_get_l(c, i); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ buf_xor_1 (c->u_iv.iv, l, BLOCKSIZE); @@ -1445,7 +1445,7 @@ _gcry_aes_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, size_t nblocks) for ( ;nblocks; nblocks-- ) { u64 i = ++c->u_mode.ocb.aad_nblocks; - const unsigned char *l = ocb_get_l(c, l_tmp.x1, i); + const unsigned char *l = 
ocb_get_l(c, i); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ buf_xor_1 (c->u_mode.ocb.aad_offset, l, BLOCKSIZE); diff --git a/cipher/serpent.c b/cipher/serpent.c index ef19d3b..ea4b8ed 100644 --- a/cipher/serpent.c +++ b/cipher/serpent.c @@ -1235,7 +1235,6 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, serpent_context_t *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - unsigned char l_tmp[sizeof(serpent_block_t)]; int burn_stack_depth = 2 * sizeof (serpent_block_t); u64 blkn = c->u_mode.ocb.data_nblocks; #else @@ -1275,9 +1274,8 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. */ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); if (encrypt) _gcry_serpent_avx2_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -1327,9 +1325,8 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. */ blkn += 8; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 8); if (encrypt) _gcry_serpent_sse2_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -1378,9 +1375,8 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 8; - *l = ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = ocb_get_l(c, blkn - blkn % 8); if (encrypt) _gcry_serpent_neon_ocb_enc(ctx, outbuf, inbuf, c->u_iv.iv, @@ -1410,8 +1406,6 @@ _gcry_serpent_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #if defined(USE_AVX2) || defined(USE_SSE2) || defined(USE_NEON) c->u_mode.ocb.data_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif @@ -1427,7 +1421,6 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AVX2) || defined(USE_SSE2) || defined(USE_NEON) serpent_context_t *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; - unsigned char l_tmp[sizeof(serpent_block_t)]; int burn_stack_depth = 2 * sizeof(serpent_block_t); u64 blkn = c->u_mode.ocb.aad_nblocks; #else @@ -1465,9 +1458,8 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 16 block chunks. */ while (nblocks >= 16) { - /* l_tmp will be used only every 65536-th block. */ blkn += 16; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 16); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 16); _gcry_serpent_avx2_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls); @@ -1512,9 +1504,8 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. */ blkn += 8; - *l = (uintptr_t)(void *)ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = (uintptr_t)(void *)ocb_get_l(c, blkn - blkn % 8); _gcry_serpent_sse2_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls); @@ -1558,9 +1549,8 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 8 block chunks. */ while (nblocks >= 8) { - /* l_tmp will be used only every 65536-th block. 
*/ blkn += 8; - *l = ocb_get_l(c, l_tmp, blkn - blkn % 8); + *l = ocb_get_l(c, blkn - blkn % 8); _gcry_serpent_neon_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset, c->u_mode.ocb.aad_sum, Ls); @@ -1585,8 +1575,6 @@ _gcry_serpent_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #if defined(USE_AVX2) || defined(USE_SSE2) || defined(USE_NEON) c->u_mode.ocb.aad_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #endif diff --git a/cipher/twofish.c b/cipher/twofish.c index 7a4d26a..55f6fb9 100644 --- a/cipher/twofish.c +++ b/cipher/twofish.c @@ -1261,7 +1261,6 @@ _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, TWOFISH_context *ctx = (void *)&c->context.c; unsigned char *outbuf = outbuf_arg; const unsigned char *inbuf = inbuf_arg; - unsigned char l_tmp[TWOFISH_BLOCKSIZE]; unsigned int burn, burn_stack_depth = 0; u64 blkn = c->u_mode.ocb.data_nblocks; @@ -1273,10 +1272,9 @@ _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Process data in 3 block chunks. */ while (nblocks >= 3) { - /* l_tmp will be used only every 65536-th block. 
*/ - Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 1); - Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 2); - Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 3); + Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 1); + Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 2); + Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 3); blkn += 3; if (encrypt) @@ -1300,8 +1298,6 @@ _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, c->u_mode.ocb.data_nblocks = blkn; - wipememory(&l_tmp, sizeof(l_tmp)); - if (burn_stack_depth) _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *)); #else @@ -1322,7 +1318,6 @@ _gcry_twofish_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, #ifdef USE_AMD64_ASM TWOFISH_context *ctx = (void *)&c->context.c; const unsigned char *abuf = abuf_arg; - unsigned char l_tmp[TWOFISH_BLOCKSIZE]; unsigned int burn, burn_stack_depth = 0; u64 blkn = c->u_mode.ocb.aad_nblocks; @@ -1334,10 +1329,9 @@ _gcry_twofish_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg, /* Process data in 3 block chunks. */ while (nblocks >= 3) { - /* l_tmp will be used only every 65536-th block. 
*/
-          Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 1);
-          Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 2);
-          Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, l_tmp, blkn + 3);
+          Ls[0] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 1);
+          Ls[1] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 2);
+          Ls[2] = (uintptr_t)(const void *)ocb_get_l(c, blkn + 3);
           blkn += 3;

           twofish_amd64_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset,
@@ -1356,8 +1350,6 @@ _gcry_twofish_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg,
   c->u_mode.ocb.aad_nblocks = blkn;

-  wipememory(&l_tmp, sizeof(l_tmp));
-
   if (burn_stack_depth)
     _gcry_burn_stack (burn_stack_depth + 4 * sizeof(void *));
 #else

commit 161d339f48c03be7fd0f4249d730f7f1767ef8e4
Author: Jussi Kivilinna
Date:   Sat Dec 10 12:29:12 2016 +0200

    OCB: remove 'int64_t' usage

    * cipher/cipher-ocb.c (double_block): Use alternative way to generate
    sign-bit mask, without 'int64_t'.
    --

    Signed-off-by: Jussi Kivilinna

diff --git a/cipher/cipher-ocb.c b/cipher/cipher-ocb.c
index 92260d2..d1f01d5 100644
--- a/cipher/cipher-ocb.c
+++ b/cipher/cipher-ocb.c
@@ -66,7 +66,7 @@ double_block (unsigned char *b)
   l = buf_get_be64 (b);
   r = buf_get_be64 (b + 8);

-  l_0 = (int64_t)l >> 63;
+  l_0 = -(l >> 63);
   l = (l + l) ^ (r >> 63);
   r = (r + r) ^ (l_0 & 135);

commit 0b03b658bebc69a84d87ef13f9b60a27b0c42305
Author: Jussi Kivilinna
Date:   Sat Dec 10 12:29:12 2016 +0200

    random-drbg: use bufhelp function for big-endian store

    * random/random-drbg.c (drbg_cpu_to_be32): Remove.
    (drbg_ctr_df, drbg_hash_df): Use 'buf_put_be32' instead of
    'drbg_cpu_to_be32'.
-- Signed-off-by: Jussi Kivilinna diff --git a/random/random-drbg.c b/random/random-drbg.c index e2fe861..baaa65a 100644 --- a/random/random-drbg.c +++ b/random/random-drbg.c @@ -155,7 +155,7 @@ #include "g10lib.h" #include "random.h" #include "rand-internal.h" -#include "../cipher/bithelp.h" +#include "../cipher/bufhelp.h" @@ -546,27 +546,6 @@ drbg_sec_strength (u32 flags) return 32; } -/* - * Convert an integer into a byte representation of this integer. - * The byte representation is big-endian - * - * @val value to be converted - * @buf buffer holding the converted integer -- caller must ensure that - * buffer size is at least 32 bit - */ -static inline void -drbg_cpu_to_be32 (u32 val, unsigned char *buf) -{ - /* FIXME: This may raise a bus error. */ - struct s - { - u32 conv; - }; - struct s *conversion = (struct s *) buf; - - conversion->conv = be_bswap32 (val); -} - static void drbg_add_buf (unsigned char *dst, size_t dstlen, unsigned char *add, size_t addlen) @@ -802,10 +781,10 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* 10.4.2 step 2 -- calculate the entire length of all input data */ for (; NULL != tempstr; tempstr = tempstr->next) inputlen += tempstr->len; - drbg_cpu_to_be32 (inputlen, &L_N[0]); + buf_put_be32 (&L_N[0], inputlen); /* 10.4.2 step 3 */ - drbg_cpu_to_be32 (bytes_to_return, &L_N[4]); + buf_put_be32 (&L_N[4], bytes_to_return); /* 10.4.2 step 5: length is size of L_N, input_string, one byte, padding */ padlen = (inputlen + sizeof (L_N) + 1) % (drbg_blocklen (drbg)); @@ -838,7 +817,7 @@ drbg_ctr_df (drbg_state_t drbg, unsigned char *df_data, /* 10.4.2 step 9.1 - the padding is implicit as the buffer * holds zeros after allocation -- even the increment of i * is irrelevant as the increment remains within length of i */ - drbg_cpu_to_be32 (i, iv); + buf_put_be32 (iv, i); /* 10.4.2 step 9.2 -- BCC and concatenation with temp */ ret = drbg_ctr_bcc (drbg, temp + templen, K, &S1); if (ret) @@ -1137,7 +1116,7 @@ drbg_hash_df 
(drbg_state_t drbg, /* 10.4.1 step 3 */ input[0] = 1; - drbg_cpu_to_be32 ((outlen * 8), &input[1]); + buf_put_be32 (&input[1], (outlen * 8)); /* 10.4.1 step 4.1 -- concatenation of data for input into hash */ drbg_string_fill (&data1, input, 5); ----------------------------------------------------------------------- Summary of changes: .gitignore | 1 + cipher/camellia-glue.c | 18 +-- cipher/cipher-internal.h | 36 ++--- cipher/cipher-ocb.c | 273 ++++++++++++++++++++++++++----------- cipher/rijndael-aesni.c | 96 ++----------- cipher/rijndael-armv8-aarch32-ce.S | 98 +++++++++++-- cipher/rijndael-armv8-aarch64-ce.S | 125 +++++++++++------ cipher/rijndael-armv8-ce.c | 137 +++---------------- cipher/rijndael-ssse3-amd64.c | 96 +------------ cipher/rijndael.c | 6 +- cipher/serpent.c | 24 +--- cipher/twofish.c | 20 +-- configure.ac | 1 + random/random-drbg.c | 31 +---- src/hwfeatures.c | 16 +-- tests/Makefile.am | 5 +- tests/basic-disable-all-hwf.in | 4 + tests/hashtest-256g.in | 2 +- 18 files changed, 453 insertions(+), 536 deletions(-) create mode 100644 tests/basic-disable-all-hwf.in hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From io at adamliter.org Sun Dec 11 02:08:52 2016 From: io at adamliter.org (Adam Liter) Date: Sat, 10 Dec 2016 20:08:52 -0500 Subject: Possible bug: unable to lock memory with libgcrypt 1.7.4 on macOS Sierra 10.12.1 Message-ID: <6721E761-A480-4FE5-A57E-0998BA6C0C49@adamliter.org> Hello, I think there might be a bug with libgcrypt 1.7.4 with regard to locking memory on macOS Sierra 10.12.1.
Today, the package manager Homebrew bumped to libgcrypt version 1.7.4 (see here: https://github.com/Homebrew/homebrew-core/commit/06820e6fb69114fe33b06a2b2b571f73bb828caf) After updating my installed packages with Homebrew, I'm no longer able to use gpg2 with --require-secmem (even if I make the gpg2 binary have the setuid root bit flipped, as suggested here: https://lists.gnupg.org/pipermail/gnupg-users/1999-August/004024.html): ``` $ /usr/local/Cellar/gnupg2/2.0.30_2/bin/gpg2 --version gpg (GnuPG) 2.0.30 libgcrypt 1.7.4 Copyright (C) 2015 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Home: ~/.gnupg Supported algorithms: Pubkey: RSA, RSA, RSA, ELG, DSA Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH, CAMELLIA128, CAMELLIA192, CAMELLIA256 Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224 Compression: Uncompressed, ZIP, ZLIB, BZIP2 $ /usr/local/Cellar/gnupg2/2.0.30_2/bin/gpg2 --require-secmem Warning: using insecure memory! gpg: will not run with insecure memory due to --require-secmem ``` On the other hand, I also have a gpg2 binary from the MacGPG Suite, which is linked against an older version of libgcrypt, and is able to execute when passed the --require-secmem option: ``` $ /usr/local/MacGPG2/bin/gpg2 --version gpg (GnuPG/MacGPG2) 2.0.30 libgcrypt 1.6.6 Copyright (C) 2015 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. 
Home: ~/.gnupg Supported algorithms: Pubkey: RSA, RSA, RSA, ELG, DSA Cipher: IDEA, 3DES, CAST5, BLOWFISH, AES, AES192, AES256, TWOFISH, CAMELLIA128, CAMELLIA192, CAMELLIA256 Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224 Compression: Uncompressed, ZIP, ZLIB, BZIP2 $ /usr/local/MacGPG2/bin/gpg2 --require-secmem gpg: Go ahead and type your message ... ^C gpg: signal Interrupt caught ... exiting ``` (You can find these same details here: http://apple.stackexchange.com/q/264350/85567) I don't really know anything about the underlying libraries, so I have no idea what the bug is, but those are my reasons for thinking that there is a bug in the new 1.7.4 version with regard to locking memory in macOS 10.12.1. Thanks for the great software! Best, Adam Liter From io at adamliter.org Sun Dec 11 02:26:12 2016 From: io at adamliter.org (Adam Liter) Date: Sat, 10 Dec 2016 20:26:12 -0500 Subject: Possible bug: unable to lock memory with libgcrypt 1.7.4 on macOS Sierra 10.12.1 In-Reply-To: <6721E761-A480-4FE5-A57E-0998BA6C0C49@adamliter.org> References: <6721E761-A480-4FE5-A57E-0998BA6C0C49@adamliter.org> Message-ID: Hmm, I wonder if this is not necessarily a bug with libgcrypt 1.7.4 but rather has something to do with how Homebrew is compiling the binary. It seems there is some sort of workaround being used in order to avoid a possible issue with Clang: https://github.com/Homebrew/homebrew-core/blob/06820e6fb69114fe33b06a2b2b571f73bb828caf/Formula/libgcrypt.rb#L33 I know absolutely nothing about the details here, so I can't say much more. But, hopefully this information is useful. Thanks again for your time! Best, Adam Liter
From stefbon at gmail.com Sun Dec 11 18:34:28 2016 From: stefbon at gmail.com (Stef Bon) Date: Sun, 11 Dec 2016 18:34:28 +0100 Subject: Howto implement chacha20-poly1305? In-Reply-To: References: <87mvgh56re.fsf@wheatstone.g10code.de> <87mvgg2g0p.fsf@wheatstone.g10code.de> <4d2f55cc-910e-bdd4-0505-c4a5f7c3ed3d@iki.fi> Message-ID: 2016-12-04 13:29 GMT+01:00 Stef Bon : >> >> Does this help you? >> > Well it takes longer for me to implement. My client software uses a generic decrypt function which decrypts the incoming message and then compares the mac. It also is able to wait for additional chunks of data. The server sometimes sends the data not in one, but in different parts. It's complicated since chacha20-poly1305 at openssh.com does things different. For example the mac is compared when the message is still encrypted, while the "normal" order is first decrypt and then compare the mac. (which is also described in https://tools.ietf.org/html/rfc4253#section-6.4 ) Stef From cvs at cvs.gnupg.org Thu Dec 15 08:56:22 2016 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Thu, 15 Dec 2016 08:56:22 +0100 Subject: [git] GCRYPT - branch, master, updated.
libgcrypt-1.7.3-46-g0a90f87 Message-ID: This is an automated email from the git hooks/post-receive script. It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 0a90f87799903a3fb97189ef7cba19e7b3534e1c (commit) from 92abfca650397e1f5dfa3a5c7752eb380cc94d0e (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 0a90f87799903a3fb97189ef7cba19e7b3534e1c Author: Werner Koch Date: Thu Dec 15 08:50:40 2016 +0100 Fix regression in broken mlock detection. * acinclude.m4 (GNUPG_CHECK_MLOCK): Fix typo EGAIN->EAGAIN. -- GnuPG-bug-id: 2870 Fixes-commit: 618b8978f46f4011c11512fd5f30c15e01652e2e Co-authored-by: Nicolas Porcel Signed-off-by: Werner Koch diff --git a/acinclude.m4 b/acinclude.m4 index 90b3cb9..dcdadfd 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -242,7 +242,7 @@ int main() pool += (pgsize - ((long int)pool % pgsize)); err = mlock( pool, 4096 ); - if( !err || errno == EPERM || errno == EGAIN) + if( !err || errno == EPERM || errno == EAGAIN) return 0; /* okay */ return 1; /* hmmm */ ----------------------------------------------------------------------- Summary of changes: acinclude.m4 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From cvs at cvs.gnupg.org Thu Dec 15 09:52:34 2016 From: cvs at cvs.gnupg.org (by Werner Koch) Date: Thu, 15 Dec 2016 09:52:34 +0100 Subject: [git] GCRYPT - branch, master, updated. libgcrypt-1.7.3-47-g0996d5f Message-ID: This is an automated email from the git hooks/post-receive script. 
It was generated because a ref change was pushed to the repository containing the project "The GNU crypto library". The branch, master has been updated via 0996d5f1c34a3d3012facd098a139d8abbde085f (commit) from 0a90f87799903a3fb97189ef7cba19e7b3534e1c (commit) Those revisions listed above that are new to this repository have not appeared on any other notification email; so we list those revisions in full, below. - Log ----------------------------------------------------------------- commit 0996d5f1c34a3d3012facd098a139d8abbde085f Author: Werner Koch Date: Thu Dec 15 09:49:47 2016 +0100 Add release info from 1.7.5 -- diff --git a/NEWS b/NEWS index 146e208..ef882b7 100644 --- a/NEWS +++ b/NEWS @@ -24,11 +24,19 @@ Noteworthy changes in version 1.8.0 (unreleased) [C21/A1/R_] using encrypted swap space. - * Interface changes relative to the 1.6.0 release: + * Interface changes relative to the 1.7.0 release: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ GCRYCTL_REINIT_SYSCALL_CLAMP NEW macro. +Noteworthy changes in version 1.7.5 (2016-12-15) [C21/A1/R5] +------------------------------------------------ + + * Bug fixes: + + - Fix regression in mlock detection [bug#2870]. + + Noteworthy changes in version 1.7.4 (2016-12-09) [C21/A1/R4] ------------------------------------------------ ----------------------------------------------------------------------- Summary of changes: NEWS | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) hooks/post-receive -- The GNU crypto library http://git.gnupg.org _______________________________________________ Gnupg-commits mailing list Gnupg-commits at gnupg.org http://lists.gnupg.org/mailman/listinfo/gnupg-commits From bugs at gnu.support Thu Dec 15 21:19:28 2016 From: bugs at gnu.support (Jean Louis) Date: Thu, 15 Dec 2016 23:19:28 +0300 Subject: Problems in compiling libgcrypt-1.7.5 Message-ID: <20161215201928.GA9080@protected.rcdrun.com> Hello, I have some problems in compiling the libgcrypt-1.7.5. 
1) The last lines of building process is shown below. 2) It was: ./configure --prefix=/package/lib/$(basename `pwd`) --disable-padlock-support --disable-neon-support --disable-arm-crypto-support --enable-hmac-binary-check --enable-random-daemon Jean Louis ibgcrypt_la-sexp.lo libgcrypt_la-hwfeatures.lo libgcrypt_la-stdmem.lo libgcrypt_la-secmem.lo libgcrypt_la-missing-string.lo libgcrypt_la-fips.lo libgcrypt_la-hmac256.lo libgcrypt_la-context.lo hwf-x86.lo ../cipher/libcipher.la ../random/librandom.la ../mpi/libmpi.la ../compat/libcompat.la -lgpg-error libtool: link: gcc -shared -fPIC -DPIC .libs/libgcrypt_la-visibility.o .libs/libgcrypt_la-misc.o .libs/libgcrypt_la-global.o .libs/libgcrypt_la-sexp.o .libs/libgcrypt_la-hwfeatures.o .libs/libgcrypt_la-stdmem.o .libs/libgcrypt_la-secmem.o .libs/libgcrypt_la-missing-string.o .libs/libgcrypt_la-fips.o .libs/libgcrypt_la-hmac256.o .libs/libgcrypt_la-context.o .libs/hwf-x86.o -Wl,--whole-archive ../cipher/.libs/libcipher.a ../random/.libs/librandom.a ../mpi/.libs/libmpi.a ../compat/.libs/libcompat.a -Wl,--no-whole-archive /usr/lib/libgpg-error.so -O2 -Wl,--version-script=./libgcrypt.vers -Wl,-soname -Wl,libgcrypt.so.20 -o .libs/libgcrypt.so.20.1.5 libtool: link: (cd ".libs" && rm -f "libgcrypt.so.20" && ln -s "libgcrypt.so.20.1.5" "libgcrypt.so.20") libtool: link: (cd ".libs" && rm -f "libgcrypt.so" && ln -s "libgcrypt.so.20.1.5" "libgcrypt.so") libtool: link: ( cd ".libs" && rm -f "libgcrypt.la" && ln -s "../libgcrypt.la" "libgcrypt.la" ) gcc -DHAVE_CONFIG_H -I. -I.. -g -O2 -fvisibility=hidden -Wall -MT dumpsexp-dumpsexp.o -MD -MP -MF .deps/dumpsexp-dumpsexp.Tpo -c -o dumpsexp-dumpsexp.o `test -f 'dumpsexp.c' || echo './'`dumpsexp.c mv -f .deps/dumpsexp-dumpsexp.Tpo .deps/dumpsexp-dumpsexp.Po /bin/sh ../libtool --tag=CC --mode=link gcc -g -O2 -fvisibility=hidden -Wall -o dumpsexp dumpsexp-dumpsexp.o libtool: link: gcc -g -O2 -fvisibility=hidden -Wall -o dumpsexp dumpsexp-dumpsexp.o gcc -DHAVE_CONFIG_H -I. -I.. 
-DSTANDALONE -g -O2 -fvisibility=hidden -Wall -MT hmac256-hmac256.o -MD -MP -MF .deps/hmac256-hmac256.Tpo -c -o hmac256-hmac256.o `test -f 'hmac256.c' || echo './'`hmac256.c mv -f .deps/hmac256-hmac256.Tpo .deps/hmac256-hmac256.Po /bin/sh ../libtool --tag=CC --mode=link gcc -DSTANDALONE -g -O2 -fvisibility=hidden -Wall -o hmac256 hmac256-hmac256.o libtool: link: gcc -DSTANDALONE -g -O2 -fvisibility=hidden -Wall -o hmac256 hmac256-hmac256.o gcc -DHAVE_CONFIG_H -I. -I.. -g -O2 -fvisibility=hidden -Wall -MT mpicalc-mpicalc.o -MD -MP -MF .deps/mpicalc-mpicalc.Tpo -c -o mpicalc-mpicalc.o `test -f 'mpicalc.c' || echo './'`mpicalc.c mv -f .deps/mpicalc-mpicalc.Tpo .deps/mpicalc-mpicalc.Po /bin/sh ../libtool --tag=CC --mode=link gcc -g -O2 -fvisibility=hidden -Wall -o mpicalc mpicalc-mpicalc.o libgcrypt.la -ldl -lgpg-error libtool: link: gcc -g -O2 -fvisibility=hidden -Wall -o .libs/mpicalc mpicalc-mpicalc.o ./.libs/libgcrypt.so -ldl /usr/lib/libgpg-error.so -Wl,-rpath -Wl,/package/lib/libgcrypt-1.7.5/lib ./.libs/libgcrypt.so: undefined reference to `ath_read' ./.libs/libgcrypt.so: undefined reference to `_gcry_USE_THE_UNDERSCORED_FUNCTION' ./.libs/libgcrypt.so: undefined reference to `ath_write' collect2: error: ld returned 1 exit status make[2]: *** [Makefile:712: mpicalc] Error 1 make[2]: Leaving directory '/sources/gnu/libgcrypt-1.7.5/src' make[1]: *** [Makefile:477: all-recursive] Error 1 make[1]: Leaving directory '/sources/gnu/libgcrypt-1.7.5' make: *** [Makefile:408: all] Error 2 From gniibe at fsij.org Fri Dec 16 08:39:59 2016 From: gniibe at fsij.org (NIIBE Yutaka) Date: Fri, 16 Dec 2016 16:39:59 +0900 Subject: Possible bug: unable to lock memory with libgcrypt 1.7.4 on macOS Sierra 10.12.1 In-Reply-To: References: <6721E761-A480-4FE5-A57E-0998BA6C0C49@adamliter.org> Message-ID: <87fulouxtc.fsf@iwagami.gniibe.org> Adam Liter wrote: > Hmm, I wonder if this is not necessarily a bug with libgcrypt 1.7.4 but > rather has something to do with how Homebrew is compiling 
the binary. Thank you for your report. I think that it is a bug in libgcrypt's configure script, in the detection of whether the mlock function works correctly. It was fixed in libgcrypt 1.7.5, which was released yesterday. -- From wk at gnupg.org Fri Dec 16 12:56:42 2016 From: wk at gnupg.org (Werner Koch) Date: Fri, 16 Dec 2016 12:56:42 +0100 Subject: Problems in compiling libgcrypt-1.7.5 In-Reply-To: <20161215201928.GA9080@protected.rcdrun.com> (Jean Louis's message of "Thu, 15 Dec 2016 23:19:28 +0300") References: <20161215201928.GA9080@protected.rcdrun.com> Message-ID: <87k2b03x51.fsf@wheatstone.g10code.de> On Thu, 15 Dec 2016 21:19, bugs at gnu.support said: > I have some problems in compiling the libgcrypt-1.7.5. Which OS are you using? Is that a regression from an older version of libgcrypt, and if so, which version worked for you? > 2) It was: ./configure --prefix=/package/lib/$(basename `pwd`) > --disable-padlock-support --disable-neon-support > --disable-arm-crypto-support --enable-hmac-binary-check > --enable-random-daemon The final output of the configure run would also be useful. Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 194 bytes Desc: not available URL: From fgunbin at fastmail.fm Fri Dec 16 15:04:01 2016 From: fgunbin at fastmail.fm (Filipp Gunbin) Date: Fri, 16 Dec 2016 17:04:01 +0300 Subject: [Announce] Libgcrypt 1.7.5 released In-Reply-To: <87mvfx7cfb.fsf@wheatstone.g10code.de> (Werner Koch's message of "Thu, 15 Dec 2016 10:46:00 +0100") References: <87mvfx7cfb.fsf@wheatstone.g10code.de> Message-ID: On 15/12/2016 10:46 +0100, Werner Koch wrote: > Hi! > > The GnuPG Project is pleased to announce the availability of Libgcrypt > version 1.7.5. This is a maintenance release. > ... Thanks for the release!
I have trouble building libgcrypt-1.7.5: (after checkout) ./autogen.sh --force && ./configure --enable-maintainer-mode && make ... ... Making all in doc fig2dev -L eps `test -f 'libgcrypt-modules.fig' || echo './'`libgcrypt-modules.fig libgcrypt-modules.eps /bin/sh: fig2dev: command not found make[2]: *** [libgcrypt-modules.eps] Error 127 make[1]: *** [all-recursive] Error 1 I don't have fig2dev installed (and cannot install it). Maybe ./configure should check for it? Or am I missing something? Filipp From kristian.fiskerstrand at sumptuouscapital.com Fri Dec 16 16:56:49 2016 From: kristian.fiskerstrand at sumptuouscapital.com (Kristian Fiskerstrand) Date: Fri, 16 Dec 2016 16:56:49 +0100 Subject: [Announce] Libgcrypt 1.7.5 released In-Reply-To: References: <87mvfx7cfb.fsf@wheatstone.g10code.de> Message-ID: On 12/16/2016 03:04 PM, Filipp Gunbin wrote: > On 15/12/2016 10:46 +0100, Werner Koch wrote: > > > I don't have fig2dev installed (and cannot install it). Maybe > ./configure should check for it? Or am I missing something? Not sure why you can't install transfig, but have you tried --disable-doc do not build the documentation -- ---------------------------- Kristian Fiskerstrand Blog: https://blog.sumptuouscapital.com Twitter: @krifisk ---------------------------- Public OpenPGP keyblock at hkp://pool.sks-keyservers.net fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3 ---------------------------- Testis unus, testis nullus A single witness is no witness -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From wk at gnupg.org Sat Dec 17 12:38:23 2016 From: wk at gnupg.org (Werner Koch) Date: Sat, 17 Dec 2016 12:38:23 +0100 Subject: [Announce] Libgcrypt 1.7.5 released In-Reply-To: (Filipp Gunbin's message of "Fri, 16 Dec 2016 17:04:01 +0300") References: <87mvfx7cfb.fsf@wheatstone.g10code.de> Message-ID: <87eg163hw0.fsf@wheatstone.g10code.de> On Fri, 16 Dec 2016 15:04, fgunbin at fastmail.fm said: > Making all in doc > fig2dev -L eps `test -f 'libgcrypt-modules.fig' || echo './'`libgcrypt-modules.fig libgcrypt-modules.eps The rendered versions of the fig sources are also distributed and in general there should be no need for make to build them. Solutions: touch doc/*.eps or install the transfig package or run configure with the option --disable-doc Shalom-Salam, Werner -- Die Gedanken sind frei. Ausnahmen regelt ein Bundesgesetz. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 194 bytes Desc: not available URL: From io at adamliter.org Sun Dec 18 19:40:40 2016 From: io at adamliter.org (Adam Liter) Date: Sun, 18 Dec 2016 13:40:40 -0500 Subject: Possible bug: unable to lock memory with libgcrypt 1.7.4 on macOS Sierra 10.12.1 In-Reply-To: <87fulouxtc.fsf@iwagami.gniibe.org> References: <6721E761-A480-4FE5-A57E-0998BA6C0C49@adamliter.org> <87fulouxtc.fsf@iwagami.gniibe.org> Message-ID: <9348C41D-484F-4782-8FC0-ACC3304DE9A1@adamliter.org> Hello, Thank you! Yes, 1.7.5 fixes this. Best, Adam On 16 Dec 2016, at 2:39, NIIBE Yutaka wrote: > Adam Liter wrote: >> Hmm, I wonder if this is not necessarily a bug with libgcrypt 1.7.4 but >> rather has something to do with how Homebrew is compiling the binary. > > Thank you for your report. > > I think that it is a bug of configure script in libgcrypt for the > detection if mlock function works correctly. 
> > In libgcrypt 1.7.5, which was released yesterday, it was fixed. > -- From el11151 at mail.ntua.gr Sun Dec 18 22:55:14 2016 From: el11151 at mail.ntua.gr (Kostis Andrikopoulos) Date: Sun, 18 Dec 2016 23:55:14 +0200 Subject: mpi_swap_cond: different sizes error on eddsa key generation In-Reply-To: <87twag34hb.fsf@iwagami.gniibe.org> References: <06c25709-dfa9-a965-bd6f-50da51cd2d59@fsij.org> <23536e43-7bf9-d801-2e26-26b8dd9c49ee@mail.ntua.gr> <7ac506f8-b546-4c62-a830-52bcda73b51e@mail.ntua.gr> <87twag34hb.fsf@iwagami.gniibe.org> Message-ID: Hello again, Thanks for the thorough explanation of the code related to the bug. I think I have isolated the buggy code enough to be able to reach some conclusions. To give some context, our library is a fork of the libotr library. So there is a possibility that it is not an actual bug in gcrypt but an error in libotr (however, it worked correctly with an older gcrypt version). The bug appears to be introduced when libotr sets a custom allocation handler for the secure memory. This might explain why either a->nlimbs > b->alloced or b->nlimbs > a->alloced when it shouldn't, since it might change the way those objects are stored in memory from how gcrypt expects them to be. In any case, I have included a not-so-minimal test case that might help you. I ran the code against libgcrypt version 1.7.3 and compiled it with gcc -o test main.c chat_sign.c mem.c `libgcrypt-config --libs` When I run ./test, the following error appears: Ohhhh jeeee: mpi_swap_cond: different sizes [1] 16198 abort (core dumped) ./test When you remove the call to otrl_mem_init() and recompile, the programme should finish with no errors. Hope this helps. -------------- next part -------------- A non-text attachment was scrubbed...
Name: gcry.tar Type: application/x-tar Size: 20480 bytes Desc: not available URL: From fgunbin at fastmail.fm Mon Dec 19 13:55:23 2016 From: fgunbin at fastmail.fm (Filipp Gunbin) Date: Mon, 19 Dec 2016 15:55:23 +0300 Subject: [Announce] Libgcrypt 1.7.5 released In-Reply-To: (Kristian Fiskerstrand's message of "Fri, 16 Dec 2016 16:56:49 +0100") References: <87mvfx7cfb.fsf@wheatstone.g10code.de> Message-ID: On 16/12/2016 16:56 +0100, Kristian Fiskerstrand wrote: > On 12/16/2016 03:04 PM, Filipp Gunbin wrote: >> On 15/12/2016 10:46 +0100, Werner Koch wrote: >> > > >> >> I don't have fig2dev installed (and cannot install it). Maybe >> ./configure should check for it? Or am I missing something? > > Not sure why you can't install transfig, but have you tried > > --disable-doc do not build the documentation Kristian, I thought that fig2dev was part of xfig, but now I see it's not, so probably I could install it. I'm on MacOS and recently decided not to use macports and build everything by hand instead, so installing anything with "x" prefix seemed to be not easy :-) --disable-doc of course worked. Filipp From fgunbin at fastmail.fm Mon Dec 19 14:03:47 2016 From: fgunbin at fastmail.fm (Filipp Gunbin) Date: Mon, 19 Dec 2016 16:03:47 +0300 Subject: [Announce] Libgcrypt 1.7.5 released In-Reply-To: <87eg163hw0.fsf@wheatstone.g10code.de> (Werner Koch's message of "Sat, 17 Dec 2016 12:38:23 +0100") References: <87mvfx7cfb.fsf@wheatstone.g10code.de> <87eg163hw0.fsf@wheatstone.g10code.de> Message-ID: On 17/12/2016 12:38 +0100, Werner Koch wrote: > On Fri, 16 Dec 2016 15:04, fgunbin at fastmail.fm said: > >> Making all in doc >> fig2dev -L eps `test -f 'libgcrypt-modules.fig' || echo './'`libgcrypt-modules.fig libgcrypt-modules.eps > > The rendered versions of the fig sources are also distributed and in > general there should be no need for make to build them. 
> > Solutions: > > touch doc/*.eps > > or > > install the transfig package > > or > > run configure with the option --disable-doc Werner, Where do those rendered versions live? I can't see them in doc/ after git clone git://git.gnupg.org/libgcrypt.git git worktree add ../libgcrypt-1.7.5 libgcrypt-1.7.5 cd ../libgcrypt-1.7.5 Previously I built libgcrypt-1.7.2 - there are doc/*.{eps,pdf,png} files, but I think they were created on my machine, previous environment (macports) allowed me to build them. I'll check transfig, as I replied to Kristian, I mistakenly thought fig2dev was part of xfig and didn't want to install it. Thanks, Filipp From kristian.fiskerstrand at sumptuouscapital.com Mon Dec 19 22:39:32 2016 From: kristian.fiskerstrand at sumptuouscapital.com (Kristian Fiskerstrand) Date: Mon, 19 Dec 2016 22:39:32 +0100 Subject: [Announce] Libgcrypt 1.7.5 released In-Reply-To: References: <87mvfx7cfb.fsf@wheatstone.g10code.de> <87eg163hw0.fsf@wheatstone.g10code.de> Message-ID: <2507ef5f-327a-9f5a-0f49-d0676c02a57b@sumptuouscapital.com> On 12/19/2016 02:03 PM, Filipp Gunbin wrote: > On 17/12/2016 12:38 +0100, Werner Koch wrote: > >> On Fri, 16 Dec 2016 15:04, fgunbin at fastmail.fm said: >> >>> Making all in doc >>> fig2dev -L eps `test -f 'libgcrypt-modules.fig' || echo './'`libgcrypt-modules.fig libgcrypt-modules.eps >> >> The rendered versions of the fig sources are also distributed and in >> general there should be no need for make to build them. >> >> Solutions: ... > > Where do those rendered versions live? 
I can't see them in doc/ after > > git clone git://git.gnupg.org/libgcrypt.git > git worktree add ../libgcrypt-1.7.5 libgcrypt-1.7.5 > cd ../libgcrypt-1.7.5 Check a released tarball. -- ---------------------------- Kristian Fiskerstrand Blog: https://blog.sumptuouscapital.com Twitter: @krifisk ---------------------------- Public OpenPGP keyblock at hkp://pool.sks-keyservers.net fpr:94CB AFDD 3034 5109 5618 35AA 0B7F 8B60 E3ED FAE3 ---------------------------- "At 18 our convictions are hills from which we look; At 45 they are caves in which we hide." (F. Scott Fitzgerald) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gniibe at fsij.org Tue Dec 20 05:53:27 2016 From: gniibe at fsij.org (NIIBE Yutaka) Date: Tue, 20 Dec 2016 13:53:27 +0900 Subject: mpi_swap_cond: different sizes error on eddsa key generation In-Reply-To: References: <06c25709-dfa9-a965-bd6f-50da51cd2d59@fsij.org> <23536e43-7bf9-d801-2e26-26b8dd9c49ee@mail.ntua.gr> <7ac506f8-b546-4c62-a830-52bcda73b51e@mail.ntua.gr> <87twag34hb.fsf@iwagami.gniibe.org> Message-ID: <87lgvbtd4o.fsf@iwagami.gniibe.org> Kostis Andrikopoulos wrote: > Thanks for the thorough explanation of the code related to the bug. I > think i have isolated the buggy code enough to be able to reach some > conclusions. Good. Thank you for the reproducible case. I think that the problem is here: > static int otrl_mem_is_secure(const void *p) > { > return 1; > } (and in sharing the same code for normal memory allocation and secure memory allocation.) It breaks the assumption that normally allocated memory keeps this characteristic. From libgcrypt's point of view, normally allocated memory suddenly becomes secure memory. If such a thing could occur, we need to change libgcrypt so that _gcry_mpi_assign_limb_space is always called with memory no smaller than the original. Here is a possible change.
Currently, I don't know if it's the right thing. Let us discuss. =========================================== diff --git a/mpi/mpi-mul.c b/mpi/mpi-mul.c index 4f4d709..313dc25 100644 --- a/mpi/mpi-mul.c +++ b/mpi/mpi-mul.c @@ -110,6 +110,7 @@ void _gcry_mpi_mul (gcry_mpi_t w, gcry_mpi_t u, gcry_mpi_t v) { mpi_size_t usize, vsize, wsize; + mpi_size_t w_alloced; mpi_ptr_t up, vp, wp; mpi_limb_t cy; int usign, vsign, usecure, vsecure, sign_product; @@ -142,11 +143,12 @@ _gcry_mpi_mul (gcry_mpi_t w, gcry_mpi_t u, gcry_mpi_t v) /* Ensure W has space enough to store the result. */ wsize = usize + vsize; + w_alloced = (wsize > w->alloced)? wsize : w->alloced; if ( !mpi_is_secure (w) && (mpi_is_secure (u) || mpi_is_secure (v)) ) { /* w is not allocated in secure space but u or v is. To make sure * that no temporray results are stored in w, we temporary use * a newly allocated limb space for w */ - wp = mpi_alloc_limb_space( wsize, 1 ); + wp = mpi_alloc_limb_space( w_alloced, 1 ); assign_wp = 2; /* mark it as 2 so that we can later copy it back to * mormal memory */ } @@ -190,12 +192,12 @@ _gcry_mpi_mul (gcry_mpi_t w, gcry_mpi_t u, gcry_mpi_t v) if( assign_wp ) { if (assign_wp == 2) { /* copy the temp wp from secure memory back to normal memory */ - mpi_ptr_t tmp_wp = mpi_alloc_limb_space (wsize, 0); - MPN_COPY (tmp_wp, wp, wsize); + mpi_ptr_t tmp_wp = mpi_alloc_limb_space (w_alloced, 0); + MPN_COPY (tmp_wp, wp, wsize); _gcry_mpi_free_limb_space (wp, 0); wp = tmp_wp; } - _gcry_mpi_assign_limb_space( w, wp, wsize ); + _gcry_mpi_assign_limb_space( w, wp, w_alloced ); } w->nlimbs = wsize; w->sign = sign_product; -- From marcio.barbado at bdslabs.com.br Tue Dec 20 15:57:46 2016 From: marcio.barbado at bdslabs.com.br (Marcio Barbado, Jr.) Date: Tue, 20 Dec 2016 12:57:46 -0200 Subject: Computer Science bachelor degree thesis on Libgcrypt Message-ID: <5d8f5fe810e2cf72b5ae1f9f546e2e2f@bdslabs.com.br> Hi, I'm a computer science student from Brazil, and a long time user of GnuPG. 
In 2017, the last year of my bachelor degree, my group is supposed to
start working on a thesis, which should include algorithmic code
development.

After some months considering what we could do, "trying to help the GnuPG
community" emerged as a worthy idea, so we're now reading "The Libgcrypt
Reference Manual" in order to understand it better.

Also, we're questioning how feasible that idea is for us to accomplish.
And from such considerations, we first thought of analyzing any Libgcrypt
bug tracker entries, or something like that.

But even before that, we would like to know what you Libgcrypt people
think of all this.

Regards, and happy birthday, GnuPG!

Marcio Barbado, Jr.

From wk at gnupg.org Tue Dec 20 19:01:43 2016
From: wk at gnupg.org (Werner Koch)
Date: Tue, 20 Dec 2016 19:01:43 +0100
Subject: mpi_swap_cond: different sizes error on eddsa key generation
In-Reply-To: <87lgvbtd4o.fsf@iwagami.gniibe.org> (NIIBE Yutaka's message of "Tue, 20 Dec 2016 13:53:27 +0900")
References: <06c25709-dfa9-a965-bd6f-50da51cd2d59@fsij.org> <23536e43-7bf9-d801-2e26-26b8dd9c49ee@mail.ntua.gr> <7ac506f8-b546-4c62-a830-52bcda73b51e@mail.ntua.gr> <87twag34hb.fsf@iwagami.gniibe.org> <87lgvbtd4o.fsf@iwagami.gniibe.org>
Message-ID: <87eg12wkc8.fsf@wheatstone.g10code.de>

On Tue, 20 Dec 2016 05:53, gniibe at fsij.org said:

> If such a thing could occur, we need to change libgcrypt so that
> _gcry_mpi_assign_limb_space should be always called with memory of
> no-smaller size than the original.

I guess that will be quite some work. For robustness this would be a
good thing but it has the drawbacks

 a) we add new code and thus may introduce bugs
 b) we may need more secure memory

Given that 1.7.4 enlarges the secmem as needed, it might be easier if OTR
drops their own memory handler. The code is also questionable because
the wiping does not work: the memset calls will be elided by the
compiler.

Shalom-Salam,

   Werner

--
Die Gedanken sind frei.
Ausnahmen regelt ein Bundesgesetz.
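
The remark above about memset being elided can be illustrated with a small sketch. When a buffer is cleared with memset and never read again (for example right before free), an optimizing compiler is allowed to treat the call as a dead store and drop it. One common workaround is to write through a volatile pointer, which compilers in practice do not remove. The names wipe_bytes, is_all_zero and wipe_demo below are hypothetical and belong to neither libgcrypt nor libotr; this is only a sketch of the technique, not the code either project uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libgcrypt or libotr function): overwrite
   LEN bytes at P with zeros.  The stores go through a volatile lvalue,
   so the compiler has to treat each one as an observable access instead
   of removing the loop as a dead store.  */
static void
wipe_bytes (void *p, size_t len)
{
  volatile unsigned char *vp = (volatile unsigned char *) p;

  while (len--)
    *vp++ = 0;
}

/* Hypothetical helper: return 1 if all LEN bytes at P are zero.  */
static int
is_all_zero (const unsigned char *p, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    if (p[i])
      return 0;
  return 1;
}

/* Fill a stand-in "key" buffer with a pattern, wipe it, and report
   whether it reads back as zero.  */
static int
wipe_demo (void)
{
  unsigned char key[32];

  memset (key, 0xA5, sizeof key);
  assert (key[0] == 0xA5);
  wipe_bytes (key, sizeof key);
  return is_all_zero (key, sizeof key);
}
```

A custom free handler would call something like wipe_bytes on the block before releasing it. Where available, explicit_bzero (glibc, the BSDs) or the optional C11 Annex K memset_s serve the same purpose and may be preferable to a hand-written loop.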