From johnjmar at linux.vnet.ibm.com  Tue Dec 10 21:52:23 2019
From: johnjmar at linux.vnet.ibm.com (johnjmar)
Date: Tue, 10 Dec 2019 14:52:23 -0600
Subject: post-quantum crypto algorithms implementation
Message-ID: <2b467af7222ea21f07b0b54a464f3aee@linux.vnet.ibm.com>

Hello,

Are there any plans to implement post-quantum algorithms in the
library? Given the current state of quantum computing development and
(please correct me if I'm wrong) the vulnerability of public-key
algorithms (RSA, ECDSA) to it, I'm curious to see if anyone can share
their plans.

I was also looking at the following, for reference:
https://pq-crystals.org/.

From tianjia.zhang at linux.alibaba.com  Fri Dec 20 08:20:11 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Fri, 20 Dec 2019 15:20:11 +0800
Subject: Add crypto pubkey SM2
Message-ID:

This new module implements the SM2 public key algorithm. It was
published by the State Encryption Management Bureau, China.

List of specifications for SM2 elliptic curve public key cryptography:

GM/T 0003.1-2012
GM/T 0003.2-2012
GM/T 0003.3-2012
GM/T 0003.4-2012
GM/T 0003.5-2012

IETF: https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
scctc: http://www.gmbz.org.cn/main/bzlb.html

cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add sm2.c.
cipher/ecc-curves.c (domain_parms): Add sm2p256v1 for SM2.
cipher/pubkey.c [USE_SM2] (pubkey_list): Add _gcry_pubkey_spec_sm2.
cipher/sm2.c: New.
configure.ac (available_pubkey_ciphers): Add sm2.
src/cipher.h: Add declarations for SM2.
src/fips.c (algos): Add GCRY_PK_SM2.
src/gcrypt.h.in (gcry_pk_algos): Add algorithm ID for SM2.
tests/basic.c (check_pubkey): Add test cases for SM2.
tests/curves.c (N_CURVES): Update N_CURVES for SM2.

Signed-off-by: Tianjia Zhang tianjia.zhang at linux.alibaba.com

https://github.com/gpg/libgcrypt/pull/9

From tianjia.zhang at linux.alibaba.com  Fri Dec 20 08:16:32 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Fri, 20 Dec 2019 15:16:32 +0800
Subject: Fix three errors for ec algorithm
Message-ID: <84280a0b-dca9-41ca-9a30-b34b18aabe66.tianjia.zhang@linux.alibaba.com>

Fix three errors in the EC algorithm and mpi.

https://github.com/gpg/libgcrypt/pull/8

From jan.bilek at eftlab.com.au  Sat Dec 21 02:40:06 2019
From: jan.bilek at eftlab.com.au (Jan Bilek)
Date: Sat, 21 Dec 2019 01:40:06 +0000
Subject: Disable Weak cipher check for DES KCV
Message-ID:

Hi,

We have a problem here: I need to encrypt a block of data with a key
of all zeros.

    gcry_check_version (NULL);

    unsigned char key[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
    unsigned char out[8];
    unsigned char data[8] = {0};   // plaintext block to encrypt
    gcry_error_t err = 0;
    gcry_cipher_hd_t hd = nullptr;

    err = gcry_cipher_open (&hd, GCRY_CIPHER_DES, GCRY_CIPHER_MODE_ECB, 0);
    //auto blklen = gcry_cipher_get_algo_blklen (GCRY_CIPHER_DES);
    //auto algolen = gcry_cipher_get_algo_keylen (GCRY_CIPHER_DES);

    // This is the call that fails for the all-zero key.
    err = gcry_cipher_setkey (hd, key, sizeof(key));
    std::cerr << "gpg_err_code: " << gpg_err_code (err) << std::endl;
    std::cerr << "gpg_strerror: " << gpg_strerror (err) << std::endl;

    err = gcry_cipher_encrypt (hd, out, sizeof(out), data, sizeof(data));
    if (err)
      {
        std::cerr << "Failed to perform cryptography" << std::endl;
        std::cerr << " cipher: " << static_cast<int>(GCRY_CIPHER_DES) << std::endl;
        std::cerr << " mode: " << static_cast<int>(GCRY_CIPHER_MODE_ECB) << std::endl;
        //std::cerr << " keyBlock: " << BinToHex (key) << std::endl;
        //std::cerr << " out: " << BinToHex (out) << std::endl;
        //std::cerr << " data: " << BinToHex (encryptedData) << std::endl;
      }

This fails with:

    gpg_err_code: 43
    gpg_strerror: Weak encryption key
    cipher_encrypt: key not set

I tracked it back in the libgcrypt source to cipher/des.c, line 1384
(do_des_setkey), which calls is_weak_key (line 1021):

    if (is_weak_key (key))
      {
        _gcry_burn_stack (64);
        return GPG_ERR_WEAK_KEY;
      }

and to cipher.c, line 797:

    rc = c->spec->setkey (&c->context.c, key, keylen, c);
    if (!rc)
      {
      }
    else
      c->marks.key = 0;

... which then disallows setting a weak key completely, resulting in a
failure.

This has quite an impact on multiple (still) in-use KCV operations
(e.g. KCV_METHOD_VISA) where a key needs to be encrypted with a zero
key to get its KCV.

May I propose a patch? (See attachment.)

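For context, the following is a minimal, self-contained sketch of why
an all-zero key trips a weak-key check. It is not libgcrypt's
is_weak_key, which is implemented differently and checks a larger
table; the function name sketch_is_weak_des_key and the table layout
are assumptions made for illustration. DES ignores the low (parity)
bit of every key byte, so 00 00 ... 00 is the same effective key as
the well-known weak key 01 01 ... 01.

```c
#include <stddef.h>
#include <stdint.h>

/* The four classic DES weak keys, written with odd parity.
   Semi-weak and possibly-weak keys are deliberately left out
   of this sketch.  */
static const uint8_t sketch_weak_keys[4][8] = {
  { 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01 },
  { 0xFE, 0xFE, 0xFE, 0xFE, 0xFE, 0xFE, 0xFE, 0xFE },
  { 0xE0, 0xE0, 0xE0, 0xE0, 0xF1, 0xF1, 0xF1, 0xF1 },
  { 0x1F, 0x1F, 0x1F, 0x1F, 0x0E, 0x0E, 0x0E, 0x0E },
};

/* Hypothetical helper: return 1 if KEY equals one of the four weak
   keys once the parity bit (bit 0) of each byte is masked off,
   otherwise return 0.  */
static int
sketch_is_weak_des_key (const uint8_t key[8])
{
  for (size_t i = 0; i < 4; i++)
    {
      int match = 1;

      for (size_t j = 0; j < 8; j++)
        if ((key[j] & 0xFE) != (sketch_weak_keys[i][j] & 0xFE))
          match = 0;
      if (match)
        return 1;
    }
  return 0;
}
```

Calling sketch_is_weak_des_key on eight zero bytes returns 1, because
0x00 and 0x01 differ only in the parity bit, while an ordinary key
such as 13 34 57 79 9B BC DF F1 returns 0.
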
Thanks & Cheers,
Jan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: change-weak-keys-for-kcv.patch
Type: application/octet-stream
Size: 2468 bytes
Desc: change-weak-keys-for-kcv.patch
URL:

From tianjia.zhang at linux.alibaba.com  Sun Dec 22 10:15:33 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Sun, 22 Dec 2019 17:15:33 +0800
Subject: [PATCH 3/3] ecc: Wrong flag and elements_enc fix.
In-Reply-To: <20191222091533.2587-1-tianjia.zhang@linux.alibaba.com>
References: <20191222091533.2587-1-tianjia.zhang@linux.alibaba.com>
Message-ID: <20191222091533.2587-4-tianjia.zhang@linux.alibaba.com>

* cipher/ecc.c (ecc_generate): Fix wrong flag and elements_enc.

Signed-off-by: Tianjia Zhang
---
 cipher/ecc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/cipher/ecc.c b/cipher/ecc.c
index 921510cc..10e11243 100644
--- a/cipher/ecc.c
+++ b/cipher/ecc.c
@@ -577,7 +577,7 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey)
         (&curve_flags, NULL,
          ((flags & PUBKEY_FLAG_PARAM) && (flags & PUBKEY_FLAG_EDDSA))?
          "(flags param eddsa)" :
-         ((flags & PUBKEY_FLAG_PARAM) && (flags & PUBKEY_FLAG_EDDSA))?
+         ((flags & PUBKEY_FLAG_PARAM) && (flags & PUBKEY_FLAG_DJB_TWEAK))?
          "(flags param djb-tweak)" :
          ((flags & PUBKEY_FLAG_PARAM))?
          "(flags param)" : ((flags & PUBKEY_FLAG_EDDSA))?
@@ -1712,7 +1712,7 @@ gcry_pk_spec_t _gcry_pubkey_spec_ecc =
     GCRY_PK_ECC, { 0, 1 },
     (GCRY_PK_USAGE_SIGN | GCRY_PK_USAGE_ENCR),
     "ECC", ecc_names,
-    "pabgnhq", "pabgnhqd", "sw", "rs", "pabgnhq",
+    "pabgnhq", "pabgnhqd", "se", "rs", "pabgnhq",
     ecc_generate,
     ecc_check_secret_key,
     ecc_encrypt_raw,
-- 
2.17.1

From tianjia.zhang at linux.alibaba.com  Sun Dec 22 10:15:30 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Sun, 22 Dec 2019 17:15:30 +0800
Subject: [PATCH] Fix three errors for ec algorithm
Message-ID: <20191222091533.2587-1-tianjia.zhang@linux.alibaba.com>

Fix three errors in the EC algorithm and mpi.

From tianjia.zhang at linux.alibaba.com  Sun Dec 22 10:15:31 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Sun, 22 Dec 2019 17:15:31 +0800
Subject: [PATCH 1/3] Update .gitignore
In-Reply-To: <20191222091533.2587-1-tianjia.zhang@linux.alibaba.com>
References: <20191222091533.2587-1-tianjia.zhang@linux.alibaba.com>
Message-ID: <20191222091533.2587-2-tianjia.zhang@linux.alibaba.com>

Signed-off-by: Tianjia Zhang
---
 .gitignore | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/.gitignore b/.gitignore
index 704d3ca0..99741c18 100644
--- a/.gitignore
+++ b/.gitignore
@@ -32,6 +32,8 @@ cipher/libcipher.la
 compat/Makefile
 compat/libcompat.la
 doc/gcrypt.info
+doc/gcrypt.info-1
+doc/gcrypt.info-2
 doc/stamp-vti
 doc/version.texi
 doc/Makefile
@@ -65,6 +67,7 @@ src/gcrypt.h
 src/hmac256
 src/libgcrypt-config
 src/libgcrypt.la
+src/libgcrypt.pc
 src/mpicalc
 src/versioninfo.rc
 src/*.exe
@@ -103,6 +106,8 @@ tests/t-lock
 tests/t-mpi-bit
 tests/t-mpi-point
 tests/t-sexp
+tests/t-secmem
+tests/t-x448
 tests/tsexp
 tests/version
 tests/*.exe
-- 
2.17.1

From tianjia.zhang at linux.alibaba.com  Sun Dec 22 10:15:32 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Sun, 22 Dec 2019 17:15:32 +0800
Subject: [PATCH 2/3] mpi: fix missing fields in an empty point and the mpi_clear requires a non-empty argument.
In-Reply-To: <20191222091533.2587-1-tianjia.zhang@linux.alibaba.com>
References: <20191222091533.2587-1-tianjia.zhang@linux.alibaba.com>
Message-ID: <20191222091533.2587-3-tianjia.zhang@linux.alibaba.com>

* mpi/ec.c (_gcry_mpi_point_set): Assign value to missing fields.

The problem is triggered by code like the following in
mpi_ec_get_elliptic_curve:

    elliptic_curve_t E;
    memset (&E, 0, sizeof E);
    mpi_point_set (&E.G, G->x, G->y, G->z);

Signed-off-by: Tianjia Zhang
---
 mpi/ec.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mpi/ec.c b/mpi/ec.c
index d4c4f953..94d93354 100644
--- a/mpi/ec.c
+++ b/mpi/ec.c
@@ -224,16 +224,16 @@ _gcry_mpi_point_set (mpi_point_t point,
     point = mpi_point_new (0);
 
   if (x)
-    mpi_set (point->x, x);
-  else
+    point->x = mpi_set (point->x, x);
+  else if (point->x)
     mpi_clear (point->x);
   if (y)
-    mpi_set (point->y, y);
-  else
+    point->y = mpi_set (point->y, y);
+  else if (point->y)
     mpi_clear (point->y);
   if (z)
-    mpi_set (point->z, z);
-  else
+    point->z = mpi_set (point->z, z);
+  else if (point->z)
     mpi_clear (point->z);
 
   return point;
-- 
2.17.1

From tianjia.zhang at linux.alibaba.com  Sun Dec 22 10:20:10 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Sun, 22 Dec 2019 17:20:10 +0800
Subject: [PATCH 1/2] ecc: Export ecc common functions
In-Reply-To: <20191222092011.2758-1-tianjia.zhang@linux.alibaba.com>
References: <20191222092011.2758-1-tianjia.zhang@linux.alibaba.com>
Message-ID: <20191222092011.2758-2-tianjia.zhang@linux.alibaba.com>

Some ECC-based public key algorithms, such as SM2, use these
functions.

* cipher/ecc.c: Export common functions and add '_gcry_ecc' prefix.
* cipher/pubkey-internal.h: Add declarations for ecc common functions.

Signed-off-by: Tianjia Zhang --- cipher/ecc.c | 37 ++++++++++++++++++------------------- cipher/pubkey-internal.h | 7 +++++++ 2 files changed, 25 insertions(+), 19 deletions(-) diff --git a/cipher/ecc.c b/cipher/ecc.c index 10e11243..3acda018 100644 --- a/cipher/ecc.c +++ b/cipher/ecc.c @@ -101,7 +101,6 @@ static void *progress_cb_data; /* Local prototypes. */ static void test_keys (mpi_ec_t ec, unsigned int nbits); static void test_ecdh_only_keys (mpi_ec_t ec, unsigned int nbits, int flags); -static unsigned int ecc_get_nbits (gcry_sexp_t parms); @@ -125,7 +124,7 @@ _gcry_register_pk_ecc_progress (void (*cb) (void *, const char *, /** - * nist_generate_key - Standard version of the ECC key generation. + * _gcry_ecc_nist_generate_key - Standard version of the ECC key generation. * @ec: Elliptic curve computation context. * @flags: Flags controlling aspects of the creation. * @r_x: On success this receives an allocated MPI with the affine @@ -140,8 +139,8 @@ _gcry_register_pk_ecc_progress (void (*cb) (void *, const char *, * * FIXME: Check whether N is needed. 
*/ -static gpg_err_code_t -nist_generate_key (mpi_ec_t ec, int flags, +gpg_err_code_t +_gcry_ecc_nist_generate_key (mpi_ec_t ec, int flags, gcry_mpi_t *r_x, gcry_mpi_t *r_y) { mpi_point_struct Q; @@ -513,11 +512,11 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) goto leave; if (ec->model == MPI_EC_MONTGOMERY) - rc = nist_generate_key (ec, flags, &Qx, NULL); + rc = _gcry_ecc_nist_generate_key (ec, flags, &Qx, NULL); else if ((flags & PUBKEY_FLAG_EDDSA)) rc = _gcry_ecc_eddsa_genkey (ec, flags); else - rc = nist_generate_key (ec, flags, &Qx, &Qy); + rc = _gcry_ecc_nist_generate_key (ec, flags, &Qx, &Qy); if (rc) goto leave; @@ -642,8 +641,8 @@ ecc_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) } -static gcry_err_code_t -ecc_check_secret_key (gcry_sexp_t keyparms) +gcry_err_code_t +_gcry_ecc_check_secret_key (gcry_sexp_t keyparms) { gcry_err_code_t rc; int flags = 0; @@ -758,7 +757,7 @@ ecc_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t s_keyparms) int flags; _gcry_pk_util_init_encoding_ctx (&ctx, PUBKEY_OP_VERIFY, - ecc_get_nbits (s_keyparms)); + _gcry_ecc_get_nbits (s_keyparms)); /* Extract the data. */ rc = _gcry_pk_util_data_to_mpi (s_data, &data, &ctx); @@ -891,7 +890,7 @@ ecc_encrypt_raw (gcry_sexp_t *r_ciph, gcry_sexp_t s_data, gcry_sexp_t keyparms) int no_error_on_infinity; _gcry_pk_util_init_encoding_ctx (&ctx, PUBKEY_OP_ENCRYPT, - (nbits = ecc_get_nbits (keyparms))); + (nbits = _gcry_ecc_get_nbits (keyparms))); /* * Extract the key. @@ -1059,7 +1058,7 @@ ecc_decrypt_raw (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) point_init (&R); _gcry_pk_util_init_encoding_ctx (&ctx, PUBKEY_OP_DECRYPT, - (nbits = ecc_get_nbits (keyparms))); + (nbits = _gcry_ecc_get_nbits (keyparms))); /* * Extract the data. @@ -1224,8 +1223,8 @@ ecc_decrypt_raw (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) * * More parameters may be given. Either P or CURVE is needed. 
*/ -static unsigned int -ecc_get_nbits (gcry_sexp_t parms) +unsigned int +_gcry_ecc_get_nbits (gcry_sexp_t parms) { gcry_sexp_t l1; gcry_mpi_t p; @@ -1263,8 +1262,8 @@ ecc_get_nbits (gcry_sexp_t parms) /* See rsa.c for a description of this function. */ -static gpg_err_code_t -compute_keygrip (gcry_md_hd_t md, gcry_sexp_t keyparms) +gpg_err_code_t +_gcry_ecc_compute_keygrip (gcry_md_hd_t md, gcry_sexp_t keyparms) { #define N_COMPONENTS 6 static const char names[N_COMPONENTS] = "pabgnq"; @@ -1667,7 +1666,7 @@ selftests_ecdsa (selftest_report_func_t report) } what = "key consistency"; - err = ecc_check_secret_key(skey); + err = _gcry_ecc_check_secret_key(skey); if (err) { errtxt = _gcry_strerror (err); @@ -1714,14 +1713,14 @@ gcry_pk_spec_t _gcry_pubkey_spec_ecc = "ECC", ecc_names, "pabgnhq", "pabgnhqd", "se", "rs", "pabgnhq", ecc_generate, - ecc_check_secret_key, + _gcry_ecc_check_secret_key, ecc_encrypt_raw, ecc_decrypt_raw, ecc_sign, ecc_verify, - ecc_get_nbits, + _gcry_ecc_get_nbits, run_selftests, - compute_keygrip, + _gcry_ecc_compute_keygrip, _gcry_ecc_get_curve, _gcry_ecc_get_param_sexp }; diff --git a/cipher/pubkey-internal.h b/cipher/pubkey-internal.h index d31e26f3..8c2c58e0 100644 --- a/cipher/pubkey-internal.h +++ b/cipher/pubkey-internal.h @@ -98,6 +98,13 @@ gpg_err_code_t _gcry_dsa_normalize_hash (gcry_mpi_t input, unsigned int qbits); /*-- ecc.c --*/ +gpg_err_code_t +_gcry_ecc_nist_generate_key (mpi_ec_t ec, int flags, + gcry_mpi_t *r_x, gcry_mpi_t *r_y); +gcry_err_code_t _gcry_ecc_check_secret_key (gcry_sexp_t keyparms); +unsigned int _gcry_ecc_get_nbits (gcry_sexp_t parms); +gpg_err_code_t _gcry_ecc_compute_keygrip (gcry_md_hd_t md, + gcry_sexp_t keyparms); gpg_err_code_t _gcry_pk_ecc_get_sexp (gcry_sexp_t *r_sexp, int mode, mpi_ec_t ec); -- 2.17.1 From tianjia.zhang at linux.alibaba.com Sun Dec 22 10:20:11 2019 From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang) Date: Sun, 22 Dec 2019 17:20:11 +0800 Subject: [PATCH 2/2] Add crypto pubkey SM2. 
In-Reply-To: <20191222092011.2758-1-tianjia.zhang@linux.alibaba.com> References: <20191222092011.2758-1-tianjia.zhang@linux.alibaba.com> Message-ID: <20191222092011.2758-3-tianjia.zhang@linux.alibaba.com> * cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add sm2.c. * cipher/ecc-curves.c (domain_parms): Add sm2p256v1 for SM2. * cipher/pubkey.c [USE_SM2] (pubkey_list): Add _gcry_pubkey_spec_sm2. * cipher/sm2.c: New. * configure.ac (available_pubkey_ciphers): Add sm2. * src/cipher.h: Add declarations for SM2. * src/fips.c (algos): Add GCRY_PK_SM2. * src/gcrypt.h.in (gcry_pk_algos): Add algorithm ID for SM2. * tests/basic.c (check_pubkey): Add test cases for SM2. * tests/curves.c (N_CURVES): Update N_CURVES for SM2. Signed-off-by: Tianjia Zhang --- cipher/Makefile.am | 1 + cipher/ecc-curves.c | 14 + cipher/pubkey.c | 3 + cipher/sm2.c | 1161 +++++++++++++++++++++++++++++++++++++++++++ configure.ac | 8 +- src/cipher.h | 1 + src/fips.c | 1 + src/gcrypt.h.in | 3 +- tests/basic.c | 130 ++++- tests/curves.c | 2 +- 10 files changed, 1315 insertions(+), 9 deletions(-) create mode 100644 cipher/sm2.c diff --git a/cipher/Makefile.am b/cipher/Makefile.am index 020a9616..2cfb29e1 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -91,6 +91,7 @@ EXTRA_libcipher_la_SOURCES = \ idea.c \ gost28147.c gost.h \ gostr3411-94.c \ + sm2.c \ md4.c \ md5.c \ rijndael.c rijndael-internal.h rijndael-tables.h \ diff --git a/cipher/ecc-curves.c b/cipher/ecc-curves.c index 52872c5e..1592d23a 100644 --- a/cipher/ecc-curves.c +++ b/cipher/ecc-curves.c @@ -115,6 +115,8 @@ static const struct { "secp256k1", "1.3.132.0.10" }, + { "sm2p256v1", "1.2.156.10197.1.301" }, + { NULL, NULL} }; @@ -512,6 +514,18 @@ static const ecc_domain_parms_t domain_parms[] = 1 }, + { + "sm2p256v1", 256, 0, + MPI_EC_WEIERSTRASS, ECC_DIALECT_STANDARD, + "0xfffffffeffffffffffffffffffffffffffffffff00000000ffffffffffffffff", + "0xfffffffeffffffffffffffffffffffffffffffff00000000fffffffffffffffc", + 
"0x28e9fa9e9d9f5e344d5a9e4bcf6509a7f39789f515ab8f92ddbcbd414d940e93", + "0xfffffffeffffffffffffffffffffffff7203df6b21c6052b53bbf40939d54123", + "0x32c4ae2c1f1981195f9904466a39c9948fe30bbff2660be1715a4589334c74c7", + "0xbc3736a2f4f6779c59bdcee36b692153d0a9877cc62a474002df32e52139f0a0", + 1 + }, + { NULL, 0, 0, 0, 0, NULL, NULL, NULL, NULL, NULL } }; diff --git a/cipher/pubkey.c b/cipher/pubkey.c index 4c07e33b..1c3836bc 100644 --- a/cipher/pubkey.c +++ b/cipher/pubkey.c @@ -47,6 +47,9 @@ static gcry_pk_spec_t * const pubkey_list[] = #endif #if USE_ELGAMAL &_gcry_pubkey_spec_elg, +#endif +#if USE_SM2 + &_gcry_pubkey_spec_sm2, #endif NULL }; diff --git a/cipher/sm2.c b/cipher/sm2.c new file mode 100644 index 00000000..8b7d6bec --- /dev/null +++ b/cipher/sm2.c @@ -0,0 +1,1161 @@ +/* sm2.c - SM2 implementation + * Copyright (C) 2019 Tianjia Zhang + * + * This file is part of Libgcrypt. + * + * Libgcrypt is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as + * published by the Free Software Foundation; either version 2.1 of + * the License, or (at your option) any later version. + * + * Libgcrypt is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this program; if not, see . 
+ */ + +#include +#include +#include +#include +#include + +#include "g10lib.h" +#include "bithelp.h" +#include "mpi.h" +#include "cipher.h" +#include "context.h" +#include "ec-context.h" +#include "pubkey-internal.h" +#include "ecc-common.h" + +#define MPI_NBYTES(m) ((mpi_get_nbits(m) + 7) / 8) + + +static const char *sm2_names[] = + { + "sm2", + "1.2.156.10197.1.301", + NULL, + }; + + + +/********************************************* + ************** interface ****************** + *********************************************/ + +static gcry_err_code_t +sm2_generate (const gcry_sexp_t genparms, gcry_sexp_t *r_skey) +{ + gpg_err_code_t rc; + gcry_mpi_t Gx = NULL; + gcry_mpi_t Gy = NULL; + gcry_mpi_t Qx = NULL; + gcry_mpi_t Qy = NULL; + mpi_ec_t ec = NULL; + gcry_sexp_t curve_info = NULL; + gcry_sexp_t curve_flags = NULL; + gcry_mpi_t base = NULL; + gcry_mpi_t public = NULL; + int flags = 0; + + rc = _gcry_mpi_ec_internal_new (&ec, &flags, "ecgen curve", genparms, NULL); + if (rc) + goto leave; + + rc = _gcry_ecc_nist_generate_key (ec, flags, &Qx, &Qy); + if (rc) + goto leave; + + /* Copy data to the result. 
*/ + Gx = mpi_new (0); + Gy = mpi_new (0); + if (_gcry_mpi_ec_get_affine (Gx, Gy, ec->G, ec)) + log_fatal ("ecgen: Failed to get affine coordinates for %s\n", "G"); + base = _gcry_ecc_ec2os (Gx, Gy, ec->p); + + if (!Qx) + { + Qx = mpi_new (0); + Qy = mpi_new (0); + if (_gcry_mpi_ec_get_affine (Qx, Qy, ec->Q, ec)) + log_fatal ("ecgen: Failed to get affine coordinates for %s\n", "Q"); + } + public = _gcry_ecc_ec2os (Qx, Qy, ec->p); + + if (ec->name) + { + rc = sexp_build (&curve_info, NULL, "(curve %s)", ec->name); + if (rc) + goto leave; + } + + if (flags & PUBKEY_FLAG_PARAM) + { + rc = sexp_build (&curve_flags, NULL, "(flags param)"); + if (rc) + goto leave; + } + + if ((flags & PUBKEY_FLAG_PARAM) && ec->name) + rc = sexp_build (r_skey, NULL, + "(key-data" + " (public-key" + " (sm2%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(h%u)(q%m)))" + " (private-key" + " (sm2%S%S(p%m)(a%m)(b%m)(g%m)(n%m)(h%u)(q%m)(d%m)))" + " )", + curve_info, curve_flags, + ec->p, ec->a, ec->b, base, ec->n, ec->h, public, + curve_info, curve_flags, + ec->p, ec->a, ec->b, base, ec->n, ec->h, public, + ec->d); + else + rc = sexp_build (r_skey, NULL, + "(key-data" + " (public-key" + " (sm2%S%S(q%m)))" + " (private-key" + " (sm2%S%S(q%m)(d%m)))" + " )", + curve_info, curve_flags, public, + curve_info, curve_flags, public, ec->d); + if (rc) + goto leave; + + if (DBG_CIPHER) + { + log_printmpi ("ecgen result p", ec->p); + log_printmpi ("ecgen result a", ec->a); + log_printmpi ("ecgen result b", ec->b); + log_printmpi ("ecgen result G", base); + log_printmpi ("ecgen result n", ec->n); + log_debug ("ecgen result h:+%02x\n", ec->h); + log_printmpi ("ecgen result Q", public); + log_printmpi ("ecgen result d", ec->d); + } + + leave: + mpi_free (public); + mpi_free (base); + mpi_free (Gx); + mpi_free (Gy); + mpi_free (Qx); + mpi_free (Qy); + _gcry_mpi_ec_free (ec); + sexp_release (curve_flags); + sexp_release (curve_info); + return rc; +} + + +/* Key derivation function from X9.63/SECG */ +static gcry_err_code_t 
+kdf_x9_63 (int algo, const void *in, size_t inlen, void *out, size_t outlen) +{ + gcry_err_code_t rc; + gcry_md_hd_t hd; + int mdlen; + u32 counter = 1; + u32 counter_be; + unsigned char *dgst; + unsigned char *pout = out; + size_t rlen = outlen; + size_t len; + + rc = _gcry_md_open (&hd, algo, 0); + if (rc) + return rc; + + mdlen = _gcry_md_get_algo_dlen (algo); + + while (rlen > 0) + { + counter_be = be_bswap32 (counter); /* cpu_to_be32 */ + counter++; + + _gcry_md_write (hd, in, inlen); + _gcry_md_write (hd, &counter_be, sizeof(counter_be)); + + dgst = _gcry_md_read (hd, algo); + if (dgst == NULL) + { + rc = GPG_ERR_DIGEST_ALGO; + break; + } + + len = mdlen < rlen ? mdlen : rlen; /* min(mdlen, rlen) */ + memcpy (pout, dgst, len); + rlen -= len; + pout += len; + + _gcry_md_reset (hd); + } + + _gcry_md_close (hd); + return rc; +} + + +/* sm2_encrypt description: + * input: + * data[0] : octet string + * output: A new S-expression with the parameters: + * a: c1 : generated ephemeral public key (kG) + * b: c3 : Hash(x2 || IN || y2) + * c: c2 : cipher + * + * sm2_decrypt description: + * in contrast to encrypt + */ +static gcry_err_code_t +sm2_encrypt (gcry_sexp_t *r_ciph, gcry_sexp_t s_data, gcry_sexp_t keyparms) +{ + gcry_err_code_t rc; + struct pk_encoding_ctx ctx; + gcry_mpi_t data = NULL; + mpi_ec_t ec = NULL; + int flags = 0; + + _gcry_pk_util_init_encoding_ctx (&ctx, PUBKEY_OP_ENCRYPT, + _gcry_ecc_get_nbits (keyparms)); + + /* Extract the key. */ + rc = _gcry_mpi_ec_internal_new (&ec, &flags, "sm2_encrypt", keyparms, NULL); + if (rc) + goto leave; + + /* Extract the data. 
*/ + rc = _gcry_pk_util_data_to_mpi (s_data, &data, &ctx); + if (rc) + goto leave; + + if (DBG_CIPHER) + log_mpidump ("sm2_encrypt data", data); + + if (!ec->p || !ec->a || !ec->b || !ec->G || !ec->n || !ec->Q) + { + rc = GPG_ERR_NO_OBJ; + goto leave; + } + + { + const int algo = GCRY_MD_SM3; + gcry_md_hd_t md = NULL; + int mdlen; + unsigned char *dgst; + gcry_mpi_t k = NULL; + mpi_point_struct kG, kP; + gcry_mpi_t x1, y1; + gcry_mpi_t x2, y2; + gcry_mpi_t x2y2 = NULL; + unsigned char *in = NULL; + unsigned int inlen; + unsigned char *raw; + unsigned int rawlen; + unsigned char *cipher = NULL; + int i; + + point_init (&kG); + point_init (&kP); + x1 = mpi_new (0); + y1 = mpi_new (0); + x2 = mpi_new (0); + y2 = mpi_new (0); + + in = _gcry_mpi_get_buffer (data, 0, &inlen, NULL); + if (!in) + { + rc = gpg_err_code_from_syserror (); + goto leave_main; + } + + cipher = xtrymalloc (inlen); + if (!cipher) + { + rc = gpg_err_code_from_syserror (); + goto leave_main; + } + + /* rand k in [1, n-1] */ + k = _gcry_dsa_gen_k (ec->n, GCRY_VERY_STRONG_RANDOM); + + /* [k]G = (x1, y1) */ + _gcry_mpi_ec_mul_point (&kG, k, ec->G, ec); + if (_gcry_mpi_ec_get_affine (x1, y1, &kG, ec)) + { + if (DBG_CIPHER) + log_debug ("Bad check: kG can not be a Point at Infinity!\n"); + rc = GPG_ERR_INV_DATA; + goto leave_main; + } + + /* [k]P = (x2, y2) */ + _gcry_mpi_ec_mul_point (&kP, k, ec->Q, ec); + if (_gcry_mpi_ec_get_affine (x2, y2, &kP, ec)) + { + rc = GPG_ERR_INV_DATA; + goto leave_main; + } + + /* t = KDF(x2 || y2, klen) */ + x2y2 = _gcry_mpi_ec_ec2os (&kP, ec); + raw = mpi_get_opaque (x2y2, &rawlen); + rawlen = (rawlen + 7) / 8; + + + /* skip the prefix '0x04' */ + raw += 1; + rawlen -= 1; + rc = kdf_x9_63 (algo, raw, rawlen, cipher, inlen); + if (rc) + goto leave_main; + + /* cipher = t xor in */ + for (i = 0; i < inlen; i++) + cipher[i] ^= in[i]; + + /* hash(x2 || IN || y2) */ + mdlen = _gcry_md_get_algo_dlen (algo); + rc = _gcry_md_open (&md, algo, 0); + if (rc) + goto leave_main; + 
_gcry_md_write (md, raw, MPI_NBYTES(x2)); + _gcry_md_write (md, in, inlen); + _gcry_md_write (md, raw + MPI_NBYTES(x2), MPI_NBYTES(y2)); + dgst = _gcry_md_read (md, algo); + if (dgst == NULL) + { + rc = GPG_ERR_DIGEST_ALGO; + goto leave_main; + } + + if (!rc) + { + gcry_mpi_t c1; + gcry_mpi_t c3; + gcry_mpi_t c2; + + c3 = mpi_new (0); + c2 = mpi_new (0); + + c1 = _gcry_ecc_ec2os (x1, y1, ec->p); + _gcry_mpi_set_opaque_copy (c3, dgst, mdlen * 8); + _gcry_mpi_set_opaque_copy (c2, cipher, inlen * 8); + + rc = sexp_build (r_ciph, NULL, "(enc-val(sm2(a%M)(b%M)(c%M)))", + c1, c3, c2); + + mpi_free (c1); + mpi_free (c3); + mpi_free (c2); + } + + leave_main: + _gcry_md_close (md); + mpi_free (x2y2); + mpi_free (k); + + point_free (&kG); + point_free (&kP); + mpi_free (x1); + mpi_free (y1); + mpi_free (x2); + mpi_free (y2); + + xfree (cipher); + xfree (in); + } + + leave: + _gcry_mpi_release (data); + _gcry_mpi_ec_free (ec); + _gcry_pk_util_free_encoding_ctx (&ctx); + if (DBG_CIPHER) + log_debug ("sm2_encrypt => %s\n", gpg_strerror (rc)); + return rc; +} + + +static gcry_err_code_t +sm2_decrypt (gcry_sexp_t *r_plain, gcry_sexp_t s_data, gcry_sexp_t keyparms) +{ + gcry_err_code_t rc; + struct pk_encoding_ctx ctx; + gcry_sexp_t l1 = NULL; + gcry_mpi_t data_c1 = NULL; + gcry_mpi_t data_c3 = NULL; + gcry_mpi_t data_c2 = NULL; + mpi_ec_t ec = NULL; + int flags = 0; + + _gcry_pk_util_init_encoding_ctx (&ctx, PUBKEY_OP_DECRYPT, + _gcry_ecc_get_nbits (keyparms)); + + /* extract the data */ + rc = _gcry_pk_util_preparse_encval (s_data, sm2_names, &l1, &ctx); + if (rc) + goto leave; + if (ctx.encoding != PUBKEY_ENC_UNKNOWN) + { + rc = GPG_ERR_ENCODING_PROBLEM; + goto leave; + } + + rc = sexp_extract_param (l1, NULL, "/a/b/c", &data_c1, &data_c3, &data_c2, NULL); + if (rc) + goto leave; + + /* extract the key */ + rc = _gcry_mpi_ec_internal_new (&ec, &flags, "sm2_decrypt", keyparms, NULL); + + if (!ec->p || !ec->a || !ec->b || !ec->G || !ec->n || !ec->d) + { + rc = GPG_ERR_NO_OBJ; + 
goto leave; + } + + { + const int algo = GCRY_MD_SM3; + gcry_md_hd_t md = NULL; + int mdlen; + unsigned char *dgst; + mpi_point_struct c1; + mpi_point_struct kP; + gcry_mpi_t x2, y2; + gcry_mpi_t x2y2 = NULL; + unsigned char *in = NULL; + unsigned int inlen; + unsigned char *plain = NULL; + unsigned char *raw; + unsigned int rawlen; + unsigned char *c3 = NULL; + unsigned int c3_len; + int i; + + point_init (&c1); + point_init (&kP); + x2 = mpi_new (0); + y2 = mpi_new (0); + + in = mpi_get_opaque (data_c2, &inlen); + inlen = (inlen + 7) / 8; + plain = xtrymalloc (inlen); + if (!plain) + { + rc = gpg_err_code_from_syserror (); + goto leave_main; + } + + rc = _gcry_ecc_os2ec (&c1, data_c1); + if (rc) + goto leave_main; + + if (!_gcry_mpi_ec_curve_point (&c1, ec)) + { + rc = GPG_ERR_INV_DATA; + goto leave_main; + } + + /* [d]C1 = (x2, y2), C1 = [k]G */ + _gcry_mpi_ec_mul_point (&kP, ec->d, &c1, ec); + if (_gcry_mpi_ec_get_affine (x2, y2, &kP, ec)) + { + rc = GPG_ERR_INV_DATA; + goto leave_main; + } + + /* t = KDF(x2 || y2, inlen) */ + x2y2 = _gcry_mpi_ec_ec2os (&kP, ec); + raw = mpi_get_opaque (x2y2, &rawlen); + rawlen = (rawlen + 7) / 8; + /* skip the prefix '0x04' */ + raw += 1; + rawlen -= 1; + rc = kdf_x9_63 (algo, raw, rawlen, plain, inlen); + if (rc) + goto leave_main; + + /* plain = C2 xor t */ + for (i = 0; i < inlen; i++) + plain[i] ^= in[i]; + + /* Hash(x2 || IN || y2) == C3 */ + mdlen = _gcry_md_get_algo_dlen (algo); + rc = _gcry_md_open (&md, algo, 0); + if (rc) + goto leave_main; + _gcry_md_write (md, raw, MPI_NBYTES(x2)); + _gcry_md_write (md, plain, inlen); + _gcry_md_write (md, raw + MPI_NBYTES(x2), MPI_NBYTES(y2)); + dgst = _gcry_md_read (md, algo); + if (dgst == NULL) + { + memset (plain, 0, inlen); + rc = GPG_ERR_DIGEST_ALGO; + goto leave_main; + } + c3 = mpi_get_opaque (data_c3, &c3_len); + c3_len = (c3_len + 7) / 8; + if (c3_len != mdlen || memcmp (dgst, c3, c3_len) != 0) + { + memset (plain, 0, inlen); + rc = GPG_ERR_INV_DATA; + goto leave_main; + 
} + + if (!rc) + { + gcry_mpi_t r; + + r = mpi_new (inlen * 8); + _gcry_mpi_set_buffer (r, plain, inlen, 0); + + rc = sexp_build (r_plain, NULL, "(value %m)", r); + + mpi_free (r); + } + + leave_main: + _gcry_md_close (md); + mpi_free (x2y2); + xfree (plain); + + point_free (&c1); + point_free (&kP); + mpi_free (x2); + mpi_free (y2); + } + + leave: + _gcry_mpi_release (data_c1); + _gcry_mpi_release (data_c3); + _gcry_mpi_release (data_c2); + _gcry_mpi_ec_free (ec); + sexp_release (l1); + _gcry_pk_util_free_encoding_ctx (&ctx); + if (DBG_CIPHER) + log_debug ("sm2_decrypt => %s\n", gpg_strerror (rc)); + return rc; +} + + +static gcry_err_code_t +sm2_sign (gcry_sexp_t *r_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) +{ + gcry_err_code_t rc; + struct pk_encoding_ctx ctx; + gcry_mpi_t data = NULL; + gcry_mpi_t hash = NULL; + mpi_ec_t ec = NULL; + int flags; + + _gcry_pk_util_init_encoding_ctx (&ctx, PUBKEY_OP_SIGN, 0); + + /* Extract the data */ + rc = _gcry_pk_util_data_to_mpi (s_data, &data, &ctx); + if (rc) + goto leave; + if (mpi_is_opaque(data)) + { + const void *buf; + unsigned int nbits; + buf = mpi_get_opaque (data, &nbits); + rc = _gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, buf, (nbits + 7) / 8, NULL); + if (rc) + goto leave; + } + else + hash = data; + + /* Extract the key */ + rc = _gcry_mpi_ec_internal_new (&ec, &flags, "sm2_sign", keyparms, NULL); + if (rc) + goto leave; + if (!ec->p || !ec->a || !ec->b || !ec->G || !ec->n || !ec->d) + { + rc = GPG_ERR_NO_OBJ; + goto leave; + } + + { + gcry_mpi_t sig_r = NULL; + gcry_mpi_t sig_s = NULL; + gcry_mpi_t tmp = NULL; + gcry_mpi_t k = NULL; + gcry_mpi_t rk = NULL; + mpi_point_struct kG; + gcry_mpi_t x1; + + point_init (&kG); + x1 = mpi_new (0); + sig_r = mpi_new (0); + sig_s = mpi_new (0); + rk = mpi_new (0); + tmp = mpi_new (0); + + for (;;) + { + /* rand k in [1, n-1] */ + k = _gcry_dsa_gen_k (ec->n, GCRY_VERY_STRONG_RANDOM); + + /* [k]G = (x1, y1) */ + _gcry_mpi_ec_mul_point (&kG, k, ec->G, ec); + if 
(_gcry_mpi_ec_get_affine (x1, NULL, &kG, ec)) + { + rc = GPG_ERR_INV_DATA; + goto leave_main; + } + + /* r = (e + x1) % n */ + mpi_addm (sig_r, hash, x1, ec->n); + + /* r != 0 && r + k != n */ + if (mpi_cmp_ui (sig_r, 0) == 0) + continue; + mpi_add (rk, sig_r, k); + if (mpi_cmp (rk, ec->n) == 0) + continue; + + /* s = ((d + 1)^-1 * (k - rd)) % n */ + mpi_addm (sig_s, ec->d, GCRYMPI_CONST_ONE, ec->n); + mpi_invm (sig_s, sig_s, ec->n); + mpi_mulm (tmp, sig_r, ec->d, ec->n); + mpi_subm (tmp, k, tmp, ec->n); + mpi_mulm (sig_s, sig_s, tmp, ec->n); + + break; + } + + rc = sexp_build (r_sig, NULL, "(sig-val(sm2(r%M)(s%M)))", sig_r, sig_s); + + leave_main: + point_free (&kG); + mpi_free (x1); + mpi_free (k); + mpi_free (rk); + mpi_free (sig_r); + mpi_free (sig_s); + mpi_free (tmp); + } + + leave: + _gcry_mpi_ec_free (ec); + if (hash != data) + mpi_free (hash); + mpi_free (data); + _gcry_pk_util_free_encoding_ctx (&ctx); + if (DBG_CIPHER) + log_debug ("sm2_sign => %s\n", gpg_strerror (rc)); + return rc; +} + + +static gcry_err_code_t +sm2_verify (gcry_sexp_t s_sig, gcry_sexp_t s_data, gcry_sexp_t keyparms) +{ + gcry_err_code_t rc; + struct pk_encoding_ctx ctx; + gcry_sexp_t l1 = NULL; + gcry_mpi_t data = NULL; + gcry_mpi_t hash = NULL; + gcry_mpi_t sig_r = NULL; + gcry_mpi_t sig_s = NULL; + mpi_ec_t ec = NULL; + int sigflags; + int flags; + + _gcry_pk_util_init_encoding_ctx (&ctx, PUBKEY_OP_VERIFY, + _gcry_ecc_get_nbits (keyparms)); + + /* Extract the data */ + rc = _gcry_pk_util_data_to_mpi (s_data, &data, &ctx); + if (rc) + goto leave; + if (mpi_is_opaque (data)) + { + const void *buf; + unsigned int nbits; + buf = mpi_get_opaque (data, &nbits); + rc = _gcry_mpi_scan (&hash, GCRYMPI_FMT_USG, buf, (nbits + 7) / 8, NULL); + if (rc) + goto leave; + } + else + hash = data; + + rc = _gcry_pk_util_preparse_sigval (s_sig, sm2_names, &l1, &sigflags); + if (rc) + goto leave; + rc = sexp_extract_param (l1, NULL, "rs", &sig_r, &sig_s, NULL); + if (rc) + goto leave; + + /* Extract 
the key */ + rc = _gcry_mpi_ec_internal_new (&ec, &flags, "sm2_verify", keyparms, NULL); + if (rc) + goto leave; + if (!ec->p || !ec->a || !ec->b || !ec->G || !ec->n || !ec->Q) + { + rc = GPG_ERR_NO_OBJ; + goto leave; + } + + { + gcry_mpi_t t = NULL; + mpi_point_struct sG, tP; + gcry_mpi_t x1, y1; + + point_init (&sG); + point_init (&tP); + x1 = mpi_new (0); + y1 = mpi_new (0); + t = mpi_new (0); + + /* r, s in [1, n-1] */ + if (mpi_cmp_ui (sig_r, 1) < 0 || mpi_cmp (sig_r, ec->n) > 0 || + mpi_cmp_ui (sig_s, 1) < 0 || mpi_cmp (sig_s, ec->n) > 0) + { + rc = GPG_ERR_BAD_SIGNATURE; + goto leave_main; + } + + /* t = (r + s) % n, t == 0 */ + mpi_addm (t, sig_r, sig_s, ec->n); + if (mpi_cmp_ui (t, 0) == 0) + { + rc = GPG_ERR_BAD_SIGNATURE; + goto leave_main; + } + + /* sG + tP = (x1, y1) */ + _gcry_mpi_ec_mul_point (&sG, sig_s, ec->G, ec); + _gcry_mpi_ec_mul_point (&tP, t, ec->Q, ec); + _gcry_mpi_ec_add_points (&sG, &sG, &tP, ec); + if (_gcry_mpi_ec_get_affine (x1, y1, &sG, ec)) + { + rc = GPG_ERR_INV_DATA; + goto leave_main; + } + + /* R = (e + x1) % n */ + mpi_addm (t, hash, x1, ec->n); + + /* check R == r */ + if (mpi_cmp (t, sig_r)) + rc = GPG_ERR_BAD_SIGNATURE; + else + rc = 0; + + leave_main: + point_free (&sG); + point_free (&tP); + mpi_free (x1); + mpi_free (y1); + mpi_free (t); + } + + leave: + _gcry_mpi_ec_free (ec); + sexp_release (l1); + if (hash != data) + mpi_free (hash); + mpi_free (data); + _gcry_pk_util_free_encoding_ctx (&ctx); + if (DBG_CIPHER) + log_debug ("sm2_verify => %s\n", rc ? 
gpg_strerror (rc) : "Good"); + return rc; +} + + +static const char * +selftest_genkey (gcry_sexp_t *pkey, gcry_sexp_t *skey) +{ + const char *errtxt; + gpg_err_code_t err; + gcry_sexp_t key_spec = NULL; + gcry_sexp_t key = NULL; + gcry_sexp_t pub_key = NULL; + gcry_sexp_t sec_key = NULL; + static const char genkey[] = "(genkey (sm2 (curve sm2p256v1)))"; + unsigned char keygrip[20]; + + errtxt = "build key spec failed"; + err = sexp_sscan (&key_spec, NULL, genkey, strlen(genkey)); + if (err) + goto leave; + + errtxt = "genkey failed"; + err = _gcry_pk_genkey (&key, key_spec); + if (err) + goto leave; + + errtxt = "encrypt signature validity failed"; + pub_key = _gcry_sexp_find_token (key, "public-key", 0); + if (!pub_key) + goto leave; + sec_key = _gcry_sexp_find_token (key, "private-key", 0); + if (!sec_key) + goto leave; + + errtxt = "testkey failed"; + err = _gcry_pk_testkey (sec_key); + if (err) + goto leave; + + errtxt = "get keygrip failed"; + if (!_gcry_pk_get_keygrip (pub_key, keygrip)) + goto leave; + + *pkey = pub_key; + *skey = sec_key; + + sexp_release (key_spec); + sexp_release (key); + return NULL; + + leave: + sexp_release (key_spec); + sexp_release (key); + sexp_release (pub_key); + sexp_release (sec_key); + return errtxt; +} + + +#define SM2TEST_CURVE \ + "(p #8542D69E4C044F18E8B92435BF6FF7DE457283915C45517D722EDB8B08F1DFC3#)" \ + "(a #787968B4FA32C3FD2417842E73BBFEFF2F3C848B6831D7E0EC65228B3937E498#)" \ + "(b #63E4C6D3B23B0C849CF84241484BFE48F61D59A5B16BA06E6E12D1DA27C5249A#)" \ + "(g #04" \ + " 421DEBD61B62EAB6746434EBC3CC315E32220B3BADD50BDC4C4E6C147FEDD43D" \ + " 0680512BCBB42C07D47349D2153B70C4E5D7FDFCBFA36EA1A85841B9E46E09A2#)" \ + "(n #8542D69E4C044F18E8B92435BF6FF7DD297720630485628D5AE74EE7C32E79B7#)" \ + "(h #0000000000000000000000000000000000000000000000000000000000000001#)" + + +static const char * +selftest_encrypt (void) +{ +#define SM2TEST_PUBLIC_KEY \ + "(q #04" \ + " 435B39CCA8F3B508C1488AFC67BE491A0F7BA07E581A0E4849A5CF70628A7E0A" 
\ + " 75DDBA78F15FEECB4C7895E2C1CDF5FE01DEBB2CDBADF45399CCF77BBA076A42#)" + + static const char secret_key[] = + "(private-key" + " (sm2" + SM2TEST_CURVE + SM2TEST_PUBLIC_KEY + " (d #1649AB77A00637BD5E2EFE283FBF353534AA7F7CB89463F208DDBC2920BB0DA0#)" + "))"; + static const char public_key[] = + "(public-key" + " (sm2" + SM2TEST_CURVE + SM2TEST_PUBLIC_KEY + "))"; +#undef SM2TEST_PUBLIC_KEY + + static const char plain_text[] = "encryption standard"; + static const char plain_fmt[] = + "(data\n" + " (flags raw)\n" + " (hash-algo %s)\n" + " (value %m)\n" + ")"; + + const char *errtxt = NULL; + gpg_err_code_t err; + gcry_sexp_t skey = NULL; + gcry_sexp_t pkey = NULL; + gcry_mpi_t m = NULL; + gcry_mpi_t calculated_m = NULL; + gcry_sexp_t plain = NULL; + gcry_sexp_t cipher = NULL; + gcry_sexp_t result = NULL; + gcry_sexp_t l1 = NULL; + gcry_sexp_t l2 = NULL; + gcry_sexp_t a = NULL; + gcry_sexp_t b = NULL; + gcry_sexp_t c = NULL; + gcry_sexp_t value = NULL; + unsigned int inlen; + int cmp; + + errtxt = "build key failed"; + err = sexp_sscan (&skey, NULL, secret_key, strlen(secret_key)); + if (err) + goto leave; + err = sexp_sscan (&pkey, NULL, public_key, strlen(public_key)); + if (err) + goto leave; + + inlen = strlen (plain_text); + m = mpi_new (inlen * 8); + _gcry_mpi_set_buffer (m, plain_text, inlen, 0); + + err = sexp_build (&plain, NULL, plain_fmt, "sm3", m); + if (err) + { + errtxt = "build plain data failed"; + goto leave; + } + + /* encrypt with pkey */ + err = _gcry_pk_encrypt (&cipher, plain, pkey); + if (err) + { + errtxt = "encrypt failed"; + goto leave; + } + + errtxt = "encrypt signature validity failed"; + l1 = _gcry_sexp_find_token (cipher, "enc-val", 0); + if (!l1) + goto leave; + l2 = _gcry_sexp_find_token (l1, "sm2", 0); + if (!l2) + goto leave; + a = _gcry_sexp_find_token (l1, "a", 0); + if (!a) + goto leave; + b = _gcry_sexp_find_token (l1, "b", 0); + if (!b) + goto leave; + c = _gcry_sexp_find_token (l1, "c", 0); + if (!c) + goto leave; + + /* 
decrypt with skey */ + err = _gcry_pk_decrypt (&result, cipher, skey); + if (err) + { + errtxt = "decrypt failed"; + goto leave; + } + + errtxt = "decrypt signature validity failed"; + value = _gcry_sexp_find_token (result, "value", 0); + if (!value) + goto leave; + + calculated_m = _gcry_sexp_nth_mpi (value, 1, GCRYMPI_FMT_USG); + if (!calculated_m) + goto leave; + + cmp = _gcry_mpi_cmp (m, calculated_m); + if (cmp) + { + errtxt = "mismatch decrypt data"; + goto leave; + } + + errtxt = NULL; + + leave: + sexp_release (result); + sexp_release (l1); + sexp_release (l2); + sexp_release (a); + sexp_release (b); + sexp_release (c); + sexp_release (value); + sexp_release (cipher); + sexp_release (plain); + sexp_release (skey); + sexp_release (pkey); + mpi_free (m); + mpi_free (calculated_m); + return errtxt; +} + + +static const char * +selftest_sign (void) +{ +#define SM2TEST_PUBLIC_KEY \ + "(q #04" \ + " 0AE4C7798AA0F119471BEE11825BE46202BB79E2A5844495E97C04FF4DF2548A" \ + " 7C0240F88F1CD4E16352A73C17B7F16F07353E53A176D684A9FE0C6BB798E857#)" + + static const char secret_key[] = + "(private-key" + " (sm2" + SM2TEST_CURVE + SM2TEST_PUBLIC_KEY + " (d #128B2FA8BD433C6C068C8D803DFF79792A519A55171B1B650C23661D15897263#)" + "))"; + static const char public_key[] = + "(public-key" + " (sm2" + SM2TEST_CURVE + SM2TEST_PUBLIC_KEY + "))"; +#undef SM2TEST_PUBLIC_KEY + + static const char sample_data[] = + "(data (flags raw)" + " (hash sm3" + " #B524F552CD82B8B028476E005C377FB19A87E6FC682D48BB5D42E3D9B9EFFE76#))"; + static const char sample_data_bad[] = + "(data (flags raw)" + " (hash sm3" + " #cd85698fecab7843e09bcde2289096872345bcbcdaa8870bbef23d8a110bcd9f#))"; + static const char signature_r[] = + "40F1EC59F793D9F49E09DCEF49130D4194F79FB1EED2CAA55BACDB49C4E755D1"; + static const char signature_s[] = + "6FC6DAC32C5D5CF10C77DFB20F7C2EB667A457872FB09EC56327A67EC7DEEBE7"; + + const char *errtxt = NULL; + gcry_error_t err; + gcry_sexp_t skey = NULL; + gcry_sexp_t pkey = NULL; + 
gcry_sexp_t data = NULL; + gcry_sexp_t data_bad = NULL; + gcry_sexp_t sig = NULL; + gcry_sexp_t l1 = NULL; + gcry_sexp_t l2 = NULL; + gcry_sexp_t lr = NULL; + gcry_sexp_t ls = NULL; + gcry_mpi_t r = NULL; + gcry_mpi_t s = NULL; + + errtxt = "build key failed"; + err = sexp_sscan (&skey, NULL, secret_key, strlen (secret_key)); + if (err) + goto leave; + err = sexp_sscan (&pkey, NULL, public_key, strlen (public_key)); + if (err) + goto leave; + + errtxt = "build data failed"; + err = sexp_sscan (&data, NULL, sample_data, strlen (sample_data)); + if (err) + goto leave; + err = sexp_sscan (&data_bad, NULL, sample_data_bad, strlen (sample_data_bad)); + if (err) + goto leave; + /* TODO: r and s are only valid for fixed k in sm2test */ + err = _gcry_mpi_scan (&r, GCRYMPI_FMT_HEX, signature_r, 0, NULL); + if (err) + goto leave; + err = _gcry_mpi_scan (&s, GCRYMPI_FMT_HEX, signature_s, 0, NULL); + if (err) + goto leave; + + /* sign with skey */ + errtxt = "signing failed"; + err = _gcry_pk_sign (&sig, data, skey); + if (err) + goto leave; + + /* check against known signature */ + errtxt = "signature validity failed"; + l1 = _gcry_sexp_find_token (sig, "sig-val", 0); + if (!l1) + goto leave; + l2 = _gcry_sexp_find_token (l1, "sm2", 0); + if (!l2) + goto leave; + lr = _gcry_sexp_find_token (l2, "r", 0); + if (!lr) + goto leave; + ls = _gcry_sexp_find_token (l2, "s", 0); + if (!ls) + goto leave; + + /* verify with pkey */ + errtxt = "verify failed"; + err = _gcry_pk_verify (sig, data, pkey); + if (err) + goto leave; + + errtxt = "bad signature not detected"; + err = _gcry_pk_verify (sig, data_bad, pkey); + if (gcry_err_code (err) != GPG_ERR_BAD_SIGNATURE) + goto leave; + + errtxt = NULL; + + leave: + sexp_release (skey); + sexp_release (pkey); + sexp_release (data); + sexp_release (data_bad); + sexp_release (sig); + sexp_release (l1); + sexp_release (l2); + sexp_release (lr); + sexp_release (ls); + mpi_free (r); + mpi_free (s); + return errtxt; +} + +#undef SM2TEST_CURVE + + 
+static gpg_err_code_t +run_selftests (int algo, int extended, selftest_report_func_t report) +{ + const char *what; + const char *errtxt; + gcry_sexp_t pkey = NULL; + gcry_sexp_t skey = NULL; + + (void)extended; + + if (algo != GCRY_PK_SM2) + return GPG_ERR_PUBKEY_ALGO; + + what = "genkey"; + errtxt = selftest_genkey (&pkey, &skey); + if (errtxt) + goto failed; + + what = "encrypt"; + errtxt = selftest_encrypt (); + if (errtxt) + goto failed; + + what = "sign"; + errtxt = selftest_sign (); + if (errtxt) + goto failed; + + return 0; + + failed: + sexp_release (pkey); + sexp_release (skey); + if (report) + report ("pubkey", GCRY_PK_SM2, what, errtxt); + return GPG_ERR_SELFTEST_FAILED; +} + + + + +gcry_pk_spec_t _gcry_pubkey_spec_sm2 = + { + GCRY_PK_SM2, { 0, 1 }, + (GCRY_PK_USAGE_SIGN | GCRY_PK_USAGE_ENCR), + "SM2", sm2_names, + "pabgnhq", "pabgnhqd", "abc", "rs", "pabgnhq", + sm2_generate, + _gcry_ecc_check_secret_key, + sm2_encrypt, + sm2_decrypt, + sm2_sign, + sm2_verify, + _gcry_ecc_get_nbits, + run_selftests, + _gcry_ecc_compute_keygrip, + _gcry_ecc_get_curve, + _gcry_ecc_get_param_sexp + }; diff --git a/configure.ac b/configure.ac index 4d4fb49a..893ea5d3 100644 --- a/configure.ac +++ b/configure.ac @@ -209,7 +209,7 @@ available_ciphers="$available_ciphers camellia idea salsa20 gost28147 chacha20" enabled_ciphers="" # Definitions for public-key ciphers. -available_pubkey_ciphers="dsa elgamal rsa ecc" +available_pubkey_ciphers="dsa elgamal rsa ecc sm2" enabled_pubkey_ciphers="" # Definitions for message digests. 
@@ -2550,6 +2550,12 @@ if test "$found" = "1" ; then AC_DEFINE(USE_ECC, 1, [Defined if this module should be included]) fi +LIST_MEMBER(sm2, $enabled_pubkey_ciphers) +if test "$found" = "1" ; then + GCRYPT_PUBKEY_CIPHERS="$GCRYPT_PUBKEY_CIPHERS sm2.lo" + AC_DEFINE(USE_SM2, 1, [Defined if this module should be included]) +fi + LIST_MEMBER(crc, $enabled_digests) if test "$found" = "1" ; then GCRYPT_DIGESTS="$GCRYPT_DIGESTS crc.lo" diff --git a/src/cipher.h b/src/cipher.h index 5aac19f1..fc64f440 100644 --- a/src/cipher.h +++ b/src/cipher.h @@ -346,6 +346,7 @@ extern gcry_pk_spec_t _gcry_pubkey_spec_elg; extern gcry_pk_spec_t _gcry_pubkey_spec_elg_e; extern gcry_pk_spec_t _gcry_pubkey_spec_dsa; extern gcry_pk_spec_t _gcry_pubkey_spec_ecc; +extern gcry_pk_spec_t _gcry_pubkey_spec_sm2; #endif /*G10_CIPHER_H*/ diff --git a/src/fips.c b/src/fips.c index 1ac7f477..63b98ad0 100644 --- a/src/fips.c +++ b/src/fips.c @@ -536,6 +536,7 @@ run_pubkey_selftests (int extended) GCRY_PK_RSA, GCRY_PK_DSA, GCRY_PK_ECC, + GCRY_PK_SM2, 0 }; int idx; diff --git a/src/gcrypt.h.in b/src/gcrypt.h.in index c008f0a6..3b46c904 100644 --- a/src/gcrypt.h.in +++ b/src/gcrypt.h.in @@ -1117,7 +1117,8 @@ enum gcry_pk_algos GCRY_PK_ELG = 20, /* Elgamal */ GCRY_PK_ECDSA = 301, /* (only for external use). */ GCRY_PK_ECDH = 302, /* (only for external use). */ - GCRY_PK_EDDSA = 303 /* (only for external use). */ + GCRY_PK_EDDSA = 303, /* (only for external use). */ + GCRY_PK_SM2 = 304 /* sm2 TODO: 304 */ }; /* Flags describing usage capabilities of a PK algorithm. */ diff --git a/tests/basic.c b/tests/basic.c index 8337bcfb..9a5318e5 100644 --- a/tests/basic.c +++ b/tests/basic.c @@ -12637,6 +12637,77 @@ check_pubkey_sign_ecdsa (int n, gcry_sexp_t skey, gcry_sexp_t pkey) } +/* Test the public key sign function using the private key SKEY. PKEY + is used for verification. This variant is only used for SM2. 
*/ +static void +check_pubkey_sign_sm2 (int n, gcry_sexp_t skey, gcry_sexp_t pkey) +{ + gcry_error_t rc; + gcry_sexp_t sig, badhash, hash; + int dataidx; + static const char baddata[] = + "(data\n (flags raw)\n" + " (hash sm3 #11223344556677889900AABBCCDDEEFF10203041#))\n"; + static const struct + { + const char *data; + int expected_rc; + } datas[] = + { + { "(data\n (flags raw)\n" + " (hash sm3 #11223344556677889900AABBCCDDEEFF10203040#))\n", + 0 }, + { "(data\n (flags raw)\n" + " (hash sm3 #B524F552CD82B8B028476E005C377FB1" + "9A87E6FC682D48BB5D42E3D9B9EFFE76#))\n", + 0 }, + { "(data\n (flags oaep)\n" + " (hash sm3 #11223344556677889900AABBCCDDEEFF10203040#))\n", + GPG_ERR_CONFLICT }, + { "(data\n (flags )\n" + " (hash sm3 #11223344556677889900AABBCCDDEEFF10203040#))\n", + GPG_ERR_CONFLICT }, + { "(data\n (flags )\n" " (value #11223344556677889900AA#))\n", + 0 }, + { "(data\n (flags )\n" " (value #0090223344556677889900AA#))\n", + 0 }, + { "(data\n (flags raw)\n" " (value #11223344556677889900AA#))\n", + 0 }, + { NULL } + }; + + rc = gcry_sexp_sscan (&badhash, NULL, baddata, strlen (baddata)); + if (rc) + die ("converting data failed: %s\n", gpg_strerror (rc)); + + for (dataidx = 0; datas[dataidx].data; dataidx++) + { + if (verbose) + fprintf (stderr, " test %d, signature test %d (SM2)\n", + n, dataidx); + + rc = gcry_sexp_sscan (&hash, NULL, datas[dataidx].data, + strlen (datas[dataidx].data)); + if (rc) + die ("converting data failed: %s\n", gpg_strerror (rc)); + + rc = gcry_pk_sign (&sig, hash, skey); + if (gcry_err_code (rc) != datas[dataidx].expected_rc) + fail ("gcry_pk_sign failed: %s\n", gpg_strerror (rc)); + + if (!rc) + verify_one_signature (pkey, hash, badhash, sig); + + gcry_sexp_release (sig); + sig = NULL; + gcry_sexp_release (hash); + hash = NULL; + } + + gcry_sexp_release (badhash); +} + + static void check_pubkey_crypt (int n, gcry_sexp_t skey, gcry_sexp_t pkey, int algo) { @@ -12751,6 +12822,13 @@ check_pubkey_crypt (int n, gcry_sexp_t skey, 
gcry_sexp_t pkey, int algo) NULL, 0, GPG_ERR_CONFLICT }, + { GCRY_PK_SM2, + "(data\n (flags raw)\n (hash-algo sm3)\n" + " (value \"encryption standard\"))\n", + NULL, + 1, + 0, + 0 }, { 0, NULL } }; @@ -12904,6 +12982,8 @@ do_check_one_pubkey (int n, gcry_sexp_t skey, gcry_sexp_t pkey, { if (algo == GCRY_PK_ECDSA) check_pubkey_sign_ecdsa (n, skey, pkey); + else if (algo == GCRY_PK_SM2) + check_pubkey_sign_sm2 (n, skey, pkey); else check_pubkey_sign (n, skey, pkey, algo); } @@ -12936,16 +13016,23 @@ check_one_pubkey (int n, test_spec_pubkey_t spec) } static void -get_keys_new (gcry_sexp_t *pkey, gcry_sexp_t *skey) +get_keys_new (gcry_sexp_t *pkey, gcry_sexp_t *skey, int algo) { gcry_sexp_t key_spec, key, pub_key, sec_key; int rc; if (verbose) fprintf (stderr, " generating RSA key:"); - rc = gcry_sexp_new (&key_spec, - in_fips_mode ? "(genkey (rsa (nbits 4:2048)))" - : "(genkey (rsa (nbits 4:1024)(transient-key)))", - 0, 1); + if (algo == GCRY_PK_RSA) + rc = gcry_sexp_new (&key_spec, + in_fips_mode ? "(genkey (rsa (nbits 4:2048)))" + : "(genkey (rsa (nbits 4:1024)(transient-key)))", + 0, 1); + else if (algo == GCRY_PK_SM2) + rc = gcry_sexp_new (&key_spec, + "(genkey (sm2 (curve sm2p256v1)))", + 0, 1); + else + return; if (rc) die ("error creating S-expression: %s\n", gpg_strerror (rc)); rc = gcry_pk_genkey (&key, key_spec); @@ -12971,11 +13058,19 @@ check_one_pubkey_new (int n) { gcry_sexp_t skey, pkey; - get_keys_new (&pkey, &skey); + /* rsa */ + get_keys_new (&pkey, &skey, GCRY_PK_RSA); do_check_one_pubkey (n, skey, pkey, NULL, GCRY_PK_RSA, FLAG_SIGN | FLAG_CRYPT); gcry_sexp_release (pkey); gcry_sexp_release (skey); + + /* sm2 */ + get_keys_new (&pkey, &skey, GCRY_PK_SM2); + do_check_one_pubkey (n, skey, pkey, NULL, + GCRY_PK_SM2, FLAG_SIGN | FLAG_CRYPT); + gcry_sexp_release (pkey); + gcry_sexp_release (skey); } /* Run all tests for the public key functions. 
*/ @@ -13272,6 +13367,29 @@ check_pubkey (void) "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" "\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00" } + }, + { /* sm2test */ + GCRY_PK_SM2, FLAG_CRYPT | FLAG_SIGN, + { + "(private-key\n" + " (sm2\n" + " (curve sm2p256v1)\n" + " (q #04" + " 8759389A34AAAD07ECF4E0C8C2650A4459C8D926EE2378324E0261C52538CB47" + " 7528106B1E0B7C8DD5FF29A9C86A89065656EB33154BC0556091EF8AC9D17D78#)" + " (d #41EBDBA9C98CBECCE7249CF18BFD427FF8EA0B2FAB7B9D305D9D9BF4DB6ADFC2#)" + "))", + + "(public-key\n" + " (sm2\n" + " (curve sm2p256v1)\n" + " (q #04" + " 8759389A34AAAD07ECF4E0C8C2650A4459C8D926EE2378324E0261C52538CB47" + " 7528106B1E0B7C8DD5FF29A9C86A89065656EB33154BC0556091EF8AC9D17D78#)" + "))", + + "\xcb\x30\xc9\x71\x54\x05\xde\x05\x20\x7f" + "\xa0\x5b\xce\xb9\x0f\x9d\x03\x17\xeb\x73"} } }; int i; diff --git a/tests/curves.c b/tests/curves.c index ff244bd1..0dfa2acb 100644 --- a/tests/curves.c +++ b/tests/curves.c @@ -33,7 +33,7 @@ #include "t-common.h" /* Number of curves defined in ../cipger/ecc-curves.c */ -#define N_CURVES 25 +#define N_CURVES 26 /* A real world sample public key. */ static char const sample_key_1[] =
-- 
2.17.1

From tianjia.zhang at linux.alibaba.com Sun Dec 22 10:20:09 2019
From: tianjia.zhang at linux.alibaba.com (Tianjia Zhang)
Date: Sun, 22 Dec 2019 17:20:09 +0800
Subject: [PATCH] Add crypto pubkey SM2
Message-ID: <20191222092011.2758-1-tianjia.zhang@linux.alibaba.com>

This new module implements the SM2 public key algorithm. It was
published by State Encryption Management Bureau, China.

List of specifications for SM2 elliptic curve public key cryptography:

* GM/T 0003.1-2012
* GM/T 0003.2-2012
* GM/T 0003.3-2012
* GM/T 0003.4-2012
* GM/T 0003.5-2012

IETF: https://tools.ietf.org/html/draft-shen-sm2-ecdsa-02
scctc: http://www.gmbz.org.cn/main/bzlb.html

* cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add sm2.c.
* cipher/ecc-curves.c (domain_parms): Add sm2p256v1 for SM2.
* cipher/pubkey.c [USE_SM2] (pubkey_list): Add _gcry_pubkey_spec_sm2.
* cipher/sm2.c: New.
* configure.ac (available_pubkey_ciphers): Add sm2.
* src/cipher.h: Add declarations for SM2.
* src/fips.c (algos): Add GCRY_PK_SM2.
* src/gcrypt.h.in (gcry_pk_algos): Add algorithm ID for SM2.
* tests/basic.c (check_pubkey): Add test cases for SM2.
* tests/curves.c (N_CURVES): Update N_CURVES for SM2.

Signed-off-by: Tianjia Zhang tianjia.zhang at linux.alibaba.com

From jussi.kivilinna at iki.fi Sun Dec 22 12:36:13 2019
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Sun, 22 Dec 2019 13:36:13 +0200
Subject: Disable Weak cipher check for DES KCV
In-Reply-To: 
References: 
Message-ID: <4e286efc-6eb0-7f8c-b1d3-36c09f5891fd@iki.fi>

Hello,

On 21.12.2019 3.40, Jan Bilek wrote:
> Hi,
>
> We have a problem here where I need to encrypt a block of data with zeros.
>
>   gcry_check_version (NULL);
>   unsigned char key[] = {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
>   unsigned char out[8];
>   unsigned char data[8];
>   gcry_error_t err = 0;
>   gcry_cipher_hd_t hd = nullptr;
>   err = gcry_cipher_open(&hd, GCRY_CIPHER_DES, GCRY_CIPHER_MODE_ECB, 0);
>   //auto blklen = gcry_cipher_get_algo_blklen(GCRY_CIPHER_DES);
>   //auto algolen = gcry_cipher_get_algo_keylen (GCRY_CIPHER_DES);
>   err = gcry_cipher_setkey (hd, key, sizeof(key));
>   std::cerr << "gpg_err_code: " << gpg_err_code(err) << std::endl;
>   std::cerr << "gpg_strerror: " << gpg_strerror(err) << std::endl;
>   gcry_cipher_encrypt(hd, out, sizeof(key), data, 8);
>   if (err) {
>     std::cerr << "Failed to perform cryptography" << std::endl;
>     std::cerr << "  cipher:     " << static_cast(GCRY_CIPHER_DES) << std::endl;
>     std::cerr << "  mode:       " << static_cast(GCRY_CIPHER_MODE_ECB) << std::endl;
>     //std::cerr << "  keyBlock:   " << BinToHex(key) << std::endl;
>     //std::cerr << "  out:        " << BinToHex(out) << std::endl;
>     //std::cerr << "  data:       
" << BinToHex(encryptedData) << std::endl;
>   }
>
>
> This blows on:
>
> gpg_err_code: 43
> gpg_strerror: Weak encryption key
> cipher_encrypt: key not set
>
> Tracked back in the source to libgcrypt / cipher / des.c
>
> r. 1384 do_des_setkey
> r. 1021 is_weak_key
>
>   if (is_weak_key (key)) {
>     _gcry_burn_stack (64);
>     return GPG_ERR_WEAK_KEY;
>   }
>
> cipher.c
> r.797
>
>   rc = c->spec->setkey (&c->context.c, key, keylen, c);
>   if (!rc) {
>
>   } else
>     c->marks.key = 0;
>
> ... then disallows weak key setting completely, resulting in a failure.
>
> This has quite an impact on multiple (still) in-use KCV operations (e.g. KCV_METHOD_VISA) where key needs to be encrypted with a zero key to get its KCV.

I tried to find a KCV specification where a zero key is used to encrypt
the actual key as the input block for the KCV value, but all KCV
algorithms I managed to find encrypt a zero input block with the actual
key as the key. Can you check your KCV documentation to see whether a
zero key is really used, and give us a pointer/link to that spec?

-Jussi

>
> May I propose a patch? (See in attachment).
>
> Thanks & Cheers,
> Jan
>
> _______________________________________________
> Gcrypt-devel mailing list
> Gcrypt-devel at gnupg.org
> http://lists.gnupg.org/mailman/listinfo/gcrypt-devel
>

From jussi.kivilinna at iki.fi Mon Dec 23 02:43:49 2019
From: jussi.kivilinna at iki.fi (Jussi Kivilinna)
Date: Mon, 23 Dec 2019 03:43:49 +0200
Subject: [PATCH 1/2] cipher: fix typo in error log
Message-ID: <157706542983.32231.14731194519438329064.stgit@localhost6.localdomain6>

* cipher/cipher.c (_gcry_cipher_encrypt): Fix log "cipher_decrypt: ..."
to "cipher_encrypt: ...".
-- Signed-off-by: Jussi Kivilinna --- 0 files changed diff --git a/cipher/cipher.c b/cipher/cipher.c index ab3e4240e..bd571367c 100644 --- a/cipher/cipher.c +++ b/cipher/cipher.c @@ -1125,7 +1125,7 @@ _gcry_cipher_encrypt (gcry_cipher_hd_t h, void *out, size_t outsize, if (h->mode != GCRY_CIPHER_MODE_NONE && !h->marks.key) { - log_error ("cipher_decrypt: key not set\n"); + log_error ("cipher_encrypt: key not set\n"); return GPG_ERR_MISSING_KEY; } From jussi.kivilinna at iki.fi Mon Dec 23 02:43:55 2019 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 23 Dec 2019 03:43:55 +0200 Subject: [PATCH 2/2] rijndael-ppc: fix bad register used for vector load/store assembly In-Reply-To: <157706542983.32231.14731194519438329064.stgit@localhost6.localdomain6> References: <157706542983.32231.14731194519438329064.stgit@localhost6.localdomain6> Message-ID: <157706543500.32231.4368240945643099416.stgit@localhost6.localdomain6> * cipher/rijndael-ppc.c (vec_aligned_ld, vec_load_be, vec_aligned_st) (vec_store_be): Add "r0" to clobber list for load/store instructions. -- Register r0 must not be used for RA input for vector load/store instructions as r0 is not read as register but as value '0'. 
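The r0 rule described above can be illustrated with a small plain-C model of how indexed load/store instructions such as lvx form their effective address, EA = (RA|0) + (RB). This is only a sketch under stated assumptions: `gpr`, `effective_address` and `model_selfcheck` are hypothetical names made up for illustration, and nothing here is part of libgcrypt or of the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the 32 general-purpose registers.  */
uint64_t gpr[32];

/* Model of the indexed address computation EA = (RA|0) + (RB):
   when the RA field names register 0, the constant 0 is used as
   the base instead of the contents of gpr[0].  */
uint64_t
effective_address (unsigned int ra, unsigned int rb)
{
  uint64_t base = (ra == 0) ? 0 : gpr[ra];
  return base + gpr[rb];
}

/* Shows why an operand that the compiler happens to place in r0
   silently changes the address when it lands in the RA slot.  */
void
model_selfcheck (void)
{
  gpr[0] = 0x1000;  /* operand allocated to r0 */
  gpr[3] = 0x1000;  /* same operand allocated to r3 instead */
  gpr[4] = 0x20;    /* second address operand */

  assert (effective_address (3, 4) == 0x1020);  /* both operands used */
  assert (effective_address (0, 4) == 0x20);    /* r0 contents ignored */
}
```

In this model the two calls differ only in which register holds the first operand, yet the resulting addresses differ. Listing "r0" as clobbered, as the patch does, is one way to keep the compiler from choosing r0 for those "r" operands.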
Signed-off-by: Jussi Kivilinna --- 0 files changed diff --git a/cipher/rijndael-ppc.c b/cipher/rijndael-ppc.c index 7c349f8b0..48a47eddb 100644 --- a/cipher/rijndael-ppc.c +++ b/cipher/rijndael-ppc.c @@ -138,7 +138,7 @@ vec_aligned_ld(unsigned long offset, const unsigned char *ptr) __asm__ ("lvx %0,%1,%2\n\t" : "=v" (vec) : "r" (offset), "r" ((uintptr_t)ptr) - : "memory"); + : "memory", "r0"); return vec; #else return vec_vsx_ld (offset, ptr); @@ -169,7 +169,7 @@ vec_load_be(unsigned long offset, const unsigned char *ptr, __asm__ ("lxvw4x %x0,%1,%2\n\t" : "=wa" (vec) : "r" (offset), "r" ((uintptr_t)ptr) - : "memory"); + : "memory", "r0"); __asm__ ("vperm %0,%1,%1,%2\n\t" : "=v" (vec) : "v" (vec), "v" (be_bswap_const)); @@ -188,7 +188,7 @@ vec_aligned_st(block vec, unsigned long offset, unsigned char *ptr) __asm__ ("stvx %0,%1,%2\n\t" : : "v" (vec), "r" (offset), "r" ((uintptr_t)ptr) - : "memory"); + : "memory", "r0"); #else vec_vsx_st (vec, offset, ptr); #endif @@ -208,7 +208,7 @@ vec_store_be(block vec, unsigned long offset, unsigned char *ptr, __asm__ ("stxvw4x %x0,%1,%2\n\t" : : "wa" (vec), "r" (offset), "r" ((uintptr_t)ptr) - : "memory"); + : "memory", "r0"); #else (void)be_bswap_const; vec_vsx_st (vec, offset, ptr); From jussi.kivilinna at iki.fi Mon Dec 23 13:11:05 2019 From: jussi.kivilinna at iki.fi (Jussi Kivilinna) Date: Mon, 23 Dec 2019 14:11:05 +0200 Subject: [PATCH] rijndael-ppc: performance improvements Message-ID: <157710306580.27585.18342045124754428689.stgit@localhost6.localdomain6> * cipher/rijndael-ppc.c (ALIGNED_LOAD, ALIGNED_STORE, VEC_LOAD_BE) (VEC_STORE_BE): Rewrite. (VEC_BE_SWAP, VEC_LOAD_BE_NOSWAP, VEC_STORE_BE_NOSWAP): New. (PRELOAD_ROUND_KEYS, AES_ENCRYPT, AES_DECRYPT): Adjust to new input parameters for vector load macros. (ROUND_KEY_VARIABLES_ALL, PRELOAD_ROUND_KEYS_ALL) (AES_ENCRYPT_ALL): New. (vec_bswap32_const_neg): New. (vec_aligned_ld, vec_aligned_st, vec_load_be_const): Rename to... 
(asm_aligned_ld, asm_aligned_st, asm_load_be_const): ...these.
(asm_be_swap, asm_vperm1, asm_load_be_noswap)
(asm_store_be_noswap): New.
(vec_add_uint128): Rename to...
(asm_add_uint128): ...this.
(asm_xor, asm_cipher_be, asm_cipherlast_be, asm_ncipher_be)
(asm_ncipherlast_be): New inline assembly functions with volatile
keyword to allow manual instruction ordering.
(_gcry_aes_ppc8_setkey, aes_ppc8_prepare_decryption)
(_gcry_aes_ppc8_encrypt, _gcry_aes_ppc8_decrypt)
(_gcry_aes_ppc8_cfb_enc, _gcry_aes_ppc8_cbc_enc)
(_gcry_aes_ppc8_ocb_auth): Update to use new&rewritten helper macros.
(_gcry_aes_ppc8_cfb_dec, _gcry_aes_ppc8_cbc_dec)
(_gcry_aes_ppc8_ctr_enc, _gcry_aes_ppc8_ocb_crypt)
(_gcry_aes_ppc8_xts_crypt): Update to use new&rewritten helper
macros; Tune 8-block parallel paths with manual instruction ordering.
--

Benchmarks on POWER8 (ppc64le, ~3.8GHz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC enc |      1.06 ns/B     902.2 MiB/s      4.02 c/B
        CBC dec |     0.208 ns/B      4585 MiB/s     0.790 c/B
        CFB enc |      1.06 ns/B     900.4 MiB/s      4.02 c/B
        CFB dec |     0.208 ns/B      4588 MiB/s     0.790 c/B
        CTR enc |     0.238 ns/B      4007 MiB/s     0.904 c/B
        CTR dec |     0.238 ns/B      4009 MiB/s     0.904 c/B
        XTS enc |     0.492 ns/B      1937 MiB/s      1.87 c/B
        XTS dec |     0.488 ns/B      1955 MiB/s      1.85 c/B
        OCB enc |     0.243 ns/B      3928 MiB/s     0.922 c/B
        OCB dec |     0.247 ns/B      3858 MiB/s     0.939 c/B
       OCB auth |     0.213 ns/B      4482 MiB/s     0.809 c/B

After (cbc-dec & cfb-dec & xts & ocb ~6% faster, ctr ~11% faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC enc |      1.06 ns/B     902.1 MiB/s      4.02 c/B
        CBC dec |     0.196 ns/B      4877 MiB/s     0.743 c/B
        CFB enc |      1.06 ns/B     902.2 MiB/s      4.02 c/B
        CFB dec |     0.195 ns/B      4889 MiB/s     0.741 c/B
        CTR enc |     0.214 ns/B      4448 MiB/s     0.815 c/B
        CTR dec |     0.214 ns/B      4452 MiB/s     0.814 c/B
        XTS enc |     0.461 ns/B      2067 MiB/s      1.75 c/B
        XTS dec |     0.456 ns/B      2092 MiB/s      1.73 c/B
        OCB enc |     0.227 ns/B      4200 MiB/s     0.863 c/B
        OCB dec |     0.234 ns/B      4072 MiB/s     0.890 c/B
       OCB auth |     0.207 ns/B      4604 MiB/s     0.787 c/B

Benchmarks on POWER9 (ppc64le, ~3.8GHz):

Before:
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC enc |      1.04 ns/B     918.7 MiB/s      3.94 c/B
        CBC dec |     0.240 ns/B      3982 MiB/s     0.910 c/B
        CFB enc |      1.04 ns/B     917.6 MiB/s      3.95 c/B
        CFB dec |     0.241 ns/B      3963 MiB/s     0.914 c/B
        CTR enc |     0.249 ns/B      3835 MiB/s     0.945 c/B
        CTR dec |     0.252 ns/B      3787 MiB/s     0.957 c/B
        XTS enc |     0.505 ns/B      1889 MiB/s      1.92 c/B
        XTS dec |     0.495 ns/B      1926 MiB/s      1.88 c/B
        OCB enc |     0.303 ns/B      3152 MiB/s      1.15 c/B
        OCB dec |     0.305 ns/B      3129 MiB/s      1.16 c/B
       OCB auth |     0.265 ns/B      3595 MiB/s      1.01 c/B

After (cbc-dec & cfb-dec ~6% faster, ctr ~11% faster, ocb ~4% faster):
 AES            |  nanosecs/byte   mebibytes/sec   cycles/byte
        CBC enc |      1.04 ns/B     917.3 MiB/s      3.95 c/B
        CBC dec |     0.225 ns/B      4234 MiB/s     0.856 c/B
        CFB enc |      1.04 ns/B     917.8 MiB/s      3.95 c/B
        CFB dec |     0.226 ns/B      4214 MiB/s     0.860 c/B
        CTR enc |     0.221 ns/B      4306 MiB/s     0.842 c/B
        CTR dec |     0.223 ns/B      4271 MiB/s     0.848 c/B
        XTS enc |     0.503 ns/B      1897 MiB/s      1.91 c/B
        XTS dec |     0.495 ns/B      1928 MiB/s      1.88 c/B
        OCB enc |     0.288 ns/B      3309 MiB/s      1.10 c/B
        OCB dec |     0.292 ns/B      3266 MiB/s      1.11 c/B
       OCB auth |     0.267 ns/B      3570 MiB/s      1.02 c/B

Signed-off-by: Jussi Kivilinna
---
 0 files changed

diff --git a/cipher/rijndael-ppc.c b/cipher/rijndael-ppc.c index 48a47eddb..a8bcae468 100644 --- a/cipher/rijndael-ppc.c +++ b/cipher/rijndael-ppc.c @@ -51,17 +51,27 @@ typedef union #define ASM_FUNC_ATTR_NOINLINE ASM_FUNC_ATTR NO_INLINE -#define ALIGNED_LOAD(in_ptr) \ - (vec_aligned_ld (0, (const unsigned char *)(in_ptr))) +#define ALIGNED_LOAD(in_ptr, offs) \ + (asm_aligned_ld ((offs) * 16, (const void *)(in_ptr))) -#define ALIGNED_STORE(out_ptr, vec) \ - (vec_aligned_st ((vec), 0, (unsigned char *)(out_ptr))) +#define ALIGNED_STORE(out_ptr, offs, vec) \ + (asm_aligned_st ((vec), (offs) * 16, (void *)(out_ptr))) -#define VEC_LOAD_BE(in_ptr, bige_const) \ - (vec_load_be (0, (const unsigned char *)(in_ptr), bige_const)) +#define VEC_BE_SWAP(vec, bige_const) (asm_be_swap ((vec), (bige_const))) -#define VEC_STORE_BE(out_ptr, vec, bige_const) \ - 
(vec_store_be ((vec), 0, (unsigned char *)(out_ptr), bige_const)) +#define VEC_LOAD_BE(in_ptr, offs, bige_const) \ + (asm_be_swap (asm_load_be_noswap ((offs) * 16, (const void *)(in_ptr)), \ + bige_const)) + +#define VEC_LOAD_BE_NOSWAP(in_ptr, offs) \ + (asm_load_be_noswap ((offs) * 16, (const unsigned char *)(in_ptr))) + +#define VEC_STORE_BE(out_ptr, offs, vec, bige_const) \ + (asm_store_be_noswap (asm_be_swap ((vec), (bige_const)), (offs) * 16, \ + (void *)(out_ptr))) + +#define VEC_STORE_BE_NOSWAP(out_ptr, offs, vec) \ + (asm_store_be_noswap ((vec), (offs) * 16, (void *)(out_ptr))) #define ROUND_KEY_VARIABLES \ @@ -69,166 +79,257 @@ typedef union #define PRELOAD_ROUND_KEYS(nrounds) \ do { \ - rkey0 = ALIGNED_LOAD(&rk[0]); \ - rkeylast = ALIGNED_LOAD(&rk[nrounds]); \ + rkey0 = ALIGNED_LOAD (rk, 0); \ + rkeylast = ALIGNED_LOAD (rk, nrounds); \ } while (0) - #define AES_ENCRYPT(blk, nrounds) \ do { \ blk ^= rkey0; \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[1])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[2])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[3])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[4])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[5])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[6])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[7])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[8])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[9])); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 1)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 2)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 3)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 4)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 5)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 6)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 7)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 8)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 9)); \ if (nrounds >= 12) \ { \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[10])); \ - blk = vec_cipher_be (blk, 
ALIGNED_LOAD(&rk[11])); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 10)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 11)); \ if (rounds > 12) \ { \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[12])); \ - blk = vec_cipher_be (blk, ALIGNED_LOAD(&rk[13])); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 12)); \ + blk = asm_cipher_be (blk, ALIGNED_LOAD (rk, 13)); \ } \ } \ - blk = vec_cipherlast_be (blk, rkeylast); \ + blk = asm_cipherlast_be (blk, rkeylast); \ } while (0) - #define AES_DECRYPT(blk, nrounds) \ do { \ blk ^= rkey0; \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[1])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[2])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[3])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[4])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[5])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[6])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[7])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[8])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[9])); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 1)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 2)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 3)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 4)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 5)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 6)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 7)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 8)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 9)); \ if (nrounds >= 12) \ { \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[10])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[11])); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 10)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 11)); \ if (rounds > 12) \ { \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[12])); \ - blk = vec_ncipher_be (blk, ALIGNED_LOAD(&rk[13])); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 12)); \ + blk = asm_ncipher_be (blk, ALIGNED_LOAD (rk, 
13)); \ } \ } \ - blk = vec_ncipherlast_be (blk, rkeylast); \ + blk = asm_ncipherlast_be (blk, rkeylast); \ } while (0) +#define ROUND_KEY_VARIABLES_ALL \ + block rkey0, rkey1, rkey2, rkey3, rkey4, rkey5, rkey6, rkey7, rkey8, \ + rkey9, rkey10, rkey11, rkey12, rkey13, rkeylast + +#define PRELOAD_ROUND_KEYS_ALL(nrounds) \ + do { \ + rkey0 = ALIGNED_LOAD (rk, 0); \ + rkey1 = ALIGNED_LOAD (rk, 1); \ + rkey2 = ALIGNED_LOAD (rk, 2); \ + rkey3 = ALIGNED_LOAD (rk, 3); \ + rkey4 = ALIGNED_LOAD (rk, 4); \ + rkey5 = ALIGNED_LOAD (rk, 5); \ + rkey6 = ALIGNED_LOAD (rk, 6); \ + rkey7 = ALIGNED_LOAD (rk, 7); \ + rkey8 = ALIGNED_LOAD (rk, 8); \ + rkey9 = ALIGNED_LOAD (rk, 9); \ + if (nrounds >= 12) \ + { \ + rkey10 = ALIGNED_LOAD (rk, 10); \ + rkey11 = ALIGNED_LOAD (rk, 11); \ + if (rounds > 12) \ + { \ + rkey12 = ALIGNED_LOAD (rk, 12); \ + rkey13 = ALIGNED_LOAD (rk, 13); \ + } \ + } \ + rkeylast = ALIGNED_LOAD (rk, nrounds); \ + } while (0) + +#define AES_ENCRYPT_ALL(blk, nrounds) \ + do { \ + blk ^= rkey0; \ + blk = asm_cipher_be (blk, rkey1); \ + blk = asm_cipher_be (blk, rkey2); \ + blk = asm_cipher_be (blk, rkey3); \ + blk = asm_cipher_be (blk, rkey4); \ + blk = asm_cipher_be (blk, rkey5); \ + blk = asm_cipher_be (blk, rkey6); \ + blk = asm_cipher_be (blk, rkey7); \ + blk = asm_cipher_be (blk, rkey8); \ + blk = asm_cipher_be (blk, rkey9); \ + if (nrounds >= 12) \ + { \ + blk = asm_cipher_be (blk, rkey10); \ + blk = asm_cipher_be (blk, rkey11); \ + if (rounds > 12) \ + { \ + blk = asm_cipher_be (blk, rkey12); \ + blk = asm_cipher_be (blk, rkey13); \ + } \ + } \ + blk = asm_cipherlast_be (blk, rkeylast); \ + } while (0) + + +#ifdef WORDS_BIGENDIAN static const block vec_bswap32_const = { 3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12 }; +#else +static const block vec_bswap32_const_neg = + { ~3, ~2, ~1, ~0, ~7, ~6, ~5, ~4, ~11, ~10, ~9, ~8, ~15, ~14, ~13, ~12 }; +#endif static ASM_FUNC_ATTR_INLINE block -vec_aligned_ld(unsigned long offset, const unsigned char *ptr) 
+asm_aligned_ld(unsigned long offset, const void *ptr) { -#ifndef WORDS_BIGENDIAN block vec; - __asm__ ("lvx %0,%1,%2\n\t" - : "=v" (vec) - : "r" (offset), "r" ((uintptr_t)ptr) - : "memory", "r0"); + __asm__ volatile ("lvx %0,%1,%2\n\t" + : "=v" (vec) + : "r" (offset), "r" ((uintptr_t)ptr) + : "memory", "r0"); return vec; -#else - return vec_vsx_ld (offset, ptr); -#endif } +static ASM_FUNC_ATTR_INLINE void +asm_aligned_st(block vec, unsigned long offset, void *ptr) +{ + __asm__ volatile ("stvx %0,%1,%2\n\t" + : + : "v" (vec), "r" (offset), "r" ((uintptr_t)ptr) + : "memory", "r0"); +} static ASM_FUNC_ATTR_INLINE block -vec_load_be_const(void) +asm_load_be_const(void) { #ifndef WORDS_BIGENDIAN - return ~ALIGNED_LOAD(&vec_bswap32_const); + return ALIGNED_LOAD (&vec_bswap32_const_neg, 0); #else static const block vec_dummy = { 0 }; return vec_dummy; #endif } - static ASM_FUNC_ATTR_INLINE block -vec_load_be(unsigned long offset, const unsigned char *ptr, - block be_bswap_const) +asm_vperm1(block vec, block mask) { -#ifndef WORDS_BIGENDIAN - block vec; - /* GCC vec_vsx_ld is generating two instructions on little-endian. Use - * lxvw4x directly instead. 
*/ - __asm__ ("lxvw4x %x0,%1,%2\n\t" - : "=wa" (vec) - : "r" (offset), "r" ((uintptr_t)ptr) - : "memory", "r0"); - __asm__ ("vperm %0,%1,%1,%2\n\t" - : "=v" (vec) - : "v" (vec), "v" (be_bswap_const)); - return vec; -#else - (void)be_bswap_const; - return vec_vsx_ld (offset, ptr); -#endif + block o; + __asm__ volatile ("vperm %0,%1,%1,%2\n\t" + : "=v" (o) + : "v" (vec), "v" (mask)); + return o; } - -static ASM_FUNC_ATTR_INLINE void -vec_aligned_st(block vec, unsigned long offset, unsigned char *ptr) +static ASM_FUNC_ATTR_INLINE block +asm_be_swap(block vec, block be_bswap_const) { + (void)be_bswap_const; #ifndef WORDS_BIGENDIAN - __asm__ ("stvx %0,%1,%2\n\t" - : - : "v" (vec), "r" (offset), "r" ((uintptr_t)ptr) - : "memory", "r0"); + return asm_vperm1 (vec, be_bswap_const); #else - vec_vsx_st (vec, offset, ptr); + return vec; #endif } +static ASM_FUNC_ATTR_INLINE block +asm_load_be_noswap(unsigned long offset, const void *ptr) +{ + block vec; + __asm__ volatile ("lxvw4x %x0,%1,%2\n\t" + : "=wa" (vec) + : "r" (offset), "r" ((uintptr_t)ptr) + : "memory", "r0"); + /* NOTE: vec needs to be be-swapped using 'asm_be_swap' by caller */ + return vec; +} static ASM_FUNC_ATTR_INLINE void -vec_store_be(block vec, unsigned long offset, unsigned char *ptr, - block be_bswap_const) +asm_store_be_noswap(block vec, unsigned long offset, void *ptr) { -#ifndef WORDS_BIGENDIAN - /* GCC vec_vsx_st is generating two instructions on little-endian. Use - * stxvw4x directly instead. 
*/ - __asm__ ("vperm %0,%1,%1,%2\n\t" - : "=v" (vec) - : "v" (vec), "v" (be_bswap_const)); - __asm__ ("stxvw4x %x0,%1,%2\n\t" - : - : "wa" (vec), "r" (offset), "r" ((uintptr_t)ptr) - : "memory", "r0"); -#else - (void)be_bswap_const; - vec_vsx_st (vec, offset, ptr); -#endif + /* NOTE: vec be-swapped using 'asm_be_swap' by caller */ + __asm__ volatile ("stxvw4x %x0,%1,%2\n\t" + : + : "wa" (vec), "r" (offset), "r" ((uintptr_t)ptr) + : "memory", "r0"); } +static ASM_FUNC_ATTR_INLINE block +asm_add_uint128(block a, block b) +{ + block res; + __asm__ volatile ("vadduqm %0,%1,%2\n\t" + : "=v" (res) + : "v" (a), "v" (b)); + return res; +} static ASM_FUNC_ATTR_INLINE block -vec_add_uint128(block a, block b) +asm_xor(block a, block b) { -#if 1 block res; - /* Use assembly as GCC (v8.3) generates slow code for vec_vadduqm. */ - __asm__ ("vadduqm %0,%1,%2\n\t" - : "=v" (res) - : "v" (a), "v" (b)); + __asm__ volatile ("vxor %0,%1,%2\n\t" + : "=v" (res) + : "v" (a), "v" (b)); return res; -#else - return (block)vec_vadduqm((vector __uint128_t)a, (vector __uint128_t)b); -#endif +} + +static ASM_FUNC_ATTR_INLINE block +asm_cipher_be(block b, block rk) +{ + block o; + __asm__ volatile ("vcipher %0, %1, %2\n\t" + : "=v" (o) + : "v" (b), "v" (rk)); + return o; +} + +static ASM_FUNC_ATTR_INLINE block +asm_cipherlast_be(block b, block rk) +{ + block o; + __asm__ volatile ("vcipherlast %0, %1, %2\n\t" + : "=v" (o) + : "v" (b), "v" (rk)); + return o; +} + +static ASM_FUNC_ATTR_INLINE block +asm_ncipher_be(block b, block rk) +{ + block o; + __asm__ volatile ("vncipher %0, %1, %2\n\t" + : "=v" (o) + : "v" (b), "v" (rk)); + return o; +} + +static ASM_FUNC_ATTR_INLINE block +asm_ncipherlast_be(block b, block rk) +{ + block o; + __asm__ volatile ("vncipherlast %0, %1, %2\n\t" + : "=v" (o) + : "v" (b), "v" (rk)); + return o; } @@ -250,7 +351,7 @@ _gcry_aes_sbox4_ppc8(u32 fourbytes) void _gcry_aes_ppc8_setkey (RIJNDAEL_context *ctx, const byte *key) { - const block bige_const = 
vec_load_be_const(); + const block bige_const = asm_load_be_const(); union { PROPERLY_ALIGNED_TYPE dummy; @@ -345,11 +446,11 @@ _gcry_aes_ppc8_setkey (RIJNDAEL_context *ctx, const byte *key) for (r = 0; r <= rounds; r++) { #ifndef WORDS_BIGENDIAN - VEC_STORE_BE(&ekey[r], ALIGNED_LOAD(&ekey[r]), bige_const); + VEC_STORE_BE(ekey, r, ALIGNED_LOAD (ekey, r), bige_const); #else - block rvec = ALIGNED_LOAD(&ekey[r]); - ALIGNED_STORE(&ekey[r], - vec_perm(rvec, rvec, vec_bswap32_const)); + block rvec = ALIGNED_LOAD (ekey, r); + ALIGNED_STORE (ekey, r, + vec_perm(rvec, rvec, vec_bswap32_const)); (void)bige_const; #endif } @@ -378,7 +479,7 @@ aes_ppc8_prepare_decryption (RIJNDAEL_context *ctx) rr = rounds; for (r = 0, rr = rounds; r <= rounds; r++, rr--) { - ALIGNED_STORE(&dkey[r], ALIGNED_LOAD(&ekey[rr])); + ALIGNED_STORE (dkey, r, ALIGNED_LOAD (ekey, rr)); } } @@ -394,18 +495,18 @@ unsigned int _gcry_aes_ppc8_encrypt (const RIJNDAEL_context *ctx, unsigned char *out, const unsigned char *in) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); const u128_t *rk = (u128_t *)&ctx->keyschenc; int rounds = ctx->rounds; ROUND_KEY_VARIABLES; block b; - b = VEC_LOAD_BE (in, bige_const); + b = VEC_LOAD_BE (in, 0, bige_const); PRELOAD_ROUND_KEYS (rounds); AES_ENCRYPT (b, rounds); - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); return 0; /* does not use stack */ } @@ -415,18 +516,18 @@ unsigned int _gcry_aes_ppc8_decrypt (const RIJNDAEL_context *ctx, unsigned char *out, const unsigned char *in) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); const u128_t *rk = (u128_t *)&ctx->keyschdec; int rounds = ctx->rounds; ROUND_KEY_VARIABLES; block b; - b = VEC_LOAD_BE (in, bige_const); + b = VEC_LOAD_BE (in, 0, bige_const); PRELOAD_ROUND_KEYS (rounds); AES_DECRYPT (b, rounds); - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); return 0; /* does 
not use stack */ } @@ -436,41 +537,41 @@ void _gcry_aes_ppc8_cfb_enc (void *context, unsigned char *iv_arg, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = context; const u128_t *rk = (u128_t *)&ctx->keyschenc; const u128_t *in = (const u128_t *)inbuf_arg; u128_t *out = (u128_t *)outbuf_arg; int rounds = ctx->rounds; - ROUND_KEY_VARIABLES; + ROUND_KEY_VARIABLES_ALL; block rkeylast_orig; block iv; - iv = VEC_LOAD_BE (iv_arg, bige_const); + iv = VEC_LOAD_BE (iv_arg, 0, bige_const); - PRELOAD_ROUND_KEYS (rounds); + PRELOAD_ROUND_KEYS_ALL (rounds); rkeylast_orig = rkeylast; for (; nblocks; nblocks--) { - rkeylast = rkeylast_orig ^ VEC_LOAD_BE (in, bige_const); + rkeylast = rkeylast_orig ^ VEC_LOAD_BE (in, 0, bige_const); - AES_ENCRYPT (iv, rounds); + AES_ENCRYPT_ALL (iv, rounds); - VEC_STORE_BE (out, iv, bige_const); + VEC_STORE_BE (out, 0, iv, bige_const); out++; in++; } - VEC_STORE_BE (iv_arg, iv, bige_const); + VEC_STORE_BE (iv_arg, 0, iv, bige_const); } void _gcry_aes_ppc8_cfb_dec (void *context, unsigned char *iv_arg, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = context; const u128_t *rk = (u128_t *)&ctx->keyschenc; const u128_t *in = (const u128_t *)inbuf_arg; @@ -483,7 +584,7 @@ void _gcry_aes_ppc8_cfb_dec (void *context, unsigned char *iv_arg, block b0, b1, b2, b3, b4, b5, b6, b7; block rkey; - iv = VEC_LOAD_BE (iv_arg, bige_const); + iv = VEC_LOAD_BE (iv_arg, 0, bige_const); PRELOAD_ROUND_KEYS (rounds); rkeylast_orig = rkeylast; @@ -491,34 +592,42 @@ void _gcry_aes_ppc8_cfb_dec (void *context, unsigned char *iv_arg, for (; nblocks >= 8; nblocks -= 8) { in0 = iv; - in1 = VEC_LOAD_BE (in + 0, bige_const); - in2 = VEC_LOAD_BE (in + 1, bige_const); - in3 = VEC_LOAD_BE (in + 2, bige_const); - in4 = 
VEC_LOAD_BE (in + 3, bige_const); - in5 = VEC_LOAD_BE (in + 4, bige_const); - in6 = VEC_LOAD_BE (in + 5, bige_const); - in7 = VEC_LOAD_BE (in + 6, bige_const); - iv = VEC_LOAD_BE (in + 7, bige_const); - - b0 = rkey0 ^ in0; - b1 = rkey0 ^ in1; - b2 = rkey0 ^ in2; - b3 = rkey0 ^ in3; - b4 = rkey0 ^ in4; - b5 = rkey0 ^ in5; - b6 = rkey0 ^ in6; - b7 = rkey0 ^ in7; + in1 = VEC_LOAD_BE_NOSWAP (in, 0); + in2 = VEC_LOAD_BE_NOSWAP (in, 1); + in3 = VEC_LOAD_BE_NOSWAP (in, 2); + in4 = VEC_LOAD_BE_NOSWAP (in, 3); + in1 = VEC_BE_SWAP (in1, bige_const); + in2 = VEC_BE_SWAP (in2, bige_const); + in5 = VEC_LOAD_BE_NOSWAP (in, 4); + in6 = VEC_LOAD_BE_NOSWAP (in, 5); + in3 = VEC_BE_SWAP (in3, bige_const); + in4 = VEC_BE_SWAP (in4, bige_const); + in7 = VEC_LOAD_BE_NOSWAP (in, 6); + iv = VEC_LOAD_BE_NOSWAP (in, 7); + in += 8; + in5 = VEC_BE_SWAP (in5, bige_const); + in6 = VEC_BE_SWAP (in6, bige_const); + b0 = asm_xor (rkey0, in0); + b1 = asm_xor (rkey0, in1); + in7 = VEC_BE_SWAP (in7, bige_const); + iv = VEC_BE_SWAP (iv, bige_const); + b2 = asm_xor (rkey0, in2); + b3 = asm_xor (rkey0, in3); + b4 = asm_xor (rkey0, in4); + b5 = asm_xor (rkey0, in5); + b6 = asm_xor (rkey0, in6); + b7 = asm_xor (rkey0, in7); #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD(&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); \ - b4 = vec_cipher_be (b4, rkey); \ - b5 = vec_cipher_be (b5, rkey); \ - b6 = vec_cipher_be (b6, rkey); \ - b7 = vec_cipher_be (b7, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); \ + b4 = asm_cipher_be (b4, rkey); \ + b5 = asm_cipher_be (b5, rkey); \ + b6 = asm_cipher_be (b6, rkey); \ + b7 = asm_cipher_be (b7, rkey); DO_ROUND(1); DO_ROUND(2); @@ -542,48 +651,60 @@ void _gcry_aes_ppc8_cfb_dec (void *context, unsigned char *iv_arg, #undef DO_ROUND - rkey = rkeylast; - 
b0 = vec_cipherlast_be (b0, rkey ^ in1); - b1 = vec_cipherlast_be (b1, rkey ^ in2); - b2 = vec_cipherlast_be (b2, rkey ^ in3); - b3 = vec_cipherlast_be (b3, rkey ^ in4); - b4 = vec_cipherlast_be (b4, rkey ^ in5); - b5 = vec_cipherlast_be (b5, rkey ^ in6); - b6 = vec_cipherlast_be (b6, rkey ^ in7); - b7 = vec_cipherlast_be (b7, rkey ^ iv); - - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); - VEC_STORE_BE (out + 4, b4, bige_const); - VEC_STORE_BE (out + 5, b5, bige_const); - VEC_STORE_BE (out + 6, b6, bige_const); - VEC_STORE_BE (out + 7, b7, bige_const); - - in += 8; + in1 = asm_xor (rkeylast, in1); + in2 = asm_xor (rkeylast, in2); + in3 = asm_xor (rkeylast, in3); + in4 = asm_xor (rkeylast, in4); + b0 = asm_cipherlast_be (b0, in1); + b1 = asm_cipherlast_be (b1, in2); + in5 = asm_xor (rkeylast, in5); + in6 = asm_xor (rkeylast, in6); + b2 = asm_cipherlast_be (b2, in3); + b3 = asm_cipherlast_be (b3, in4); + in7 = asm_xor (rkeylast, in7); + in0 = asm_xor (rkeylast, iv); + b0 = VEC_BE_SWAP (b0, bige_const); + b1 = VEC_BE_SWAP (b1, bige_const); + b4 = asm_cipherlast_be (b4, in5); + b5 = asm_cipherlast_be (b5, in6); + b2 = VEC_BE_SWAP (b2, bige_const); + b3 = VEC_BE_SWAP (b3, bige_const); + b6 = asm_cipherlast_be (b6, in7); + b7 = asm_cipherlast_be (b7, in0); + b4 = VEC_BE_SWAP (b4, bige_const); + b5 = VEC_BE_SWAP (b5, bige_const); + b6 = VEC_BE_SWAP (b6, bige_const); + b7 = VEC_BE_SWAP (b7, bige_const); + VEC_STORE_BE_NOSWAP (out, 0, b0); + VEC_STORE_BE_NOSWAP (out, 1, b1); + VEC_STORE_BE_NOSWAP (out, 2, b2); + VEC_STORE_BE_NOSWAP (out, 3, b3); + VEC_STORE_BE_NOSWAP (out, 4, b4); + VEC_STORE_BE_NOSWAP (out, 5, b5); + VEC_STORE_BE_NOSWAP (out, 6, b6); + VEC_STORE_BE_NOSWAP (out, 7, b7); out += 8; } if (nblocks >= 4) { in0 = iv; - in1 = VEC_LOAD_BE (in + 0, bige_const); - in2 = VEC_LOAD_BE (in + 1, bige_const); - in3 = VEC_LOAD_BE (in + 2, bige_const); - iv 
= VEC_LOAD_BE (in + 3, bige_const); + in1 = VEC_LOAD_BE (in, 0, bige_const); + in2 = VEC_LOAD_BE (in, 1, bige_const); + in3 = VEC_LOAD_BE (in, 2, bige_const); + iv = VEC_LOAD_BE (in, 3, bige_const); - b0 = rkey0 ^ in0; - b1 = rkey0 ^ in1; - b2 = rkey0 ^ in2; - b3 = rkey0 ^ in3; + b0 = asm_xor (rkey0, in0); + b1 = asm_xor (rkey0, in1); + b2 = asm_xor (rkey0, in2); + b3 = asm_xor (rkey0, in3); #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD(&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); DO_ROUND(1); DO_ROUND(2); @@ -607,16 +728,18 @@ void _gcry_aes_ppc8_cfb_dec (void *context, unsigned char *iv_arg, #undef DO_ROUND - rkey = rkeylast; - b0 = vec_cipherlast_be (b0, rkey ^ in1); - b1 = vec_cipherlast_be (b1, rkey ^ in2); - b2 = vec_cipherlast_be (b2, rkey ^ in3); - b3 = vec_cipherlast_be (b3, rkey ^ iv); - - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); + in1 = asm_xor (rkeylast, in1); + in2 = asm_xor (rkeylast, in2); + in3 = asm_xor (rkeylast, in3); + in0 = asm_xor (rkeylast, iv); + b0 = asm_cipherlast_be (b0, in1); + b1 = asm_cipherlast_be (b1, in2); + b2 = asm_cipherlast_be (b2, in3); + b3 = asm_cipherlast_be (b3, in0); + VEC_STORE_BE (out, 0, b0, bige_const); + VEC_STORE_BE (out, 1, b1, bige_const); + VEC_STORE_BE (out, 2, b2, bige_const); + VEC_STORE_BE (out, 3, b3, bige_const); in += 4; out += 4; @@ -625,20 +748,20 @@ void _gcry_aes_ppc8_cfb_dec (void *context, unsigned char *iv_arg, for (; nblocks; nblocks--) { - bin = VEC_LOAD_BE (in, bige_const); + bin = VEC_LOAD_BE (in, 0, bige_const); rkeylast = rkeylast_orig ^ bin; b = iv; iv = bin; AES_ENCRYPT (b, rounds); - VEC_STORE_BE 
(out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); out++; in++; } - VEC_STORE_BE (iv_arg, iv, bige_const); + VEC_STORE_BE (iv_arg, 0, iv, bige_const); } @@ -646,41 +769,41 @@ void _gcry_aes_ppc8_cbc_enc (void *context, unsigned char *iv_arg, void *outbuf_arg, const void *inbuf_arg, size_t nblocks, int cbc_mac) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = context; const u128_t *rk = (u128_t *)&ctx->keyschenc; const u128_t *in = (const u128_t *)inbuf_arg; u128_t *out = (u128_t *)outbuf_arg; int rounds = ctx->rounds; - ROUND_KEY_VARIABLES; + ROUND_KEY_VARIABLES_ALL; block lastiv, b; + unsigned int outadd = !cbc_mac; - lastiv = VEC_LOAD_BE (iv_arg, bige_const); + lastiv = VEC_LOAD_BE (iv_arg, 0, bige_const); - PRELOAD_ROUND_KEYS (rounds); + PRELOAD_ROUND_KEYS_ALL (rounds); for (; nblocks; nblocks--) { - b = lastiv ^ VEC_LOAD_BE (in, bige_const); + b = lastiv ^ VEC_LOAD_BE (in, 0, bige_const); - AES_ENCRYPT (b, rounds); + AES_ENCRYPT_ALL (b, rounds); lastiv = b; - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); in++; - if (!cbc_mac) - out++; + out += outadd; } - VEC_STORE_BE (iv_arg, lastiv, bige_const); + VEC_STORE_BE (iv_arg, 0, lastiv, bige_const); } void _gcry_aes_ppc8_cbc_dec (void *context, unsigned char *iv_arg, void *outbuf_arg, const void *inbuf_arg, size_t nblocks) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = context; const u128_t *rk = (u128_t *)&ctx->keyschdec; const u128_t *in = (const u128_t *)inbuf_arg; @@ -699,41 +822,49 @@ void _gcry_aes_ppc8_cbc_dec (void *context, unsigned char *iv_arg, ctx->decryption_prepared = 1; } - iv = VEC_LOAD_BE (iv_arg, bige_const); + iv = VEC_LOAD_BE (iv_arg, 0, bige_const); PRELOAD_ROUND_KEYS (rounds); rkeylast_orig = rkeylast; for (; nblocks >= 8; nblocks -= 8) { - in0 = VEC_LOAD_BE (in + 0, bige_const); - in1 = VEC_LOAD_BE (in + 1, 
bige_const); - in2 = VEC_LOAD_BE (in + 2, bige_const); - in3 = VEC_LOAD_BE (in + 3, bige_const); - in4 = VEC_LOAD_BE (in + 4, bige_const); - in5 = VEC_LOAD_BE (in + 5, bige_const); - in6 = VEC_LOAD_BE (in + 6, bige_const); - in7 = VEC_LOAD_BE (in + 7, bige_const); - - b0 = rkey0 ^ in0; - b1 = rkey0 ^ in1; - b2 = rkey0 ^ in2; - b3 = rkey0 ^ in3; - b4 = rkey0 ^ in4; - b5 = rkey0 ^ in5; - b6 = rkey0 ^ in6; - b7 = rkey0 ^ in7; + in0 = VEC_LOAD_BE_NOSWAP (in, 0); + in1 = VEC_LOAD_BE_NOSWAP (in, 1); + in2 = VEC_LOAD_BE_NOSWAP (in, 2); + in3 = VEC_LOAD_BE_NOSWAP (in, 3); + in0 = VEC_BE_SWAP (in0, bige_const); + in1 = VEC_BE_SWAP (in1, bige_const); + in4 = VEC_LOAD_BE_NOSWAP (in, 4); + in5 = VEC_LOAD_BE_NOSWAP (in, 5); + in2 = VEC_BE_SWAP (in2, bige_const); + in3 = VEC_BE_SWAP (in3, bige_const); + in6 = VEC_LOAD_BE_NOSWAP (in, 6); + in7 = VEC_LOAD_BE_NOSWAP (in, 7); + in += 8; + b0 = asm_xor (rkey0, in0); + b1 = asm_xor (rkey0, in1); + in4 = VEC_BE_SWAP (in4, bige_const); + in5 = VEC_BE_SWAP (in5, bige_const); + b2 = asm_xor (rkey0, in2); + b3 = asm_xor (rkey0, in3); + in6 = VEC_BE_SWAP (in6, bige_const); + in7 = VEC_BE_SWAP (in7, bige_const); + b4 = asm_xor (rkey0, in4); + b5 = asm_xor (rkey0, in5); + b6 = asm_xor (rkey0, in6); + b7 = asm_xor (rkey0, in7); #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD(&rk[r]); \ - b0 = vec_ncipher_be (b0, rkey); \ - b1 = vec_ncipher_be (b1, rkey); \ - b2 = vec_ncipher_be (b2, rkey); \ - b3 = vec_ncipher_be (b3, rkey); \ - b4 = vec_ncipher_be (b4, rkey); \ - b5 = vec_ncipher_be (b5, rkey); \ - b6 = vec_ncipher_be (b6, rkey); \ - b7 = vec_ncipher_be (b7, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_ncipher_be (b0, rkey); \ + b1 = asm_ncipher_be (b1, rkey); \ + b2 = asm_ncipher_be (b2, rkey); \ + b3 = asm_ncipher_be (b3, rkey); \ + b4 = asm_ncipher_be (b4, rkey); \ + b5 = asm_ncipher_be (b5, rkey); \ + b6 = asm_ncipher_be (b6, rkey); \ + b7 = asm_ncipher_be (b7, rkey); DO_ROUND(1); DO_ROUND(2); @@ -757,48 +888,60 @@ void 
_gcry_aes_ppc8_cbc_dec (void *context, unsigned char *iv_arg, #undef DO_ROUND - rkey = rkeylast; - b0 = vec_ncipherlast_be (b0, rkey ^ iv); - b1 = vec_ncipherlast_be (b1, rkey ^ in0); - b2 = vec_ncipherlast_be (b2, rkey ^ in1); - b3 = vec_ncipherlast_be (b3, rkey ^ in2); - b4 = vec_ncipherlast_be (b4, rkey ^ in3); - b5 = vec_ncipherlast_be (b5, rkey ^ in4); - b6 = vec_ncipherlast_be (b6, rkey ^ in5); - b7 = vec_ncipherlast_be (b7, rkey ^ in6); + iv = asm_xor (rkeylast, iv); + in0 = asm_xor (rkeylast, in0); + in1 = asm_xor (rkeylast, in1); + in2 = asm_xor (rkeylast, in2); + b0 = asm_ncipherlast_be (b0, iv); iv = in7; - - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); - VEC_STORE_BE (out + 4, b4, bige_const); - VEC_STORE_BE (out + 5, b5, bige_const); - VEC_STORE_BE (out + 6, b6, bige_const); - VEC_STORE_BE (out + 7, b7, bige_const); - - in += 8; + b1 = asm_ncipherlast_be (b1, in0); + in3 = asm_xor (rkeylast, in3); + in4 = asm_xor (rkeylast, in4); + b2 = asm_ncipherlast_be (b2, in1); + b3 = asm_ncipherlast_be (b3, in2); + in5 = asm_xor (rkeylast, in5); + in6 = asm_xor (rkeylast, in6); + b0 = VEC_BE_SWAP (b0, bige_const); + b1 = VEC_BE_SWAP (b1, bige_const); + b4 = asm_ncipherlast_be (b4, in3); + b5 = asm_ncipherlast_be (b5, in4); + b2 = VEC_BE_SWAP (b2, bige_const); + b3 = VEC_BE_SWAP (b3, bige_const); + b6 = asm_ncipherlast_be (b6, in5); + b7 = asm_ncipherlast_be (b7, in6); + b4 = VEC_BE_SWAP (b4, bige_const); + b5 = VEC_BE_SWAP (b5, bige_const); + b6 = VEC_BE_SWAP (b6, bige_const); + b7 = VEC_BE_SWAP (b7, bige_const); + VEC_STORE_BE_NOSWAP (out, 0, b0); + VEC_STORE_BE_NOSWAP (out, 1, b1); + VEC_STORE_BE_NOSWAP (out, 2, b2); + VEC_STORE_BE_NOSWAP (out, 3, b3); + VEC_STORE_BE_NOSWAP (out, 4, b4); + VEC_STORE_BE_NOSWAP (out, 5, b5); + VEC_STORE_BE_NOSWAP (out, 6, b6); + VEC_STORE_BE_NOSWAP (out, 7, b7); out += 8; } if (nblocks >= 4) { - in0 = 
VEC_LOAD_BE (in + 0, bige_const); - in1 = VEC_LOAD_BE (in + 1, bige_const); - in2 = VEC_LOAD_BE (in + 2, bige_const); - in3 = VEC_LOAD_BE (in + 3, bige_const); + in0 = VEC_LOAD_BE (in, 0, bige_const); + in1 = VEC_LOAD_BE (in, 1, bige_const); + in2 = VEC_LOAD_BE (in, 2, bige_const); + in3 = VEC_LOAD_BE (in, 3, bige_const); - b0 = rkey0 ^ in0; - b1 = rkey0 ^ in1; - b2 = rkey0 ^ in2; - b3 = rkey0 ^ in3; + b0 = asm_xor (rkey0, in0); + b1 = asm_xor (rkey0, in1); + b2 = asm_xor (rkey0, in2); + b3 = asm_xor (rkey0, in3); #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD(&rk[r]); \ - b0 = vec_ncipher_be (b0, rkey); \ - b1 = vec_ncipher_be (b1, rkey); \ - b2 = vec_ncipher_be (b2, rkey); \ - b3 = vec_ncipher_be (b3, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_ncipher_be (b0, rkey); \ + b1 = asm_ncipher_be (b1, rkey); \ + b2 = asm_ncipher_be (b2, rkey); \ + b3 = asm_ncipher_be (b3, rkey); DO_ROUND(1); DO_ROUND(2); @@ -822,17 +965,21 @@ void _gcry_aes_ppc8_cbc_dec (void *context, unsigned char *iv_arg, #undef DO_ROUND - rkey = rkeylast; - b0 = vec_ncipherlast_be (b0, rkey ^ iv); - b1 = vec_ncipherlast_be (b1, rkey ^ in0); - b2 = vec_ncipherlast_be (b2, rkey ^ in1); - b3 = vec_ncipherlast_be (b3, rkey ^ in2); + iv = asm_xor (rkeylast, iv); + in0 = asm_xor (rkeylast, in0); + in1 = asm_xor (rkeylast, in1); + in2 = asm_xor (rkeylast, in2); + + b0 = asm_ncipherlast_be (b0, iv); iv = in3; + b1 = asm_ncipherlast_be (b1, in0); + b2 = asm_ncipherlast_be (b2, in1); + b3 = asm_ncipherlast_be (b3, in2); - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); + VEC_STORE_BE (out, 0, b0, bige_const); + VEC_STORE_BE (out, 1, b1, bige_const); + VEC_STORE_BE (out, 2, b2, bige_const); + VEC_STORE_BE (out, 3, b3, bige_const); in += 4; out += 4; @@ -843,17 +990,17 @@ void _gcry_aes_ppc8_cbc_dec (void *context, unsigned char *iv_arg, { rkeylast = rkeylast_orig ^ iv; - iv = VEC_LOAD_BE 
(in, bige_const); + iv = VEC_LOAD_BE (in, 0, bige_const); b = iv; AES_DECRYPT (b, rounds); - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); in++; out++; } - VEC_STORE_BE (iv_arg, iv, bige_const); + VEC_STORE_BE (iv_arg, 0, iv, bige_const); } @@ -863,7 +1010,7 @@ void _gcry_aes_ppc8_ctr_enc (void *context, unsigned char *ctr_arg, { static const unsigned char vec_one_const[16] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 }; - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = context; const u128_t *rk = (u128_t *)&ctx->keyschenc; const u128_t *in = (const u128_t *)inbuf_arg; @@ -873,56 +1020,80 @@ void _gcry_aes_ppc8_ctr_enc (void *context, unsigned char *ctr_arg, block rkeylast_orig; block ctr, b, one; - ctr = VEC_LOAD_BE (ctr_arg, bige_const); - one = VEC_LOAD_BE (&vec_one_const, bige_const); + ctr = VEC_LOAD_BE (ctr_arg, 0, bige_const); + one = VEC_LOAD_BE (&vec_one_const, 0, bige_const); PRELOAD_ROUND_KEYS (rounds); rkeylast_orig = rkeylast; if (nblocks >= 4) { + block in0, in1, in2, in3, in4, in5, in6, in7; block b0, b1, b2, b3, b4, b5, b6, b7; block two, three, four; - block ctr4; block rkey; - two = vec_add_uint128 (one, one); - three = vec_add_uint128 (two, one); - four = vec_add_uint128 (two, two); + two = asm_add_uint128 (one, one); + three = asm_add_uint128 (two, one); + four = asm_add_uint128 (two, two); for (; nblocks >= 8; nblocks -= 8) { - ctr4 = vec_add_uint128 (ctr, four); - b0 = rkey0 ^ ctr; - b1 = rkey0 ^ vec_add_uint128 (ctr, one); - b2 = rkey0 ^ vec_add_uint128 (ctr, two); - b3 = rkey0 ^ vec_add_uint128 (ctr, three); - b4 = rkey0 ^ ctr4; - b5 = rkey0 ^ vec_add_uint128 (ctr4, one); - b6 = rkey0 ^ vec_add_uint128 (ctr4, two); - b7 = rkey0 ^ vec_add_uint128 (ctr4, three); - ctr = vec_add_uint128 (ctr4, four); + b1 = asm_add_uint128 (ctr, one); + b2 = asm_add_uint128 (ctr, two); + b3 = asm_add_uint128 (ctr, three); + b4 = asm_add_uint128 (ctr, four); 
+ b5 = asm_add_uint128 (b1, four); + b6 = asm_add_uint128 (b2, four); + b7 = asm_add_uint128 (b3, four); + b0 = asm_xor (rkey0, ctr); + rkey = ALIGNED_LOAD (rk, 1); + ctr = asm_add_uint128 (b4, four); + b1 = asm_xor (rkey0, b1); + b2 = asm_xor (rkey0, b2); + b3 = asm_xor (rkey0, b3); + b0 = asm_cipher_be (b0, rkey); + b1 = asm_cipher_be (b1, rkey); + b2 = asm_cipher_be (b2, rkey); + b3 = asm_cipher_be (b3, rkey); + b4 = asm_xor (rkey0, b4); + b5 = asm_xor (rkey0, b5); + b6 = asm_xor (rkey0, b6); + b7 = asm_xor (rkey0, b7); + b4 = asm_cipher_be (b4, rkey); + b5 = asm_cipher_be (b5, rkey); + b6 = asm_cipher_be (b6, rkey); + b7 = asm_cipher_be (b7, rkey); #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD(&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); \ - b4 = vec_cipher_be (b4, rkey); \ - b5 = vec_cipher_be (b5, rkey); \ - b6 = vec_cipher_be (b6, rkey); \ - b7 = vec_cipher_be (b7, rkey); - - DO_ROUND(1); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); \ + b4 = asm_cipher_be (b4, rkey); \ + b5 = asm_cipher_be (b5, rkey); \ + b6 = asm_cipher_be (b6, rkey); \ + b7 = asm_cipher_be (b7, rkey); + + in0 = VEC_LOAD_BE_NOSWAP (in, 0); DO_ROUND(2); + in1 = VEC_LOAD_BE_NOSWAP (in, 1); DO_ROUND(3); + in2 = VEC_LOAD_BE_NOSWAP (in, 2); DO_ROUND(4); + in3 = VEC_LOAD_BE_NOSWAP (in, 3); DO_ROUND(5); + in4 = VEC_LOAD_BE_NOSWAP (in, 4); DO_ROUND(6); + in5 = VEC_LOAD_BE_NOSWAP (in, 5); DO_ROUND(7); + in6 = VEC_LOAD_BE_NOSWAP (in, 6); DO_ROUND(8); + in7 = VEC_LOAD_BE_NOSWAP (in, 7); + in += 8; DO_ROUND(9); + if (rounds >= 12) { DO_ROUND(10); @@ -936,43 +1107,68 @@ void _gcry_aes_ppc8_ctr_enc (void *context, unsigned char *ctr_arg, #undef DO_ROUND - rkey = rkeylast; - b0 = vec_cipherlast_be (b0, rkey ^ VEC_LOAD_BE (in + 0, bige_const)); - b1 = vec_cipherlast_be (b1, rkey ^ 
VEC_LOAD_BE (in + 1, bige_const)); - b2 = vec_cipherlast_be (b2, rkey ^ VEC_LOAD_BE (in + 2, bige_const)); - b3 = vec_cipherlast_be (b3, rkey ^ VEC_LOAD_BE (in + 3, bige_const)); - b4 = vec_cipherlast_be (b4, rkey ^ VEC_LOAD_BE (in + 4, bige_const)); - b5 = vec_cipherlast_be (b5, rkey ^ VEC_LOAD_BE (in + 5, bige_const)); - b6 = vec_cipherlast_be (b6, rkey ^ VEC_LOAD_BE (in + 6, bige_const)); - b7 = vec_cipherlast_be (b7, rkey ^ VEC_LOAD_BE (in + 7, bige_const)); - - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); - VEC_STORE_BE (out + 4, b4, bige_const); - VEC_STORE_BE (out + 5, b5, bige_const); - VEC_STORE_BE (out + 6, b6, bige_const); - VEC_STORE_BE (out + 7, b7, bige_const); - - in += 8; + in0 = VEC_BE_SWAP (in0, bige_const); + in1 = VEC_BE_SWAP (in1, bige_const); + in2 = VEC_BE_SWAP (in2, bige_const); + in3 = VEC_BE_SWAP (in3, bige_const); + in4 = VEC_BE_SWAP (in4, bige_const); + in5 = VEC_BE_SWAP (in5, bige_const); + in6 = VEC_BE_SWAP (in6, bige_const); + in7 = VEC_BE_SWAP (in7, bige_const); + + in0 = asm_xor (rkeylast, in0); + in1 = asm_xor (rkeylast, in1); + in2 = asm_xor (rkeylast, in2); + in3 = asm_xor (rkeylast, in3); + b0 = asm_cipherlast_be (b0, in0); + b1 = asm_cipherlast_be (b1, in1); + in4 = asm_xor (rkeylast, in4); + in5 = asm_xor (rkeylast, in5); + b2 = asm_cipherlast_be (b2, in2); + b3 = asm_cipherlast_be (b3, in3); + in6 = asm_xor (rkeylast, in6); + in7 = asm_xor (rkeylast, in7); + b4 = asm_cipherlast_be (b4, in4); + b5 = asm_cipherlast_be (b5, in5); + b6 = asm_cipherlast_be (b6, in6); + b7 = asm_cipherlast_be (b7, in7); + + b0 = VEC_BE_SWAP (b0, bige_const); + b1 = VEC_BE_SWAP (b1, bige_const); + b2 = VEC_BE_SWAP (b2, bige_const); + b3 = VEC_BE_SWAP (b3, bige_const); + b4 = VEC_BE_SWAP (b4, bige_const); + b5 = VEC_BE_SWAP (b5, bige_const); + b6 = VEC_BE_SWAP (b6, bige_const); + b7 = VEC_BE_SWAP (b7, bige_const); + 
VEC_STORE_BE_NOSWAP (out, 0, b0); + VEC_STORE_BE_NOSWAP (out, 1, b1); + VEC_STORE_BE_NOSWAP (out, 2, b2); + VEC_STORE_BE_NOSWAP (out, 3, b3); + VEC_STORE_BE_NOSWAP (out, 4, b4); + VEC_STORE_BE_NOSWAP (out, 5, b5); + VEC_STORE_BE_NOSWAP (out, 6, b6); + VEC_STORE_BE_NOSWAP (out, 7, b7); out += 8; } if (nblocks >= 4) { - b0 = rkey0 ^ ctr; - b1 = rkey0 ^ vec_add_uint128 (ctr, one); - b2 = rkey0 ^ vec_add_uint128 (ctr, two); - b3 = rkey0 ^ vec_add_uint128 (ctr, three); - ctr = vec_add_uint128 (ctr, four); + b1 = asm_add_uint128 (ctr, one); + b2 = asm_add_uint128 (ctr, two); + b3 = asm_add_uint128 (ctr, three); + b0 = asm_xor (rkey0, ctr); + ctr = asm_add_uint128 (ctr, four); + b1 = asm_xor (rkey0, b1); + b2 = asm_xor (rkey0, b2); + b3 = asm_xor (rkey0, b3); #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD(&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); DO_ROUND(1); DO_ROUND(2); @@ -982,6 +1178,12 @@ void _gcry_aes_ppc8_ctr_enc (void *context, unsigned char *ctr_arg, DO_ROUND(6); DO_ROUND(7); DO_ROUND(8); + + in0 = VEC_LOAD_BE (in, 0, bige_const); + in1 = VEC_LOAD_BE (in, 1, bige_const); + in2 = VEC_LOAD_BE (in, 2, bige_const); + in3 = VEC_LOAD_BE (in, 3, bige_const); + DO_ROUND(9); if (rounds >= 12) { @@ -996,16 +1198,21 @@ void _gcry_aes_ppc8_ctr_enc (void *context, unsigned char *ctr_arg, #undef DO_ROUND - rkey = rkeylast; - b0 = vec_cipherlast_be (b0, rkey ^ VEC_LOAD_BE (in + 0, bige_const)); - b1 = vec_cipherlast_be (b1, rkey ^ VEC_LOAD_BE (in + 1, bige_const)); - b2 = vec_cipherlast_be (b2, rkey ^ VEC_LOAD_BE (in + 2, bige_const)); - b3 = vec_cipherlast_be (b3, rkey ^ VEC_LOAD_BE (in + 3, bige_const)); - - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE 
(out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); + in0 = asm_xor (rkeylast, in0); + in1 = asm_xor (rkeylast, in1); + in2 = asm_xor (rkeylast, in2); + in3 = asm_xor (rkeylast, in3); + + b0 = asm_cipherlast_be (b0, in0); + b1 = asm_cipherlast_be (b1, in1); + b2 = asm_cipherlast_be (b2, in2); + b3 = asm_cipherlast_be (b3, in3); + + VEC_STORE_BE (out, 0, b0, bige_const); + VEC_STORE_BE (out, 1, b1, bige_const); + VEC_STORE_BE (out, 2, b2, bige_const); + VEC_STORE_BE (out, 3, b3, bige_const); + in += 4; out += 4; nblocks -= 4; @@ -1015,18 +1222,18 @@ void _gcry_aes_ppc8_ctr_enc (void *context, unsigned char *ctr_arg, for (; nblocks; nblocks--) { b = ctr; - ctr = vec_add_uint128 (ctr, one); - rkeylast = rkeylast_orig ^ VEC_LOAD_BE (in, bige_const); + ctr = asm_add_uint128 (ctr, one); + rkeylast = rkeylast_orig ^ VEC_LOAD_BE (in, 0, bige_const); AES_ENCRYPT (b, rounds); - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); out++; in++; } - VEC_STORE_BE (ctr_arg, ctr, bige_const); + VEC_STORE_BE (ctr_arg, 0, ctr, bige_const); } @@ -1034,7 +1241,7 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, const void *inbuf_arg, size_t nblocks, int encrypt) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = (void *)&c->context.c; const u128_t *in = (const u128_t *)inbuf_arg; u128_t *out = (u128_t *)outbuf_arg; @@ -1043,16 +1250,16 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, block l0, l1, l2, l; block b0, b1, b2, b3, b4, b5, b6, b7, b; block iv0, iv1, iv2, iv3, iv4, iv5, iv6, iv7; - block rkey; + block rkey, rkeylf; block ctr, iv; ROUND_KEY_VARIABLES; - iv = VEC_LOAD_BE (c->u_iv.iv, bige_const); - ctr = VEC_LOAD_BE (c->u_ctr.ctr, bige_const); + iv = VEC_LOAD_BE (c->u_iv.iv, 0, bige_const); + ctr = VEC_LOAD_BE (c->u_ctr.ctr, 0, bige_const); - l0 = VEC_LOAD_BE (c->u_mode.ocb.L[0], bige_const); - l1 = VEC_LOAD_BE 
(c->u_mode.ocb.L[1], bige_const); - l2 = VEC_LOAD_BE (c->u_mode.ocb.L[2], bige_const); + l0 = VEC_LOAD_BE (c->u_mode.ocb.L[0], 0, bige_const); + l1 = VEC_LOAD_BE (c->u_mode.ocb.L[1], 0, bige_const); + l2 = VEC_LOAD_BE (c->u_mode.ocb.L[2], 0, bige_const); if (encrypt) { @@ -1062,8 +1269,8 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for (; nblocks >= 8 && data_nblocks % 8; nblocks--) { - l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), bige_const); - b = VEC_LOAD_BE (in, bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), 0, bige_const); + b = VEC_LOAD_BE (in, 0, bige_const); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ iv ^= l; @@ -1074,7 +1281,7 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, AES_ENCRYPT (b, rounds); b ^= iv; - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); in += 1; out += 1; @@ -1082,16 +1289,25 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for (; nblocks >= 8; nblocks -= 8) { - b0 = VEC_LOAD_BE (in + 0, bige_const); - b1 = VEC_LOAD_BE (in + 1, bige_const); - b2 = VEC_LOAD_BE (in + 2, bige_const); - b3 = VEC_LOAD_BE (in + 3, bige_const); - b4 = VEC_LOAD_BE (in + 4, bige_const); - b5 = VEC_LOAD_BE (in + 5, bige_const); - b6 = VEC_LOAD_BE (in + 6, bige_const); - b7 = VEC_LOAD_BE (in + 7, bige_const); - - l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 8), bige_const); + b0 = VEC_LOAD_BE_NOSWAP (in, 0); + b1 = VEC_LOAD_BE_NOSWAP (in, 1); + b2 = VEC_LOAD_BE_NOSWAP (in, 2); + b3 = VEC_LOAD_BE_NOSWAP (in, 3); + b4 = VEC_LOAD_BE_NOSWAP (in, 4); + b5 = VEC_LOAD_BE_NOSWAP (in, 5); + b6 = VEC_LOAD_BE_NOSWAP (in, 6); + b7 = VEC_LOAD_BE_NOSWAP (in, 7); + in += 8; + l = VEC_LOAD_BE_NOSWAP (ocb_get_l (c, data_nblocks += 8), 0); + b0 = VEC_BE_SWAP(b0, bige_const); + b1 = VEC_BE_SWAP(b1, bige_const); + b2 = VEC_BE_SWAP(b2, bige_const); + b3 = VEC_BE_SWAP(b3, bige_const); + b4 = VEC_BE_SWAP(b4, bige_const); + b5 = VEC_BE_SWAP(b5, bige_const); + 
b6 = VEC_BE_SWAP(b6, bige_const); + b7 = VEC_BE_SWAP(b7, bige_const); + l = VEC_BE_SWAP(l, bige_const); ctr ^= b0 ^ b1 ^ b2 ^ b3 ^ b4 ^ b5 ^ b6 ^ b7; @@ -1117,15 +1333,15 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, iv = iv7 ^ rkey0; #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD (&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); \ - b4 = vec_cipher_be (b4, rkey); \ - b5 = vec_cipher_be (b5, rkey); \ - b6 = vec_cipher_be (b6, rkey); \ - b7 = vec_cipher_be (b7, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); \ + b4 = asm_cipher_be (b4, rkey); \ + b5 = asm_cipher_be (b5, rkey); \ + b6 = asm_cipher_be (b6, rkey); \ + b7 = asm_cipher_be (b7, rkey); DO_ROUND(1); DO_ROUND(2); @@ -1134,7 +1350,20 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, DO_ROUND(5); DO_ROUND(6); DO_ROUND(7); + + rkeylf = asm_xor (rkeylast, rkey0); + DO_ROUND(8); + + iv0 = asm_xor (rkeylf, iv0); + iv1 = asm_xor (rkeylf, iv1); + iv2 = asm_xor (rkeylf, iv2); + iv3 = asm_xor (rkeylf, iv3); + iv4 = asm_xor (rkeylf, iv4); + iv5 = asm_xor (rkeylf, iv5); + iv6 = asm_xor (rkeylf, iv6); + iv7 = asm_xor (rkeylf, iv7); + DO_ROUND(9); if (rounds >= 12) { @@ -1149,37 +1378,42 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #undef DO_ROUND - rkey = rkeylast ^ rkey0; - b0 = vec_cipherlast_be (b0, rkey ^ iv0); - b1 = vec_cipherlast_be (b1, rkey ^ iv1); - b2 = vec_cipherlast_be (b2, rkey ^ iv2); - b3 = vec_cipherlast_be (b3, rkey ^ iv3); - b4 = vec_cipherlast_be (b4, rkey ^ iv4); - b5 = vec_cipherlast_be (b5, rkey ^ iv5); - b6 = vec_cipherlast_be (b6, rkey ^ iv6); - b7 = vec_cipherlast_be (b7, rkey ^ iv7); - - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, 
b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); - VEC_STORE_BE (out + 4, b4, bige_const); - VEC_STORE_BE (out + 5, b5, bige_const); - VEC_STORE_BE (out + 6, b6, bige_const); - VEC_STORE_BE (out + 7, b7, bige_const); - - in += 8; + b0 = asm_cipherlast_be (b0, iv0); + b1 = asm_cipherlast_be (b1, iv1); + b2 = asm_cipherlast_be (b2, iv2); + b3 = asm_cipherlast_be (b3, iv3); + b4 = asm_cipherlast_be (b4, iv4); + b5 = asm_cipherlast_be (b5, iv5); + b6 = asm_cipherlast_be (b6, iv6); + b7 = asm_cipherlast_be (b7, iv7); + + b0 = VEC_BE_SWAP (b0, bige_const); + b1 = VEC_BE_SWAP (b1, bige_const); + b2 = VEC_BE_SWAP (b2, bige_const); + b3 = VEC_BE_SWAP (b3, bige_const); + b4 = VEC_BE_SWAP (b4, bige_const); + b5 = VEC_BE_SWAP (b5, bige_const); + b6 = VEC_BE_SWAP (b6, bige_const); + b7 = VEC_BE_SWAP (b7, bige_const); + VEC_STORE_BE_NOSWAP (out, 0, b0); + VEC_STORE_BE_NOSWAP (out, 1, b1); + VEC_STORE_BE_NOSWAP (out, 2, b2); + VEC_STORE_BE_NOSWAP (out, 3, b3); + VEC_STORE_BE_NOSWAP (out, 4, b4); + VEC_STORE_BE_NOSWAP (out, 5, b5); + VEC_STORE_BE_NOSWAP (out, 6, b6); + VEC_STORE_BE_NOSWAP (out, 7, b7); out += 8; } if (nblocks >= 4 && (data_nblocks % 4) == 0) { - b0 = VEC_LOAD_BE (in + 0, bige_const); - b1 = VEC_LOAD_BE (in + 1, bige_const); - b2 = VEC_LOAD_BE (in + 2, bige_const); - b3 = VEC_LOAD_BE (in + 3, bige_const); + b0 = VEC_LOAD_BE (in, 0, bige_const); + b1 = VEC_LOAD_BE (in, 1, bige_const); + b2 = VEC_LOAD_BE (in, 2, bige_const); + b3 = VEC_LOAD_BE (in, 3, bige_const); - l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 4), bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 4), 0, bige_const); ctr ^= b0 ^ b1 ^ b2 ^ b3; @@ -1197,11 +1431,11 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, iv = iv3 ^ rkey0; #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD (&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + 
b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); DO_ROUND(1); DO_ROUND(2); @@ -1226,15 +1460,15 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #undef DO_ROUND rkey = rkeylast ^ rkey0; - b0 = vec_cipherlast_be (b0, rkey ^ iv0); - b1 = vec_cipherlast_be (b1, rkey ^ iv1); - b2 = vec_cipherlast_be (b2, rkey ^ iv2); - b3 = vec_cipherlast_be (b3, rkey ^ iv3); + b0 = asm_cipherlast_be (b0, rkey ^ iv0); + b1 = asm_cipherlast_be (b1, rkey ^ iv1); + b2 = asm_cipherlast_be (b2, rkey ^ iv2); + b3 = asm_cipherlast_be (b3, rkey ^ iv3); - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); + VEC_STORE_BE (out, 0, b0, bige_const); + VEC_STORE_BE (out, 1, b1, bige_const); + VEC_STORE_BE (out, 2, b2, bige_const); + VEC_STORE_BE (out, 3, b3, bige_const); in += 4; out += 4; @@ -1243,8 +1477,8 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for (; nblocks; nblocks--) { - l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), bige_const); - b = VEC_LOAD_BE (in, bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), 0, bige_const); + b = VEC_LOAD_BE (in, 0, bige_const); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ iv ^= l; @@ -1255,7 +1489,7 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, AES_ENCRYPT (b, rounds); b ^= iv; - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); in += 1; out += 1; @@ -1275,8 +1509,8 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for (; nblocks >= 8 && data_nblocks % 8; nblocks--) { - l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), bige_const); - b = VEC_LOAD_BE (in, bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), 0, bige_const); + b = VEC_LOAD_BE (in, 0, bige_const); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ iv ^= 
l; @@ -1287,7 +1521,7 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Checksum_i = Checksum_{i-1} xor P_i */ ctr ^= b; - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); in += 1; out += 1; @@ -1295,16 +1529,25 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for (; nblocks >= 8; nblocks -= 8) { - b0 = VEC_LOAD_BE (in + 0, bige_const); - b1 = VEC_LOAD_BE (in + 1, bige_const); - b2 = VEC_LOAD_BE (in + 2, bige_const); - b3 = VEC_LOAD_BE (in + 3, bige_const); - b4 = VEC_LOAD_BE (in + 4, bige_const); - b5 = VEC_LOAD_BE (in + 5, bige_const); - b6 = VEC_LOAD_BE (in + 6, bige_const); - b7 = VEC_LOAD_BE (in + 7, bige_const); - - l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 8), bige_const); + b0 = VEC_LOAD_BE_NOSWAP (in, 0); + b1 = VEC_LOAD_BE_NOSWAP (in, 1); + b2 = VEC_LOAD_BE_NOSWAP (in, 2); + b3 = VEC_LOAD_BE_NOSWAP (in, 3); + b4 = VEC_LOAD_BE_NOSWAP (in, 4); + b5 = VEC_LOAD_BE_NOSWAP (in, 5); + b6 = VEC_LOAD_BE_NOSWAP (in, 6); + b7 = VEC_LOAD_BE_NOSWAP (in, 7); + in += 8; + l = VEC_LOAD_BE_NOSWAP (ocb_get_l (c, data_nblocks += 8), 0); + b0 = VEC_BE_SWAP(b0, bige_const); + b1 = VEC_BE_SWAP(b1, bige_const); + b2 = VEC_BE_SWAP(b2, bige_const); + b3 = VEC_BE_SWAP(b3, bige_const); + b4 = VEC_BE_SWAP(b4, bige_const); + b5 = VEC_BE_SWAP(b5, bige_const); + b6 = VEC_BE_SWAP(b6, bige_const); + b7 = VEC_BE_SWAP(b7, bige_const); + l = VEC_BE_SWAP(l, bige_const); iv ^= rkey0; @@ -1328,15 +1571,15 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, iv = iv7 ^ rkey0; #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD (&rk[r]); \ - b0 = vec_ncipher_be (b0, rkey); \ - b1 = vec_ncipher_be (b1, rkey); \ - b2 = vec_ncipher_be (b2, rkey); \ - b3 = vec_ncipher_be (b3, rkey); \ - b4 = vec_ncipher_be (b4, rkey); \ - b5 = vec_ncipher_be (b5, rkey); \ - b6 = vec_ncipher_be (b6, rkey); \ - b7 = vec_ncipher_be (b7, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_ncipher_be (b0, rkey); \ + b1 = 
asm_ncipher_be (b1, rkey); \ + b2 = asm_ncipher_be (b2, rkey); \ + b3 = asm_ncipher_be (b3, rkey); \ + b4 = asm_ncipher_be (b4, rkey); \ + b5 = asm_ncipher_be (b5, rkey); \ + b6 = asm_ncipher_be (b6, rkey); \ + b7 = asm_ncipher_be (b7, rkey); DO_ROUND(1); DO_ROUND(2); @@ -1345,7 +1588,20 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, DO_ROUND(5); DO_ROUND(6); DO_ROUND(7); + + rkeylf = asm_xor (rkeylast, rkey0); + DO_ROUND(8); + + iv0 = asm_xor (rkeylf, iv0); + iv1 = asm_xor (rkeylf, iv1); + iv2 = asm_xor (rkeylf, iv2); + iv3 = asm_xor (rkeylf, iv3); + iv4 = asm_xor (rkeylf, iv4); + iv5 = asm_xor (rkeylf, iv5); + iv6 = asm_xor (rkeylf, iv6); + iv7 = asm_xor (rkeylf, iv7); + DO_ROUND(9); if (rounds >= 12) { @@ -1360,39 +1616,44 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #undef DO_ROUND - rkey = rkeylast ^ rkey0; - b0 = vec_ncipherlast_be (b0, rkey ^ iv0); - b1 = vec_ncipherlast_be (b1, rkey ^ iv1); - b2 = vec_ncipherlast_be (b2, rkey ^ iv2); - b3 = vec_ncipherlast_be (b3, rkey ^ iv3); - b4 = vec_ncipherlast_be (b4, rkey ^ iv4); - b5 = vec_ncipherlast_be (b5, rkey ^ iv5); - b6 = vec_ncipherlast_be (b6, rkey ^ iv6); - b7 = vec_ncipherlast_be (b7, rkey ^ iv7); - - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); - VEC_STORE_BE (out + 4, b4, bige_const); - VEC_STORE_BE (out + 5, b5, bige_const); - VEC_STORE_BE (out + 6, b6, bige_const); - VEC_STORE_BE (out + 7, b7, bige_const); + b0 = asm_ncipherlast_be (b0, iv0); + b1 = asm_ncipherlast_be (b1, iv1); + b2 = asm_ncipherlast_be (b2, iv2); + b3 = asm_ncipherlast_be (b3, iv3); + b4 = asm_ncipherlast_be (b4, iv4); + b5 = asm_ncipherlast_be (b5, iv5); + b6 = asm_ncipherlast_be (b6, iv6); + b7 = asm_ncipherlast_be (b7, iv7); ctr ^= b0 ^ b1 ^ b2 ^ b3 ^ b4 ^ b5 ^ b6 ^ b7; - in += 8; + b0 = VEC_BE_SWAP (b0, bige_const); + b1 = VEC_BE_SWAP (b1, bige_const); + 
b2 = VEC_BE_SWAP (b2, bige_const); + b3 = VEC_BE_SWAP (b3, bige_const); + b4 = VEC_BE_SWAP (b4, bige_const); + b5 = VEC_BE_SWAP (b5, bige_const); + b6 = VEC_BE_SWAP (b6, bige_const); + b7 = VEC_BE_SWAP (b7, bige_const); + VEC_STORE_BE_NOSWAP (out, 0, b0); + VEC_STORE_BE_NOSWAP (out, 1, b1); + VEC_STORE_BE_NOSWAP (out, 2, b2); + VEC_STORE_BE_NOSWAP (out, 3, b3); + VEC_STORE_BE_NOSWAP (out, 4, b4); + VEC_STORE_BE_NOSWAP (out, 5, b5); + VEC_STORE_BE_NOSWAP (out, 6, b6); + VEC_STORE_BE_NOSWAP (out, 7, b7); out += 8; } if (nblocks >= 4 && (data_nblocks % 4) == 0) { - b0 = VEC_LOAD_BE (in + 0, bige_const); - b1 = VEC_LOAD_BE (in + 1, bige_const); - b2 = VEC_LOAD_BE (in + 2, bige_const); - b3 = VEC_LOAD_BE (in + 3, bige_const); + b0 = VEC_LOAD_BE (in, 0, bige_const); + b1 = VEC_LOAD_BE (in, 1, bige_const); + b2 = VEC_LOAD_BE (in, 2, bige_const); + b3 = VEC_LOAD_BE (in, 3, bige_const); - l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 4), bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 4), 0, bige_const); iv ^= rkey0; @@ -1408,11 +1669,11 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, iv = iv3 ^ rkey0; #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD (&rk[r]); \ - b0 = vec_ncipher_be (b0, rkey); \ - b1 = vec_ncipher_be (b1, rkey); \ - b2 = vec_ncipher_be (b2, rkey); \ - b3 = vec_ncipher_be (b3, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_ncipher_be (b0, rkey); \ + b1 = asm_ncipher_be (b1, rkey); \ + b2 = asm_ncipher_be (b2, rkey); \ + b3 = asm_ncipher_be (b3, rkey); DO_ROUND(1); DO_ROUND(2); @@ -1437,15 +1698,15 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, #undef DO_ROUND rkey = rkeylast ^ rkey0; - b0 = vec_ncipherlast_be (b0, rkey ^ iv0); - b1 = vec_ncipherlast_be (b1, rkey ^ iv1); - b2 = vec_ncipherlast_be (b2, rkey ^ iv2); - b3 = vec_ncipherlast_be (b3, rkey ^ iv3); + b0 = asm_ncipherlast_be (b0, rkey ^ iv0); + b1 = asm_ncipherlast_be (b1, rkey ^ iv1); + b2 = asm_ncipherlast_be (b2, rkey ^ iv2); + b3 
= asm_ncipherlast_be (b3, rkey ^ iv3); - VEC_STORE_BE (out + 0, b0, bige_const); - VEC_STORE_BE (out + 1, b1, bige_const); - VEC_STORE_BE (out + 2, b2, bige_const); - VEC_STORE_BE (out + 3, b3, bige_const); + VEC_STORE_BE (out, 0, b0, bige_const); + VEC_STORE_BE (out, 1, b1, bige_const); + VEC_STORE_BE (out, 2, b2, bige_const); + VEC_STORE_BE (out, 3, b3, bige_const); ctr ^= b0 ^ b1 ^ b2 ^ b3; @@ -1456,8 +1717,8 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, for (; nblocks; nblocks--) { - l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), bige_const); - b = VEC_LOAD_BE (in, bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), 0, bige_const); + b = VEC_LOAD_BE (in, 0, bige_const); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ iv ^= l; @@ -1468,15 +1729,15 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, /* Checksum_i = Checksum_{i-1} xor P_i */ ctr ^= b; - VEC_STORE_BE (out, b, bige_const); + VEC_STORE_BE (out, 0, b, bige_const); in += 1; out += 1; } } - VEC_STORE_BE (c->u_iv.iv, iv, bige_const); - VEC_STORE_BE (c->u_ctr.ctr, ctr, bige_const); + VEC_STORE_BE (c->u_iv.iv, 0, iv, bige_const); + VEC_STORE_BE (c->u_ctr.ctr, 0, ctr, bige_const); c->u_mode.ocb.data_nblocks = data_nblocks; return 0; @@ -1485,7 +1746,7 @@ size_t _gcry_aes_ppc8_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg, size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, size_t nblocks) { - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = (void *)&c->context.c; const u128_t *rk = (u128_t *)&ctx->keyschenc; const u128_t *abuf = (const u128_t *)abuf_arg; @@ -1498,19 +1759,19 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, block ctr, iv; ROUND_KEY_VARIABLES; - iv = VEC_LOAD_BE (c->u_mode.ocb.aad_offset, bige_const); - ctr = VEC_LOAD_BE (c->u_mode.ocb.aad_sum, bige_const); + iv = VEC_LOAD_BE (c->u_mode.ocb.aad_offset, 0, bige_const); + ctr = 
VEC_LOAD_BE (c->u_mode.ocb.aad_sum, 0, bige_const); - l0 = VEC_LOAD_BE (c->u_mode.ocb.L[0], bige_const); - l1 = VEC_LOAD_BE (c->u_mode.ocb.L[1], bige_const); - l2 = VEC_LOAD_BE (c->u_mode.ocb.L[2], bige_const); + l0 = VEC_LOAD_BE (c->u_mode.ocb.L[0], 0, bige_const); + l1 = VEC_LOAD_BE (c->u_mode.ocb.L[1], 0, bige_const); + l2 = VEC_LOAD_BE (c->u_mode.ocb.L[2], 0, bige_const); PRELOAD_ROUND_KEYS (rounds); for (; nblocks >= 8 && data_nblocks % 8; nblocks--) { - l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), bige_const); - b = VEC_LOAD_BE (abuf, bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), 0, bige_const); + b = VEC_LOAD_BE (abuf, 0, bige_const); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ iv ^= l; @@ -1524,16 +1785,16 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, for (; nblocks >= 8; nblocks -= 8) { - b0 = VEC_LOAD_BE (abuf + 0, bige_const); - b1 = VEC_LOAD_BE (abuf + 1, bige_const); - b2 = VEC_LOAD_BE (abuf + 2, bige_const); - b3 = VEC_LOAD_BE (abuf + 3, bige_const); - b4 = VEC_LOAD_BE (abuf + 4, bige_const); - b5 = VEC_LOAD_BE (abuf + 5, bige_const); - b6 = VEC_LOAD_BE (abuf + 6, bige_const); - b7 = VEC_LOAD_BE (abuf + 7, bige_const); + b0 = VEC_LOAD_BE (abuf, 0, bige_const); + b1 = VEC_LOAD_BE (abuf, 1, bige_const); + b2 = VEC_LOAD_BE (abuf, 2, bige_const); + b3 = VEC_LOAD_BE (abuf, 3, bige_const); + b4 = VEC_LOAD_BE (abuf, 4, bige_const); + b5 = VEC_LOAD_BE (abuf, 5, bige_const); + b6 = VEC_LOAD_BE (abuf, 6, bige_const); + b7 = VEC_LOAD_BE (abuf, 7, bige_const); - l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 8), bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 8), 0, bige_const); frkey = rkey0; iv ^= frkey; @@ -1558,15 +1819,15 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, iv = iv7 ^ frkey; #define DO_ROUND(r) \ - rkey = ALIGNED_LOAD (&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); 
\ - b4 = vec_cipher_be (b4, rkey); \ - b5 = vec_cipher_be (b5, rkey); \ - b6 = vec_cipher_be (b6, rkey); \ - b7 = vec_cipher_be (b7, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); \ + b4 = asm_cipher_be (b4, rkey); \ + b5 = asm_cipher_be (b5, rkey); \ + b6 = asm_cipher_be (b6, rkey); \ + b7 = asm_cipher_be (b7, rkey); DO_ROUND(1); DO_ROUND(2); @@ -1591,14 +1852,14 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, #undef DO_ROUND rkey = rkeylast; - b0 = vec_cipherlast_be (b0, rkey); - b1 = vec_cipherlast_be (b1, rkey); - b2 = vec_cipherlast_be (b2, rkey); - b3 = vec_cipherlast_be (b3, rkey); - b4 = vec_cipherlast_be (b4, rkey); - b5 = vec_cipherlast_be (b5, rkey); - b6 = vec_cipherlast_be (b6, rkey); - b7 = vec_cipherlast_be (b7, rkey); + b0 = asm_cipherlast_be (b0, rkey); + b1 = asm_cipherlast_be (b1, rkey); + b2 = asm_cipherlast_be (b2, rkey); + b3 = asm_cipherlast_be (b3, rkey); + b4 = asm_cipherlast_be (b4, rkey); + b5 = asm_cipherlast_be (b5, rkey); + b6 = asm_cipherlast_be (b6, rkey); + b7 = asm_cipherlast_be (b7, rkey); ctr ^= b0 ^ b1 ^ b2 ^ b3 ^ b4 ^ b5 ^ b6 ^ b7; @@ -1607,12 +1868,12 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, if (nblocks >= 4 && (data_nblocks % 4) == 0) { - b0 = VEC_LOAD_BE (abuf + 0, bige_const); - b1 = VEC_LOAD_BE (abuf + 1, bige_const); - b2 = VEC_LOAD_BE (abuf + 2, bige_const); - b3 = VEC_LOAD_BE (abuf + 3, bige_const); + b0 = VEC_LOAD_BE (abuf, 0, bige_const); + b1 = VEC_LOAD_BE (abuf, 1, bige_const); + b2 = VEC_LOAD_BE (abuf, 2, bige_const); + b3 = VEC_LOAD_BE (abuf, 3, bige_const); - l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 4), bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, data_nblocks += 4), 0, bige_const); frkey = rkey0; iv ^= frkey; @@ -1629,11 +1890,11 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, iv = iv3 ^ frkey; #define 
DO_ROUND(r) \ - rkey = ALIGNED_LOAD (&rk[r]); \ - b0 = vec_cipher_be (b0, rkey); \ - b1 = vec_cipher_be (b1, rkey); \ - b2 = vec_cipher_be (b2, rkey); \ - b3 = vec_cipher_be (b3, rkey); + rkey = ALIGNED_LOAD (rk, r); \ + b0 = asm_cipher_be (b0, rkey); \ + b1 = asm_cipher_be (b1, rkey); \ + b2 = asm_cipher_be (b2, rkey); \ + b3 = asm_cipher_be (b3, rkey); DO_ROUND(1); DO_ROUND(2); @@ -1658,10 +1919,10 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, #undef DO_ROUND rkey = rkeylast; - b0 = vec_cipherlast_be (b0, rkey); - b1 = vec_cipherlast_be (b1, rkey); - b2 = vec_cipherlast_be (b2, rkey); - b3 = vec_cipherlast_be (b3, rkey); + b0 = asm_cipherlast_be (b0, rkey); + b1 = asm_cipherlast_be (b1, rkey); + b2 = asm_cipherlast_be (b2, rkey); + b3 = asm_cipherlast_be (b3, rkey); ctr ^= b0 ^ b1 ^ b2 ^ b3; @@ -1671,8 +1932,8 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, for (; nblocks; nblocks--) { - l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), bige_const); - b = VEC_LOAD_BE (abuf, bige_const); + l = VEC_LOAD_BE (ocb_get_l (c, ++data_nblocks), 0, bige_const); + b = VEC_LOAD_BE (abuf, 0, bige_const); /* Offset_i = Offset_{i-1} xor L_{ntz(i)} */ iv ^= l; @@ -1684,8 +1945,8 @@ size_t _gcry_aes_ppc8_ocb_auth (gcry_cipher_hd_t c, void *abuf_arg, abuf += 1; } - VEC_STORE_BE (c->u_mode.ocb.aad_offset, iv, bige_const); - VEC_STORE_BE (c->u_mode.ocb.aad_sum, ctr, bige_const); + VEC_STORE_BE (c->u_mode.ocb.aad_offset, 0, iv, bige_const); + VEC_STORE_BE (c->u_mode.ocb.aad_sum, 0, ctr, bige_const); c->u_mode.ocb.aad_nblocks = data_nblocks; return 0; @@ -1696,44 +1957,59 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg, void *outbuf_arg, const void *inbuf_arg, size_t nblocks, int encrypt) { +#ifdef WORDS_BIGENDIAN static const block vec_bswap64_const = - { 7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8 }; + { 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7 }; static const block vec_bswap128_const = { 
15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 }; +#else + static const block vec_bswap64_const = + { ~8, ~9, ~10, ~11, ~12, ~13, ~14, ~15, ~0, ~1, ~2, ~3, ~4, ~5, ~6, ~7 }; + static const block vec_bswap128_const = + { ~15, ~14, ~13, ~12, ~11, ~10, ~9, ~8, ~7, ~6, ~5, ~4, ~3, ~2, ~1, ~0 }; + static const block vec_tweakin_swap_const = + { ~12, ~13, ~14, ~15, ~8, ~9, ~10, ~11, ~4, ~5, ~6, ~7, ~0, ~1, ~2, ~3 }; +#endif static const unsigned char vec_tweak_const[16] = { 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0x87 }; static const vector unsigned long long vec_shift63_const = { 63, 63 }; static const vector unsigned long long vec_shift1_const = { 1, 1 }; - const block bige_const = vec_load_be_const(); + const block bige_const = asm_load_be_const(); RIJNDAEL_context *ctx = context; const u128_t *in = (const u128_t *)inbuf_arg; u128_t *out = (u128_t *)outbuf_arg; int rounds = ctx->rounds; - block tweak_tmp, tweak_next, tweak; - block b0, b1, b2, b3, b4, b5, b6, b7, b, rkey; + block tweak; + block b0, b1, b2, b3, b4, b5, b6, b7, b, rkey, rkeylf; block tweak0, tweak1, tweak2, tweak3, tweak4, tweak5, tweak6, tweak7; block tweak_const, bswap64_const, bswap128_const; vector unsigned long long shift63_const, shift1_const; ROUND_KEY_VARIABLES; - tweak_const = VEC_LOAD_BE (&vec_tweak_const, bige_const); - bswap64_const = ALIGNED_LOAD (&vec_bswap64_const); - bswap128_const = ALIGNED_LOAD (&vec_bswap128_const); - shift63_const = (vector unsigned long long)ALIGNED_LOAD (&vec_shift63_const); - shift1_const = (vector unsigned long long)ALIGNED_LOAD (&vec_shift1_const); + tweak_const = VEC_LOAD_BE (&vec_tweak_const, 0, bige_const); + bswap64_const = ALIGNED_LOAD (&vec_bswap64_const, 0); + bswap128_const = ALIGNED_LOAD (&vec_bswap128_const, 0); + shift63_const = (vector unsigned long long)ALIGNED_LOAD (&vec_shift63_const, 0); + shift1_const = (vector unsigned long long)ALIGNED_LOAD (&vec_shift1_const, 0); - tweak_next = VEC_LOAD_BE (tweak_arg, bige_const); +#ifdef 
WORDS_BIGENDIAN + tweak = VEC_LOAD_BE (tweak_arg, 0, bige_const); + tweak = asm_vperm1 (tweak, bswap128_const); +#else + tweak = VEC_LOAD_BE (tweak_arg, 0, vec_tweakin_swap_const); +#endif -#define GEN_TWEAK(tweak, tmp) /* Generate next tweak. */ \ - tmp = vec_vperm(tweak, tweak, bswap64_const); \ - tweak = vec_vperm(tweak, tweak, bswap128_const); \ - tmp = (block)(vec_sra((vector unsigned long long)tmp, shift63_const)) & \ - tweak_const; \ - tweak = (block)vec_sl((vector unsigned long long)tweak, shift1_const); \ - tweak = tweak ^ tmp; \ - tweak = vec_vperm(tweak, tweak, bswap128_const); +#define GEN_TWEAK(tout, tin) /* Generate next tweak. */ \ + do { \ + block tmp1, tmp2; \ + tmp1 = asm_vperm1((tin), bswap64_const); \ + tmp2 = (block)vec_sl((vector unsigned long long)(tin), shift1_const); \ + tmp1 = (block)(vec_sra((vector unsigned long long)tmp1, shift63_const)) & \ + tweak_const; \ + tout = asm_xor(tmp1, tmp2); \ + } while (0) if (encrypt) { @@ -1743,42 +2019,70 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg, for (; nblocks >= 8; nblocks -= 8) { - tweak0 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - tweak1 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - tweak2 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - tweak3 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - tweak4 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - tweak5 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - tweak6 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - tweak7 = tweak_next; - GEN_TWEAK (tweak_next, tweak_tmp); - - b0 = VEC_LOAD_BE (in + 0, bige_const) ^ tweak0 ^ rkey0; - b1 = VEC_LOAD_BE (in + 1, bige_const) ^ tweak1 ^ rkey0; - b2 = VEC_LOAD_BE (in + 2, bige_const) ^ tweak2 ^ rkey0; - b3 = VEC_LOAD_BE (in + 3, bige_const) ^ tweak3 ^ rkey0; - b4 = VEC_LOAD_BE (in + 4, bige_const) ^ tweak4 ^ rkey0; - b5 = VEC_LOAD_BE (in + 5, bige_const) ^ tweak5 ^ rkey0; - b6 = VEC_LOAD_BE (in + 6, bige_const) ^ tweak6 ^ rkey0; - b7 = 
VEC_LOAD_BE (in + 7, bige_const) ^ tweak7 ^ rkey0;
+          b0 = VEC_LOAD_BE_NOSWAP (in, 0);
+          b1 = VEC_LOAD_BE_NOSWAP (in, 1);
+          b2 = VEC_LOAD_BE_NOSWAP (in, 2);
+          b3 = VEC_LOAD_BE_NOSWAP (in, 3);
+          tweak0 = tweak;
+          GEN_TWEAK (tweak1, tweak0);
+          tweak0 = asm_vperm1 (tweak0, bswap128_const);
+          b4 = VEC_LOAD_BE_NOSWAP (in, 4);
+          b5 = VEC_LOAD_BE_NOSWAP (in, 5);
+          GEN_TWEAK (tweak2, tweak1);
+          tweak1 = asm_vperm1 (tweak1, bswap128_const);
+          b6 = VEC_LOAD_BE_NOSWAP (in, 6);
+          b7 = VEC_LOAD_BE_NOSWAP (in, 7);
+          in += 8;
+
+          b0 = VEC_BE_SWAP(b0, bige_const);
+          b1 = VEC_BE_SWAP(b1, bige_const);
+          GEN_TWEAK (tweak3, tweak2);
+          tweak2 = asm_vperm1 (tweak2, bswap128_const);
+          GEN_TWEAK (tweak4, tweak3);
+          tweak3 = asm_vperm1 (tweak3, bswap128_const);
+          b2 = VEC_BE_SWAP(b2, bige_const);
+          b3 = VEC_BE_SWAP(b3, bige_const);
+          GEN_TWEAK (tweak5, tweak4);
+          tweak4 = asm_vperm1 (tweak4, bswap128_const);
+          GEN_TWEAK (tweak6, tweak5);
+          tweak5 = asm_vperm1 (tweak5, bswap128_const);
+          b4 = VEC_BE_SWAP(b4, bige_const);
+          b5 = VEC_BE_SWAP(b5, bige_const);
+          GEN_TWEAK (tweak7, tweak6);
+          tweak6 = asm_vperm1 (tweak6, bswap128_const);
+          GEN_TWEAK (tweak, tweak7);
+          tweak7 = asm_vperm1 (tweak7, bswap128_const);
+          b6 = VEC_BE_SWAP(b6, bige_const);
+          b7 = VEC_BE_SWAP(b7, bige_const);
+
+          tweak0 = asm_xor (tweak0, rkey0);
+          tweak1 = asm_xor (tweak1, rkey0);
+          tweak2 = asm_xor (tweak2, rkey0);
+          tweak3 = asm_xor (tweak3, rkey0);
+          tweak4 = asm_xor (tweak4, rkey0);
+          tweak5 = asm_xor (tweak5, rkey0);
+          tweak6 = asm_xor (tweak6, rkey0);
+          tweak7 = asm_xor (tweak7, rkey0);
+
+          b0 = asm_xor (b0, tweak0);
+          b1 = asm_xor (b1, tweak1);
+          b2 = asm_xor (b2, tweak2);
+          b3 = asm_xor (b3, tweak3);
+          b4 = asm_xor (b4, tweak4);
+          b5 = asm_xor (b5, tweak5);
+          b6 = asm_xor (b6, tweak6);
+          b7 = asm_xor (b7, tweak7);
 #define DO_ROUND(r) \
-              rkey = ALIGNED_LOAD (&rk[r]); \
-              b0 = vec_cipher_be (b0, rkey); \
-              b1 = vec_cipher_be (b1, rkey); \
-              b2 = vec_cipher_be (b2, rkey); \
-              b3 = vec_cipher_be (b3, rkey); \
-              b4 = vec_cipher_be (b4, rkey); \
-              b5 = vec_cipher_be (b5, rkey); \
-              b6 = vec_cipher_be (b6, rkey); \
-              b7 = vec_cipher_be (b7, rkey);
+              rkey = ALIGNED_LOAD (rk, r); \
+              b0 = asm_cipher_be (b0, rkey); \
+              b1 = asm_cipher_be (b1, rkey); \
+              b2 = asm_cipher_be (b2, rkey); \
+              b3 = asm_cipher_be (b3, rkey); \
+              b4 = asm_cipher_be (b4, rkey); \
+              b5 = asm_cipher_be (b5, rkey); \
+              b6 = asm_cipher_be (b6, rkey); \
+              b7 = asm_cipher_be (b7, rkey);
           DO_ROUND(1);
           DO_ROUND(2);
@@ -1787,7 +2091,20 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
           DO_ROUND(5);
           DO_ROUND(6);
           DO_ROUND(7);
+
+          rkeylf = asm_xor (rkeylast, rkey0);
+
           DO_ROUND(8);
+
+          tweak0 = asm_xor (tweak0, rkeylf);
+          tweak1 = asm_xor (tweak1, rkeylf);
+          tweak2 = asm_xor (tweak2, rkeylf);
+          tweak3 = asm_xor (tweak3, rkeylf);
+          tweak4 = asm_xor (tweak4, rkeylf);
+          tweak5 = asm_xor (tweak5, rkeylf);
+          tweak6 = asm_xor (tweak6, rkeylf);
+          tweak7 = asm_xor (tweak7, rkeylf);
+
           DO_ROUND(9);
           if (rounds >= 12)
             {
@@ -1802,51 +2119,62 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
 #undef DO_ROUND
-          rkey = rkeylast;
-          b0 = vec_cipherlast_be (b0, rkey ^ tweak0);
-          b1 = vec_cipherlast_be (b1, rkey ^ tweak1);
-          b2 = vec_cipherlast_be (b2, rkey ^ tweak2);
-          b3 = vec_cipherlast_be (b3, rkey ^ tweak3);
-          b4 = vec_cipherlast_be (b4, rkey ^ tweak4);
-          b5 = vec_cipherlast_be (b5, rkey ^ tweak5);
-          b6 = vec_cipherlast_be (b6, rkey ^ tweak6);
-          b7 = vec_cipherlast_be (b7, rkey ^ tweak7);
-
-          VEC_STORE_BE (out + 0, b0, bige_const);
-          VEC_STORE_BE (out + 1, b1, bige_const);
-          VEC_STORE_BE (out + 2, b2, bige_const);
-          VEC_STORE_BE (out + 3, b3, bige_const);
-          VEC_STORE_BE (out + 4, b4, bige_const);
-          VEC_STORE_BE (out + 5, b5, bige_const);
-          VEC_STORE_BE (out + 6, b6, bige_const);
-          VEC_STORE_BE (out + 7, b7, bige_const);
-
-          in += 8;
+          b0 = asm_cipherlast_be (b0, tweak0);
+          b1 = asm_cipherlast_be (b1, tweak1);
+          b2 = asm_cipherlast_be (b2, tweak2);
+          b3 = asm_cipherlast_be (b3, tweak3);
+          b0 = VEC_BE_SWAP (b0, bige_const);
+          b1 = VEC_BE_SWAP (b1, bige_const);
+          b4 = asm_cipherlast_be (b4, tweak4);
+          b5 = asm_cipherlast_be (b5, tweak5);
+          b2 = VEC_BE_SWAP (b2, bige_const);
+          b3 = VEC_BE_SWAP (b3, bige_const);
+          b6 = asm_cipherlast_be (b6, tweak6);
+          b7 = asm_cipherlast_be (b7, tweak7);
+          VEC_STORE_BE_NOSWAP (out, 0, b0);
+          VEC_STORE_BE_NOSWAP (out, 1, b1);
+          b4 = VEC_BE_SWAP (b4, bige_const);
+          b5 = VEC_BE_SWAP (b5, bige_const);
+          VEC_STORE_BE_NOSWAP (out, 2, b2);
+          VEC_STORE_BE_NOSWAP (out, 3, b3);
+          b6 = VEC_BE_SWAP (b6, bige_const);
+          b7 = VEC_BE_SWAP (b7, bige_const);
+          VEC_STORE_BE_NOSWAP (out, 4, b4);
+          VEC_STORE_BE_NOSWAP (out, 5, b5);
+          VEC_STORE_BE_NOSWAP (out, 6, b6);
+          VEC_STORE_BE_NOSWAP (out, 7, b7);
           out += 8;
         }
       if (nblocks >= 4)
         {
-          tweak0 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak1 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak2 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak3 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-
-          b0 = VEC_LOAD_BE (in + 0, bige_const) ^ tweak0 ^ rkey0;
-          b1 = VEC_LOAD_BE (in + 1, bige_const) ^ tweak1 ^ rkey0;
-          b2 = VEC_LOAD_BE (in + 2, bige_const) ^ tweak2 ^ rkey0;
-          b3 = VEC_LOAD_BE (in + 3, bige_const) ^ tweak3 ^ rkey0;
+          tweak0 = tweak;
+          GEN_TWEAK (tweak1, tweak0);
+          GEN_TWEAK (tweak2, tweak1);
+          GEN_TWEAK (tweak3, tweak2);
+          GEN_TWEAK (tweak, tweak3);
+
+          b0 = VEC_LOAD_BE (in, 0, bige_const);
+          b1 = VEC_LOAD_BE (in, 1, bige_const);
+          b2 = VEC_LOAD_BE (in, 2, bige_const);
+          b3 = VEC_LOAD_BE (in, 3, bige_const);
+
+          tweak0 = asm_vperm1 (tweak0, bswap128_const);
+          tweak1 = asm_vperm1 (tweak1, bswap128_const);
+          tweak2 = asm_vperm1 (tweak2, bswap128_const);
+          tweak3 = asm_vperm1 (tweak3, bswap128_const);
+
+          b0 ^= tweak0 ^ rkey0;
+          b1 ^= tweak1 ^ rkey0;
+          b2 ^= tweak2 ^ rkey0;
+          b3 ^= tweak3 ^ rkey0;
 #define DO_ROUND(r) \
-              rkey = ALIGNED_LOAD (&rk[r]); \
-              b0 = vec_cipher_be (b0, rkey); \
-              b1 = vec_cipher_be (b1, rkey); \
-              b2 = vec_cipher_be (b2, rkey); \
-              b3 = vec_cipher_be (b3, rkey);
+              rkey = ALIGNED_LOAD (rk, r); \
+              b0 = asm_cipher_be (b0, rkey); \
+              b1 = asm_cipher_be (b1, rkey); \
+              b2 = asm_cipher_be (b2, rkey); \
+              b3 = asm_cipher_be (b3, rkey);
           DO_ROUND(1);
           DO_ROUND(2);
@@ -1871,15 +2199,15 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
 #undef DO_ROUND
           rkey = rkeylast;
-          b0 = vec_cipherlast_be (b0, rkey ^ tweak0);
-          b1 = vec_cipherlast_be (b1, rkey ^ tweak1);
-          b2 = vec_cipherlast_be (b2, rkey ^ tweak2);
-          b3 = vec_cipherlast_be (b3, rkey ^ tweak3);
+          b0 = asm_cipherlast_be (b0, rkey ^ tweak0);
+          b1 = asm_cipherlast_be (b1, rkey ^ tweak1);
+          b2 = asm_cipherlast_be (b2, rkey ^ tweak2);
+          b3 = asm_cipherlast_be (b3, rkey ^ tweak3);
-          VEC_STORE_BE (out + 0, b0, bige_const);
-          VEC_STORE_BE (out + 1, b1, bige_const);
-          VEC_STORE_BE (out + 2, b2, bige_const);
-          VEC_STORE_BE (out + 3, b3, bige_const);
+          VEC_STORE_BE (out, 0, b0, bige_const);
+          VEC_STORE_BE (out, 1, b1, bige_const);
+          VEC_STORE_BE (out, 2, b2, bige_const);
+          VEC_STORE_BE (out, 3, b3, bige_const);
           in += 4;
           out += 4;
@@ -1888,18 +2216,18 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
       for (; nblocks; nblocks--)
         {
-          tweak = tweak_next;
+          tweak0 = asm_vperm1 (tweak, bswap128_const);
           /* Xor-Encrypt/Decrypt-Xor block. */
-          b = VEC_LOAD_BE (in, bige_const) ^ tweak;
+          b = VEC_LOAD_BE (in, 0, bige_const) ^ tweak0;
           /* Generate next tweak. */
-          GEN_TWEAK (tweak_next, tweak_tmp);
+          GEN_TWEAK (tweak, tweak);
           AES_ENCRYPT (b, rounds);
-          b ^= tweak;
-          VEC_STORE_BE (out, b, bige_const);
+          b ^= tweak0;
+          VEC_STORE_BE (out, 0, b, bige_const);
           in++;
           out++;
@@ -1919,42 +2247,70 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
       for (; nblocks >= 8; nblocks -= 8)
         {
-          tweak0 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak1 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak2 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak3 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak4 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak5 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak6 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak7 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-
-          b0 = VEC_LOAD_BE (in + 0, bige_const) ^ tweak0 ^ rkey0;
-          b1 = VEC_LOAD_BE (in + 1, bige_const) ^ tweak1 ^ rkey0;
-          b2 = VEC_LOAD_BE (in + 2, bige_const) ^ tweak2 ^ rkey0;
-          b3 = VEC_LOAD_BE (in + 3, bige_const) ^ tweak3 ^ rkey0;
-          b4 = VEC_LOAD_BE (in + 4, bige_const) ^ tweak4 ^ rkey0;
-          b5 = VEC_LOAD_BE (in + 5, bige_const) ^ tweak5 ^ rkey0;
-          b6 = VEC_LOAD_BE (in + 6, bige_const) ^ tweak6 ^ rkey0;
-          b7 = VEC_LOAD_BE (in + 7, bige_const) ^ tweak7 ^ rkey0;
+          b0 = VEC_LOAD_BE_NOSWAP (in, 0);
+          b1 = VEC_LOAD_BE_NOSWAP (in, 1);
+          b2 = VEC_LOAD_BE_NOSWAP (in, 2);
+          b3 = VEC_LOAD_BE_NOSWAP (in, 3);
+          tweak0 = tweak;
+          GEN_TWEAK (tweak1, tweak0);
+          tweak0 = asm_vperm1 (tweak0, bswap128_const);
+          b4 = VEC_LOAD_BE_NOSWAP (in, 4);
+          b5 = VEC_LOAD_BE_NOSWAP (in, 5);
+          GEN_TWEAK (tweak2, tweak1);
+          tweak1 = asm_vperm1 (tweak1, bswap128_const);
+          b6 = VEC_LOAD_BE_NOSWAP (in, 6);
+          b7 = VEC_LOAD_BE_NOSWAP (in, 7);
+          in += 8;
+
+          b0 = VEC_BE_SWAP(b0, bige_const);
+          b1 = VEC_BE_SWAP(b1, bige_const);
+          GEN_TWEAK (tweak3, tweak2);
+          tweak2 = asm_vperm1 (tweak2, bswap128_const);
+          GEN_TWEAK (tweak4, tweak3);
+          tweak3 = asm_vperm1 (tweak3, bswap128_const);
+          b2 = VEC_BE_SWAP(b2, bige_const);
+          b3 = VEC_BE_SWAP(b3, bige_const);
+          GEN_TWEAK (tweak5, tweak4);
+          tweak4 = asm_vperm1 (tweak4, bswap128_const);
+          GEN_TWEAK (tweak6, tweak5);
+          tweak5 = asm_vperm1 (tweak5, bswap128_const);
+          b4 = VEC_BE_SWAP(b4, bige_const);
+          b5 = VEC_BE_SWAP(b5, bige_const);
+          GEN_TWEAK (tweak7, tweak6);
+          tweak6 = asm_vperm1 (tweak6, bswap128_const);
+          GEN_TWEAK (tweak, tweak7);
+          tweak7 = asm_vperm1 (tweak7, bswap128_const);
+          b6 = VEC_BE_SWAP(b6, bige_const);
+          b7 = VEC_BE_SWAP(b7, bige_const);
+
+          tweak0 = asm_xor (tweak0, rkey0);
+          tweak1 = asm_xor (tweak1, rkey0);
+          tweak2 = asm_xor (tweak2, rkey0);
+          tweak3 = asm_xor (tweak3, rkey0);
+          tweak4 = asm_xor (tweak4, rkey0);
+          tweak5 = asm_xor (tweak5, rkey0);
+          tweak6 = asm_xor (tweak6, rkey0);
+          tweak7 = asm_xor (tweak7, rkey0);
+
+          b0 = asm_xor (b0, tweak0);
+          b1 = asm_xor (b1, tweak1);
+          b2 = asm_xor (b2, tweak2);
+          b3 = asm_xor (b3, tweak3);
+          b4 = asm_xor (b4, tweak4);
+          b5 = asm_xor (b5, tweak5);
+          b6 = asm_xor (b6, tweak6);
+          b7 = asm_xor (b7, tweak7);
 #define DO_ROUND(r) \
-              rkey = ALIGNED_LOAD (&rk[r]); \
-              b0 = vec_ncipher_be (b0, rkey); \
-              b1 = vec_ncipher_be (b1, rkey); \
-              b2 = vec_ncipher_be (b2, rkey); \
-              b3 = vec_ncipher_be (b3, rkey); \
-              b4 = vec_ncipher_be (b4, rkey); \
-              b5 = vec_ncipher_be (b5, rkey); \
-              b6 = vec_ncipher_be (b6, rkey); \
-              b7 = vec_ncipher_be (b7, rkey);
+              rkey = ALIGNED_LOAD (rk, r); \
+              b0 = asm_ncipher_be (b0, rkey); \
+              b1 = asm_ncipher_be (b1, rkey); \
+              b2 = asm_ncipher_be (b2, rkey); \
+              b3 = asm_ncipher_be (b3, rkey); \
+              b4 = asm_ncipher_be (b4, rkey); \
+              b5 = asm_ncipher_be (b5, rkey); \
+              b6 = asm_ncipher_be (b6, rkey); \
+              b7 = asm_ncipher_be (b7, rkey);
           DO_ROUND(1);
           DO_ROUND(2);
@@ -1963,7 +2319,20 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
           DO_ROUND(5);
           DO_ROUND(6);
           DO_ROUND(7);
+
+          rkeylf = asm_xor (rkeylast, rkey0);
+
           DO_ROUND(8);
+
+          tweak0 = asm_xor (tweak0, rkeylf);
+          tweak1 = asm_xor (tweak1, rkeylf);
+          tweak2 = asm_xor (tweak2, rkeylf);
+          tweak3 = asm_xor (tweak3, rkeylf);
+          tweak4 = asm_xor (tweak4, rkeylf);
+          tweak5 = asm_xor (tweak5, rkeylf);
+          tweak6 = asm_xor (tweak6, rkeylf);
+          tweak7 = asm_xor (tweak7, rkeylf);
+
           DO_ROUND(9);
           if (rounds >= 12)
             {
@@ -1978,51 +2347,62 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
 #undef DO_ROUND
-          rkey = rkeylast;
-          b0 = vec_ncipherlast_be (b0, rkey ^ tweak0);
-          b1 = vec_ncipherlast_be (b1, rkey ^ tweak1);
-          b2 = vec_ncipherlast_be (b2, rkey ^ tweak2);
-          b3 = vec_ncipherlast_be (b3, rkey ^ tweak3);
-          b4 = vec_ncipherlast_be (b4, rkey ^ tweak4);
-          b5 = vec_ncipherlast_be (b5, rkey ^ tweak5);
-          b6 = vec_ncipherlast_be (b6, rkey ^ tweak6);
-          b7 = vec_ncipherlast_be (b7, rkey ^ tweak7);
-
-          VEC_STORE_BE (out + 0, b0, bige_const);
-          VEC_STORE_BE (out + 1, b1, bige_const);
-          VEC_STORE_BE (out + 2, b2, bige_const);
-          VEC_STORE_BE (out + 3, b3, bige_const);
-          VEC_STORE_BE (out + 4, b4, bige_const);
-          VEC_STORE_BE (out + 5, b5, bige_const);
-          VEC_STORE_BE (out + 6, b6, bige_const);
-          VEC_STORE_BE (out + 7, b7, bige_const);
-
-          in += 8;
+          b0 = asm_ncipherlast_be (b0, tweak0);
+          b1 = asm_ncipherlast_be (b1, tweak1);
+          b2 = asm_ncipherlast_be (b2, tweak2);
+          b3 = asm_ncipherlast_be (b3, tweak3);
+          b0 = VEC_BE_SWAP (b0, bige_const);
+          b1 = VEC_BE_SWAP (b1, bige_const);
+          b4 = asm_ncipherlast_be (b4, tweak4);
+          b5 = asm_ncipherlast_be (b5, tweak5);
+          b2 = VEC_BE_SWAP (b2, bige_const);
+          b3 = VEC_BE_SWAP (b3, bige_const);
+          b6 = asm_ncipherlast_be (b6, tweak6);
+          b7 = asm_ncipherlast_be (b7, tweak7);
+          VEC_STORE_BE_NOSWAP (out, 0, b0);
+          VEC_STORE_BE_NOSWAP (out, 1, b1);
+          b4 = VEC_BE_SWAP (b4, bige_const);
+          b5 = VEC_BE_SWAP (b5, bige_const);
+          VEC_STORE_BE_NOSWAP (out, 2, b2);
+          VEC_STORE_BE_NOSWAP (out, 3, b3);
+          b6 = VEC_BE_SWAP (b6, bige_const);
+          b7 = VEC_BE_SWAP (b7, bige_const);
+          VEC_STORE_BE_NOSWAP (out, 4, b4);
+          VEC_STORE_BE_NOSWAP (out, 5, b5);
+          VEC_STORE_BE_NOSWAP (out, 6, b6);
+          VEC_STORE_BE_NOSWAP (out, 7, b7);
           out += 8;
         }
       if (nblocks >= 4)
         {
-          tweak0 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak1 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak2 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-          tweak3 = tweak_next;
-          GEN_TWEAK (tweak_next, tweak_tmp);
-
-          b0 = VEC_LOAD_BE (in + 0, bige_const) ^ tweak0 ^ rkey0;
-          b1 = VEC_LOAD_BE (in + 1, bige_const) ^ tweak1 ^ rkey0;
-          b2 = VEC_LOAD_BE (in + 2, bige_const) ^ tweak2 ^ rkey0;
-          b3 = VEC_LOAD_BE (in + 3, bige_const) ^ tweak3 ^ rkey0;
+          tweak0 = tweak;
+          GEN_TWEAK (tweak1, tweak0);
+          GEN_TWEAK (tweak2, tweak1);
+          GEN_TWEAK (tweak3, tweak2);
+          GEN_TWEAK (tweak, tweak3);
+
+          b0 = VEC_LOAD_BE (in, 0, bige_const);
+          b1 = VEC_LOAD_BE (in, 1, bige_const);
+          b2 = VEC_LOAD_BE (in, 2, bige_const);
+          b3 = VEC_LOAD_BE (in, 3, bige_const);
+
+          tweak0 = asm_vperm1 (tweak0, bswap128_const);
+          tweak1 = asm_vperm1 (tweak1, bswap128_const);
+          tweak2 = asm_vperm1 (tweak2, bswap128_const);
+          tweak3 = asm_vperm1 (tweak3, bswap128_const);
+
+          b0 ^= tweak0 ^ rkey0;
+          b1 ^= tweak1 ^ rkey0;
+          b2 ^= tweak2 ^ rkey0;
+          b3 ^= tweak3 ^ rkey0;
 #define DO_ROUND(r) \
-              rkey = ALIGNED_LOAD (&rk[r]); \
-              b0 = vec_ncipher_be (b0, rkey); \
-              b1 = vec_ncipher_be (b1, rkey); \
-              b2 = vec_ncipher_be (b2, rkey); \
-              b3 = vec_ncipher_be (b3, rkey);
+              rkey = ALIGNED_LOAD (rk, r); \
+              b0 = asm_ncipher_be (b0, rkey); \
+              b1 = asm_ncipher_be (b1, rkey); \
+              b2 = asm_ncipher_be (b2, rkey); \
+              b3 = asm_ncipher_be (b3, rkey);
           DO_ROUND(1);
           DO_ROUND(2);
@@ -2047,15 +2427,15 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
 #undef DO_ROUND
           rkey = rkeylast;
-          b0 = vec_ncipherlast_be (b0, rkey ^ tweak0);
-          b1 = vec_ncipherlast_be (b1, rkey ^ tweak1);
-          b2 = vec_ncipherlast_be (b2, rkey ^ tweak2);
-          b3 = vec_ncipherlast_be (b3, rkey ^ tweak3);
+          b0 = asm_ncipherlast_be (b0, rkey ^ tweak0);
+          b1 = asm_ncipherlast_be (b1, rkey ^ tweak1);
+          b2 = asm_ncipherlast_be (b2, rkey ^ tweak2);
+          b3 = asm_ncipherlast_be (b3, rkey ^ tweak3);
-          VEC_STORE_BE (out + 0, b0, bige_const);
-          VEC_STORE_BE (out + 1, b1, bige_const);
-          VEC_STORE_BE (out + 2, b2, bige_const);
-          VEC_STORE_BE (out + 3, b3, bige_const);
+          VEC_STORE_BE (out, 0, b0, bige_const);
+          VEC_STORE_BE (out, 1, b1, bige_const);
+          VEC_STORE_BE (out, 2, b2, bige_const);
+          VEC_STORE_BE (out, 3, b3, bige_const);
           in += 4;
           out += 4;
@@ -2064,25 +2444,30 @@ void _gcry_aes_ppc8_xts_crypt (void *context, unsigned char *tweak_arg,
       for (; nblocks; nblocks--)
         {
-          tweak = tweak_next;
+          tweak0 = asm_vperm1 (tweak, bswap128_const);
           /* Xor-Encrypt/Decrypt-Xor block. */
-          b = VEC_LOAD_BE (in, bige_const) ^ tweak;
+          b = VEC_LOAD_BE (in, 0, bige_const) ^ tweak0;
           /* Generate next tweak. */
-          GEN_TWEAK (tweak_next, tweak_tmp);
+          GEN_TWEAK (tweak, tweak);
           AES_DECRYPT (b, rounds);
-          b ^= tweak;
-          VEC_STORE_BE (out, b, bige_const);
+          b ^= tweak0;
+          VEC_STORE_BE (out, 0, b, bige_const);
           in++;
           out++;
         }
     }
-  VEC_STORE_BE (tweak_arg, tweak_next, bige_const);
+#ifdef WORDS_BIGENDIAN
+  tweak = asm_vperm1 (tweak, bswap128_const);
+  VEC_STORE_BE (tweak_arg, 0, tweak, bige_const);
+#else
+  VEC_STORE_BE (tweak_arg, 0, tweak, vec_tweakin_swap_const);
+#endif
 #undef GEN_TWEAK
 }