[PATCH 0/5] dilithium-kyber: Optimized (i)NTT support for
Danny Tsen
dtsen at us.ibm.com
Mon Mar 2 03:19:29 CET 2026
Hi Werner,
I made some modifications for the ML-KEM format. Here are the raw performance numbers for the ML-KEM NTT. Hope this helps.
Thanks.
-Danny
[16:33] danny at ltcden12-lp1 mlkem-ipcri % ./perf_mlkem_test
=== Optimized assembly NTT test
cpu_time_used (sec)=0.016707
loops=100000
-->ops / sec = 5985515.053570
=== Original C NTT test
cpu_time_used (sec)=0.107232
loops=100000
-->ops / sec = 932557.445539
-->Optimized improvement over original = 5.418388
-->Optimized speed over original faster = 6.418388
=== Optimized Assembly Inverse NTT test
cpu_time_used (sec)=0.031500
loops=100000
-->ops / sec = 3174603.174603
=== Original C Inverse NTT test
cpu_time_used (sec)=0.138457
loops=100000
-->ops / sec = 722245.895838
-->Optimized improvement over original = 3.395460
-->Optimized speed over original faster = 4.395460
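For readers checking the figures: the two summary lines above appear to be derived from the raw timings as ratio - 1 ("improvement") and ratio ("speed"), where ratio is optimized ops/sec divided by original ops/sec. The following is a minimal sketch of that arithmetic, not the actual perf_mlkem_test source; the function names are hypothetical, and the inputs are the rounded times printed in the log, so the last digits may differ slightly.

```python
# Sketch only: recompute the summary figures from the printed timings.
# throughput() and compare() are illustrative names, not from the patch set.

def throughput(loops, cpu_seconds):
    """Operations per second for a timed loop."""
    return loops / cpu_seconds


def compare(loops, opt_seconds, ref_seconds):
    """Return throughput for both runs plus the two derived ratios."""
    opt_ops = throughput(loops, opt_seconds)
    ref_ops = throughput(loops, ref_seconds)
    ratio = opt_ops / ref_ops          # "speed over original" (N times as fast)
    return {
        "opt_ops": opt_ops,
        "ref_ops": ref_ops,
        "speedup": ratio,
        "improvement": ratio - 1.0,    # "improvement over original"
    }


# Timings copied from the ML-KEM log above (loops=100000 in each run).
ntt = compare(100000, 0.016707, 0.107232)
intt = compare(100000, 0.031500, 0.138457)

for name, result in (("NTT", ntt), ("Inverse NTT", intt)):
    print(f"{name}: speedup={result['speedup']:.3f} "
          f"improvement={result['improvement']:.3f}")
```

Read this way, the assembly NTT runs at roughly 6.4 times the throughput of the C version, and the inverse NTT at roughly 4.4 times.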
________________________________
From: Gcrypt-devel <gcrypt-devel-bounces at gnupg.org> on behalf of Danny Tsen via Gcrypt-devel <gcrypt-devel at gnupg.org>
Sent: Monday, March 2, 2026 9:37 AM
To: Werner Koch <wk at gnupg.org>; Danny Tsen via Gcrypt-devel <gcrypt-devel at gnupg.org>
Subject: [EXTERNAL] RE: [PATCH 0/5] dilithium-kyber: Optimized (i)NTT support for
Hi Werner,
For some reason, I couldn't display your message at first, but I can display it now. I don't have a good performance comparison format for ML-KEM yet, but here are the raw performance numbers for MLDSA.
Thanks.
-Danny
[15:47] danny at ltcden12-lp1 mldsa-ntt_tests % ./perf_mldsa_ntt_opt
=== Optimized assembly NTT test
cpu_time_used (sec)=0.046582
loops=100000
-->ops / sec = 2146751.964278
=== Original C NTT test
cpu_time_used (sec)=0.229215
loops=100000
-->ops / sec = 436271.622712
-->Optimized improvement over original = 3.920678
-->Optimized speed over original faster = 4.920678
=== Optimized Assembly Inverse NTT test
cpu_time_used (sec)=0.052021
loops=100000
-->ops / sec = 1922300.609369
=== Original C Inverse NTT test
cpu_time_used (sec)=0.270790
loops=100000
-->ops / sec = 369289.855608
-->Optimized improvement over original = 4.205398
-->Optimized speed over original faster = 5.205398
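The log format above (cpu_time_used, loops, ops / sec) suggests a simple CPU-time loop harness. The sketch below shows that general pattern only. It is not the perf_mldsa_ntt_opt program: bench() and toy_workload() are hypothetical names, and the workload is a stand-in of 256 modular multiplications modulo the ML-DSA prime q = 8380417, not a real NTT.

```python
import time

Q = 8380417  # ML-DSA modulus; used here only to give the stand-in some work


def toy_workload():
    """Stand-in workload (assumption): 256 modular multiplications, not an NTT."""
    acc = 1
    for i in range(1, 257):
        acc = (acc * i) % Q
    return acc


def bench(func, loops=100000):
    """Time `loops` calls of func in CPU seconds and derive ops/sec."""
    start = time.process_time()
    for _ in range(loops):
        func()
    cpu_time_used = time.process_time() - start
    # Guard against a zero reading on clocks with coarse resolution.
    ops_per_sec = loops / cpu_time_used if cpu_time_used > 0 else float("inf")
    return cpu_time_used, ops_per_sec


# A small loop count keeps this demonstration quick.
loops = 1000
cpu_time_used, ops_per_sec = bench(toy_workload, loops)
print(f"cpu_time_used (sec)={cpu_time_used:.6f}")
print(f"loops={loops}")
print(f"-->ops / sec = {ops_per_sec:.6f}")
```

Running the same harness over an optimized and a reference routine, then dividing the two ops/sec values, gives the kind of rough speedup figure reported above.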
________________________________
From: Werner Koch
Sent: Thursday, February 26, 2026 9:47 PM
To: Danny Tsen via Gcrypt-devel
Cc: Danny Tsen
Subject: [EXTERNAL] Re: [PATCH 0/5] dilithium-kyber: Optimized (i)NTT support for
On Thu, 26 Feb 2026 10:23, Danny Tsen said:
> I don't have benchmark for libgcrypt. I do have my own testing
> performance number on NTT operation. That probably not what you are
I just noticed that we do have support for MLKEM and MLDSA in our
./bench-slope. We should change that to make it easier to run
benchmarks.
I was actually looking only for a rough figure on how much performance
you gain with your patches.
Salam-Shalom,
Werner
--
The pioneers of a warless world are the youth that
refuse military service. - A. Einstein