Underscore prefix detection fix
Gregor Riepl
seto-kun at freesurf.ch
Sun Jul 29 01:28:52 CEST 2007
> The other oddity is, when I build with i586 assembly, the checks
> run _slower_ than in i386 mode.
> I get 1min 19sec vs. 2min 14sec on a MacBook CoreDuo 1.83GHz with
> 1GB RAM.
> Even when using aggressive optimisation (CFLAGS="-arch i586
> -march=yonah -O3 -ffast-math -mfpmath=sse -msse -msse2"), I still
> only get 1min 47secs. For i386, I didn't use any special compiler
> flags.
>
> What are my Mac and I messing up here?
I think I've found the problem.
In mpi/config.links, there's a rule for i586-* that defines the macro
ELF_SYNTAX in asm-syntax.h. This in turn causes the assembler to see
the line
    .align (1<<3)
in front of the Loop: label in mpih-sub1-asm.S and mpih-add1-asm.S.
The Apple assembler, at least, interprets this as "align the next
instruction on a 2^(1<<3) byte boundary", which is BSD syntax: the
operand is an exponent rather than a byte count.
I'm not quite sure, but I think I read somewhere that recent gas
versions also use this 2^(align size) syntax. In any case, the
resulting 1<<(1<<3) = 0x100 = 256 byte alignment produces 200+ nops,
which slow the routine down considerably.
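The arithmetic is easy to check in a few lines of shell. This is only a sketch: the function names align_as_bytes and align_as_power are made up for illustration and are not part of libgcrypt or of any assembler.

```shell
#!/bin/sh
# Hypothetical helpers (not part of libgcrypt): compute the alignment
# boundary that ".align N" yields under the two readings of the operand.

# Byte-count reading: the operand is the boundary in bytes.
align_as_bytes() {
    echo $(( $1 ))
}

# Power-of-two (BSD-style) reading: the boundary is 2^N bytes.
align_as_power() {
    echo $(( 1 << ($1) ))
}

operand=$(( 1 << 3 ))   # the operand written in the .S files
echo "byte-count reading: $(align_as_bytes "$operand") bytes"
echo "power-of-two reading: $(align_as_power "$operand") bytes"
```

Run as-is, this prints 8 bytes for the byte-count reading and 256 bytes for the power-of-two reading, so the padding in front of the loop can be as large as 255 bytes.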
I fixed this by adding the darwin triplets to the djgpp triplets in
config.links:
    i[3467]86*-msdosdjgpp* | \
    i[34]86*-apple-darwin*)
        echo '#define BSD_SYNTAX' >>./mpi/asm-syntax.h
        cat $srcdir/mpi/i386/syntax.h >>./mpi/asm-syntax.h
        path="i386"
        ;;
    i586*-msdosdjgpp* | \
    i[567]86*-apple-darwin*)
        echo '#define BSD_SYNTAX' >>./mpi/asm-syntax.h
        cat $srcdir/mpi/i386/syntax.h >>./mpi/asm-syntax.h
        path="i586 i386"
        ;;
This takes out the nops, but the i586 build is still slower.
With the aggressive optimisation flags mentioned earlier, the benchmark
runs in 49 secs with i386 assembly and in 68 secs with i586 assembly.
Disabling assembly altogether yields 65 secs, by the way.
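For anyone who wants to check which branch of the patched rule a given host triplet ends up in, here is a minimal sketch. The function name mpi_asm_path, the fallback branch, and the example triplets are assumptions for illustration. Only the two case patterns are taken from the config.links change quoted above, and the real file has further branches that are not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper: mirrors only the two patched branches and prints
# the assembly search path that a host triplet would select.
mpi_asm_path() {
    case "$1" in
        i[3467]86*-msdosdjgpp* | \
        i[34]86*-apple-darwin*)
            echo "i386"
            ;;
        i586*-msdosdjgpp* | \
        i[567]86*-apple-darwin*)
            echo "i586 i386"
            ;;
        *)
            # Not in the original rule; added so the sketch always answers.
            echo "unhandled"
            ;;
    esac
}

# Example triplets, chosen for illustration only.
for host in i386-apple-darwin8.10.1 i686-apple-darwin8.10.1 i586-pc-msdosdjgpp; do
    echo "$host -> $(mpi_asm_path "$host")"
done
```

With these patterns, an i386 darwin triplet selects "i386", while i586, i686 and i786 darwin triplets select "i586 i386". Since the case statement takes the first matching branch, the order of the two branches matters for the djgpp triplets: i686-*-msdosdjgpp is caught by the first one.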
I think I'll give up on this for now. It's fast enough, and I'm happy
that gcrypt builds with a little bit of speed improvement on OSX. :)
Thanks for all your work,
Gregor