For example, QAT or CPU multi-buffer.
1
Juszoe 2022-07-07 14:22:27 +08:00
From what I see in the source under the crypto package, the functions are generally implemented in assembly and then called from Go.
|
2
0o0O0o0O0o 2022-07-07 14:25:47 +08:00
|
3
onetown 2022-07-07 14:30:32 +08:00
Just call the C functions directly with cgo.
|
4
lysS 2022-07-08 09:26:16 +08:00
Yes, it uses assembly instructions.
You can run the benchmarks in crypto/cipher/benchmark_test.go and take a look. |
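If you would rather not run the whole standard-library suite, a minimal sketch in the same spirit could look like the following. It uses `testing.Benchmark` to time AES-GCM `Seal` on a fixed-size buffer. The helper name `sealBenchmark`, the all-zero key, and the reused nonce are assumptions made purely for illustration (reusing a nonce is only tolerable here because this is a throwaway benchmark, never in real encryption).

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"fmt"
	"testing"
)

// sealBenchmark (hypothetical helper) times AES-GCM Seal for a key of
// keyLen bytes (16, 24 or 32) over a plaintext of size bytes.
func sealBenchmark(keyLen, size int) (testing.BenchmarkResult, error) {
	block, err := aes.NewCipher(make([]byte, keyLen)) // all-zero key: benchmark only
	if err != nil {
		return testing.BenchmarkResult{}, err
	}
	aead, err := cipher.NewGCM(block)
	if err != nil {
		return testing.BenchmarkResult{}, err
	}
	nonce := make([]byte, aead.NonceSize()) // fixed nonce: benchmark only
	plaintext := make([]byte, size)
	out := make([]byte, 0, size+aead.Overhead())

	res := testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(size)) // lets the result report MB/s
		for i := 0; i < b.N; i++ {
			out = aead.Seal(out[:0], nonce, plaintext, nil)
		}
	})
	return res, nil
}

func main() {
	for _, size := range []int{64, 1350, 8192} {
		res, err := sealBenchmark(16, size)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("Seal-128-%d\t%s\n", size, res)
	}
}
```

The numbers depend entirely on the machine and Go version, so treat them only as a rough comparison against the standard benchmark output.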
5
dzdh OP @lysS
```
cpu: Intel(R) Core(TM) i5-9400F CPU @ 2.90GHz
BenchmarkAESGCM/Open-128-64-6      13946174    85.62 ns/op    747.52 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Seal-128-64-6      11389120    104.6 ns/op    611.97 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Open-256-64-6      12128635    98.54 ns/op    649.48 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Seal-256-64-6      10137523    116.3 ns/op    550.22 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Open-128-1350-6     3813538    312.5 ns/op   4320.03 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Seal-128-1350-6     3184155    375.1 ns/op   3599.46 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Open-256-1350-6     2947183    409.8 ns/op   3294.57 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Seal-256-1350-6     2575678    463.2 ns/op   2914.70 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Open-128-8192-6      857240     1379 ns/op   5940.85 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Seal-128-8192-6      705914     1695 ns/op   4833.48 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Open-256-8192-6      630681     1906 ns/op   4298.59 MB/s    0 B/op    0 allocs/op
BenchmarkAESGCM/Seal-256-8192-6      546186     2160 ns/op   3793.24 MB/s    0 B/op    0 allocs/op
BenchmarkAESCFBEncrypt1K-6           798397     1497 ns/op    680.68 MB/s
BenchmarkAESCFBDecrypt1K-6           921312     1260 ns/op    808.48 MB/s
BenchmarkAESCFBDecrypt8K-6           118597     9964 ns/op    821.68 MB/s
BenchmarkAESOFB1K-6                 1331312    894.8 ns/op   1138.78 MB/s
BenchmarkAESCTR1K-6                 1000000     1130 ns/op    901.82 MB/s
BenchmarkAESCTR8K-6                  136078     8650 ns/op    946.43 MB/s
BenchmarkAESCBCEncrypt1K-6          1232721    970.3 ns/op   1055.38 MB/s
BenchmarkAESCBCDecrypt1K-6          1311703    911.8 ns/op   1123.09 MB/s
PASS
```
How should I read this? |
6
ihciah 2022-07-08 10:17:50 +08:00 via iPhone
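It probably isn't supported by default; otherwise the binary would blow up as soon as you ran it on an AMD machine. I think this is something to ask Intel about, and to do with their library.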
|
7
lysS 2022-07-08 11:00:52 +08:00
BenchmarkAESGCM/Open-128-1350-6 3813538 312.5 ns/op 4320.03 MB/s 0 B/op 0 allocs/op
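Taking the line above as an example: this is decryption (Open) in AES-GCM mode with a 128-bit key and a data block size of 1350 bytes. The benchmark ran 3813538 iterations in total, averaging about 312.5 ns each, which gives a decryption throughput of about 4.3 GB/s. Each operation allocated 0 bytes of memory in 0 allocations. |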
9
lysS 2022-07-08 12:29:12 +08:00
There definitely is; it would be far too weak otherwise.
|
10
blless 2022-07-08 18:34:03 +08:00
@ihciah Go compiles directly for the target CPU architecture, so a given architecture will certainly include some key instruction sets. I'm not sure about special acceleration instruction sets, but even when the accelerated code can't be used directly, there is generally a pure-algorithm fallback implementation.
Go 1.18 introduced finer-grained architecture levels, so you can specify the level to target: https://github.com/golang/go/wiki/MinimumRequirements#amd64
Go 1.18 introduced 4 architectural levels for AMD64. Each level differs in the set of x86 instructions that the compiler can include in the generated binaries:
GOAMD64=v1 (default): The baseline. Exclusively generates instructions that all 64-bit x86 processors can execute.
GOAMD64=v2: all v1 instructions, plus CMPXCHG16B, LAHF, SAHF, POPCNT, SSE3, SSE4.1, SSE4.2, SSSE3.
GOAMD64=v3: all v2 instructions, plus AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, OSXSAVE.
GOAMD64=v4: all v3 instructions, plus AVX512F, AVX512BW, AVX512CD, AVX512DQ, AVX512VL. |