作者:Devon_Mars | 来源:互联网 | 2022-10-15 17:54
我正在比较关于sync.Mutex和Go频道的性能。这是我的基准:
// go playground: https://play.golang.org/p/f_u9jHBq_Jc
const (
start = 300 // actual = start * goprocs
end = 600 // actual = end * goprocs
step = 10
)
var goprocs = runtime.GOMAXPROCS(0) // 8
// https://perf.golang.org/search?q=upload:20190819.3
func BenchmarkChanWrite(b *testing.B) {
var v int64
ch := make(chan int, 1)
ch <- 1
for i := start; i
性能比较可视化如下:
是什么原因
当goroutine的数量超过大约3400时,sync.Mutex的性能下降很大吗?
Go频道非常稳定,但比同步慢。
Benchstat的原始工作台数据(进行测试-bench =。-count = 5)go version go1.12.4 linux/amd64
:
MutexWrite/goroutines-2400-8 48.6ns ± 1%
MutexWrite/goroutines-2480-8 49.1ns ± 0%
MutexWrite/goroutines-2560-8 49.7ns ± 1%
MutexWrite/goroutines-2640-8 50.5ns ± 3%
MutexWrite/goroutines-2720-8 50.9ns ± 2%
MutexWrite/goroutines-2800-8 51.8ns ± 3%
MutexWrite/goroutines-2880-8 52.5ns ± 2%
MutexWrite/goroutines-2960-8 54.1ns ± 4%
MutexWrite/goroutines-3040-8 54.5ns ± 2%
MutexWrite/goroutines-3120-8 56.1ns ± 3%
MutexWrite/goroutines-3200-8 63.2ns ± 5%
MutexWrite/goroutines-3280-8 77.5ns ± 6%
MutexWrite/goroutines-3360-8 141ns ± 6%
MutexWrite/goroutines-3440-8 239ns ± 8%
MutexWrite/goroutines-3520-8 248ns ± 3%
MutexWrite/goroutines-3600-8 254ns ± 2%
MutexWrite/goroutines-3680-8 256ns ± 1%
MutexWrite/goroutines-3760-8 261ns ± 2%
MutexWrite/goroutines-3840-8 266ns ± 3%
MutexWrite/goroutines-3920-8 276ns ± 3%
MutexWrite/goroutines-4000-8 278ns ± 3%
MutexWrite/goroutines-4080-8 286ns ± 5%
MutexWrite/goroutines-4160-8 293ns ± 4%
MutexWrite/goroutines-4240-8 295ns ± 2%
MutexWrite/goroutines-4320-8 280ns ± 8%
MutexWrite/goroutines-4400-8 294ns ± 9%
MutexWrite/goroutines-4480-8 285ns ±10%
MutexWrite/goroutines-4560-8 290ns ± 8%
MutexWrite/goroutines-4640-8 271ns ± 3%
MutexWrite/goroutines-4720-8 271ns ± 4%
ChanWrite/goroutines-2400-8 158ns ± 3%
ChanWrite/goroutines-2480-8 159ns ± 2%
ChanWrite/goroutines-2560-8 161ns ± 2%
ChanWrite/goroutines-2640-8 161ns ± 1%
ChanWrite/goroutines-2720-8 163ns ± 1%
ChanWrite/goroutines-2800-8 166ns ± 3%
ChanWrite/goroutines-2880-8 168ns ± 1%
ChanWrite/goroutines-2960-8 176ns ± 4%
ChanWrite/goroutines-3040-8 176ns ± 2%
ChanWrite/goroutines-3120-8 180ns ± 1%
ChanWrite/goroutines-3200-8 180ns ± 1%
ChanWrite/goroutines-3280-8 181ns ± 2%
ChanWrite/goroutines-3360-8 183ns ± 2%
ChanWrite/goroutines-3440-8 188ns ± 3%
ChanWrite/goroutines-3520-8 190ns ± 2%
ChanWrite/goroutines-3600-8 193ns ± 2%
ChanWrite/goroutines-3680-8 196ns ± 3%
ChanWrite/goroutines-3760-8 199ns ± 2%
ChanWrite/goroutines-3840-8 206ns ± 2%
ChanWrite/goroutines-3920-8 209ns ± 2%
ChanWrite/goroutines-4000-8 206ns ± 2%
ChanWrite/goroutines-4080-8 209ns ± 2%
ChanWrite/goroutines-4160-8 208ns ± 2%
ChanWrite/goroutines-4240-8 209ns ± 3%
ChanWrite/goroutines-4320-8 213ns ± 2%
ChanWrite/goroutines-4400-8 209ns ± 2%
ChanWrite/goroutines-4480-8 211ns ± 1%
ChanWrite/goroutines-4560-8 213ns ± 2%
ChanWrite/goroutines-4640-8 215ns ± 1%
ChanWrite/goroutines-4720-8 218ns ± 3%
转到1.12.4。硬件:
CPU: Quad core Intel Core i7-7700 (-MT-MCP-) cache: 8192 KB
clock speeds: max: 4200 MHz 1: 1109 MHz 2: 3641 MHz 3: 3472 MHz 4: 3514 MHz 5: 3873 MHz 6: 3537 MHz
7: 3410 MHz 8: 3016 MHz
CPU Flags: 3dnowprefetch abm acpi adx aes aperfmperf apic arat arch_perfmon art avx avx2 bmi1 bmi2
bts clflush clflushopt cmov constant_tsc cpuid cpuid_fault cx16 cx8 de ds_cpl dtes64 dtherm dts epb
ept erms est f16c flexpriority flush_l1d fma fpu fsgsbase fxsr hle ht hwp hwp_act_window hwp_epp
hwp_notify ibpb ibrs ida intel_pt invpcid invpcid_single lahf_lm lm mca mce md_clear mmx monitor
movbe mpx msr mtrr nonstop_tsc nopl nx pae pat pbe pcid pclmulqdq pdcm pdpe1gb pebs pge pln pni
popcnt pse pse36 pti pts rdrand rdseed rdtscp rep_good rtm sdbg sep smap smep smx ss ssbd sse sse2
sse4_1 sse4_2 ssse3 stibp syscall tm tm2 tpr_shadow tsc tsc_adjust tsc_deadline_timer tsc_known_freq
vme vmx vnmi vpid x2apic xgetbv1 xsave xsavec xsaveopt xsaves xtopology xtpr
更新:我在不同的硬件上进行了测试。看来问题仍然存在:
板凳:https : //play.golang.org/p/HnQ44--E4UQ
更新:
我的完整基准测试从8个goroutine到15000个goroutine,包括chan / sync.Mutex / atomic的比较: