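I am trying to write a thread-safe sorted singly linked list. I wrote two versions: one with coarse-grained synchronization and one with fine-grained synchronization. Here are the two implementations:
Fine grained: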
public void add(T t) {
  Node curr = head;
  curr.lock.lock();
  while (curr.next != null) {
    // Invariant: curr is locked
    // Invariant: curr.data < t
    curr.next.lock.lock();
    if (t.compareTo(curr.next.data) <= 0) {
      break;
    }
    Node tmp = curr.next;
    curr.lock.unlock();
    curr = tmp;
  }
  // curr is acquired
  curr.next = new Node(curr.next, t);
  if (curr.next.next != null) { // old curr's next is acquired
    curr.next.next.lock.unlock();
  }
  curr.lock.unlock();
}
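The method above relies on a Node type and a head field that the post does not show. The sketch below fills them in under stated assumptions: a sentinel head with no data, one ReentrantLock per node, and a Node(next, data) constructor. The class name FineGrainedList and the toList() helper are hypothetical and exist only so that the locking scheme can be compiled and checked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical wrapper around the fine-grained add(); not taken from the post.
class FineGrainedList<T extends Comparable<T>> {
    // Assumed node layout: one lock per node, constructor takes (next, data).
    private final class Node {
        final ReentrantLock lock = new ReentrantLock();
        final T data;
        Node next;
        Node(Node next, T data) { this.next = next; this.data = data; }
    }

    // Assumed sentinel: head carries no data, so add() never replaces head.
    private final Node head = new Node(null, null);

    public void add(T t) {
        Node curr = head;
        curr.lock.lock();
        while (curr.next != null) {
            curr.next.lock.lock();          // hold curr and curr.next together
            if (t.compareTo(curr.next.data) <= 0) {
                break;                      // insert between curr and curr.next
            }
            Node tmp = curr.next;
            curr.lock.unlock();             // hand over: release the older lock
            curr = tmp;
        }
        Node succ = curr.next;              // still locked if non-null
        curr.next = new Node(succ, t);
        if (succ != null) {
            succ.lock.unlock();
        }
        curr.lock.unlock();
    }

    // Unsynchronized snapshot; only safe once all writers have finished.
    public List<T> toList() {
        List<T> out = new ArrayList<>();
        for (Node n = head.next; n != null; n = n.next) {
            out.add(n.data);
        }
        return out;
    }

    public static void main(String[] args) throws InterruptedException {
        FineGrainedList<Integer> list = new FineGrainedList<>();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                for (int k = 0; k < 1000; k++) {
                    list.add(k * 4 + id);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        System.out.println("size = " + list.toList().size());
    }
}
```

Every step of the traversal acquires and releases a lock, so each insert performs a number of lock operations proportional to its position in the list, whereas the coarse-grained version performs exactly one pair per insert.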
Coarse grained:
public void add(T t) {
  lock.lock();
  Node curr = head;
  while (curr.next != null) {
    if (t.compareTo(curr.next.data) <= 0) {
      break;
    }
    curr = curr.next;
  }
  curr.next = new Node(curr.next, t);
  lock.unlock();
}
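I timed both versions with 4 threads (on 4 logical CPU cores) inserting 20000 integers. The time shown for each thread is CPU time (i.e. it does not include waiting time).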
Fine grained:
Worked 1 spent 1080 ms
Worked 2 spent 1230 ms
Worked 0 spent 1250 ms
Worked 3 spent 1260 ms
wall time: 1620 ms
Coarse grained:
Worked 1 spent 190 ms
Worked 2 spent 270 ms
Worked 3 spent 410 ms
Worked 0 spent 280 ms
wall time: 1298 ms
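For anyone wanting to reproduce numbers of this kind, here is a rough sketch of how per-thread CPU time and overall wall time could be collected. It is not the harness used for the figures above. It assumes ThreadMXBean.getCurrentThreadCpuTime() as the CPU-time source, and the names InsertTimer, Result and run are hypothetical. A ConcurrentSkipListSet stands in for the list so that the example is self-contained.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.function.IntConsumer;

// Hypothetical timing harness; names and structure are illustrative only.
class InsertTimer {
    // CPU time per worker and wall time for the whole run, in milliseconds.
    static final class Result {
        final long[] cpuMillis;
        final long wallMillis;
        Result(long[] cpuMillis, long wallMillis) {
            this.cpuMillis = cpuMillis;
            this.wallMillis = wallMillis;
        }
    }

    static Result run(int threads, int insertsPerThread, IntConsumer add)
            throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] cpuMillis = new long[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final int id = i;
            workers[i] = new Thread(() -> {
                // Nanoseconds of CPU used by this thread; -1 if measurement is disabled.
                long start = mx.getCurrentThreadCpuTime();
                for (int k = 0; k < insertsPerThread; k++) {
                    add.accept(k * threads + id);   // distinct value per (thread, k)
                }
                cpuMillis[id] = (mx.getCurrentThreadCpuTime() - start) / 1_000_000;
            });
        }
        long wallStart = System.nanoTime();
        for (Thread w : workers) {
            w.start();
        }
        for (Thread w : workers) {
            w.join();                               // join makes cpuMillis writes visible
        }
        long wallMillis = (System.nanoTime() - wallStart) / 1_000_000;
        return new Result(cpuMillis, wallMillis);
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in container; swap in the add method of the list under test.
        ConcurrentSkipListSet<Integer> standIn = new ConcurrentSkipListSet<>();
        Result r = run(4, 5000, standIn::add);      // 5000 per thread is an assumption
        for (int i = 0; i < r.cpuMillis.length; i++) {
            System.out.println("Worker " + i + " spent " + r.cpuMillis[i] + " ms");
        }
        System.out.println("wall time: " + r.wallMillis + " ms");
    }
}
```

Measured this way, CPU time excludes time a thread spends parked while waiting for a lock, which is why the per-thread figures can be far below the wall time.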
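My first thought was that .lock() and .unlock() were the problem, but I profiled the implementation and together they account for only 30% of the time. My second guess is that the fine-grained solution has more cache misses, but I doubt it, because a singly linked list, unlike an array, is inherently prone to cache misses anyway.
Any idea why I am not getting the expected parallelization?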