locking/pvqspinlock: Allow limited lock stealing

This patch allows one attempt for the lock waiter to steal the lock when entering the PV slowpath. To prevent lock starvation, the pending bit will be set by the queue head vCPU when it is in the active lock spinning loop to disable any lock stealing attempt. This helps to reduce the performance penalty caused by lock waiter preemption while not having much of the downsides of a real unfair lock. The pv_wait_head() function was renamed as pv_wait_head_or_lock() as it was modified to acquire the lock before returning. This is necessary because of possible lock stealing attempts from other tasks. Linux kernel builds were run in KVM guest on an 8-socket, 4 cores/socket Westmere-EX system and a 4-socket, 8 cores/socket Haswell-EX system. Both systems are configured to have 32 physical CPUs. The kernel build times before and after the patch were: Westmere Haswell Patch 32 vCPUs 48 vCPUs 32 vCPUs 48 vCPUs ----- -------- -------- -------- -------- Before patch 3m15.6s 10m56.1s 1m44.1s 5m29.1s After patch 3m02.3s 5m00.2s 1m43.7s 3m03.5s For the overcommited case (48 vCPUs), this patch is able to reduce kernel build time by more than 54% for Westmere and 44% for Haswell. Signed-off-by: Waiman Long <Waiman.Long@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Douglas Hatch <doug.hatch@hpe.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Scott J Norton <scott.norton@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1447190336-53317-1-git-send-email-Waiman.Long@hpe.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
author: Waiman Long <Waiman.Long@hpe.com> 2015-11-10 16:18:56 -0500
committer: Ingo Molnar <mingo@kernel.org> 2015-12-04 11:39:51 +0100
commit: 1c4941fd53afb46ab15826628e4819866d008a28 (patch)
tree: eaf0f35a79725bf10476a90634245824a7ccb8c3 /kernel/locking/qspinlock_stat.h
parent: 45e898b735620f426eddf105fc886d2966593a58 (diff)
download: linux-1c4941fd53afb46ab15826628e4819866d008a28.tar.gz
linux-1c4941fd53afb46ab15826628e4819866d008a28.tar.xz
1 files changed, 16 insertions, 0 deletions
diff --git a/kernel/locking/qspinlock_stat.h b/kernel/locking/qspinlock_stat.h
index b1553adec2e7..94d4533fe984 100644
--- a/kernel/locking/qspinlock_stat.h
+++ b/kernel/locking/qspinlock_stat.h
@@ -22,6 +22,7 @@
  *   pv_kick_wake	- # of vCPU kicks used for computing pv_latency_wake
  *   pv_latency_kick	- average latency (ns) of vCPU kick operation
  *   pv_latency_wake	- average latency (ns) from vCPU kick to wakeup
+ *   pv_lock_stealing	- # of lock stealing operations
  *   pv_spurious_wakeup	- # of spurious wakeups
  *   pv_wait_again	- # of vCPU wait's that happened after a vCPU kick
  *   pv_wait_head	- # of vCPU wait's at the queue head
@@ -43,6 +44,7 @@ enum qlock_stats {
 	qstat_pv_kick_wake,
 	qstat_pv_latency_kick,
 	qstat_pv_latency_wake,
+	qstat_pv_lock_stealing,
 	qstat_pv_spurious_wakeup,
 	qstat_pv_wait_again,
 	qstat_pv_wait_head,
@@ -66,6 +68,7 @@ static const char * const qstat_names[qstat_num + 1] = {
 	[qstat_pv_spurious_wakeup] = "pv_spurious_wakeup",
 	[qstat_pv_latency_kick]	   = "pv_latency_kick",
 	[qstat_pv_latency_wake]    = "pv_latency_wake",
+	[qstat_pv_lock_stealing]   = "pv_lock_stealing",
 	[qstat_pv_wait_again]      = "pv_wait_again",
 	[qstat_pv_wait_head]       = "pv_wait_head",
 	[qstat_pv_wait_node]       = "pv_wait_node",
@@ -273,6 +276,19 @@ static inline void __pv_wait(u8 *ptr, u8 val)
 #define pv_kick(c)	__pv_kick(c)
 #define pv_wait(p, v)	__pv_wait(p, v)
 
+/*
+ * PV unfair trylock count tracking function
+ */
+static inline int qstat_spin_steal_lock(struct qspinlock *lock)
+{
+	int ret = pv_queued_spin_steal_lock(lock);
+
+	qstat_inc(qstat_pv_lock_stealing, ret);
+	return ret;
+}
+#undef  queued_spin_trylock
+#define queued_spin_trylock(l)	qstat_spin_steal_lock(l)
+
 #else /* CONFIG_QUEUED_LOCK_STAT */
 
 static inline void qstat_inc(enum qlock_stats stat, bool cond)	{ }
author	Waiman Long <Waiman.Long@hpe.com>	2015-11-10 16:18:56 -0500
committer	Ingo Molnar <mingo@kernel.org>	2015-12-04 11:39:51 +0100
commit	1c4941fd53afb46ab15826628e4819866d008a28 (patch)
tree	eaf0f35a79725bf10476a90634245824a7ccb8c3 /kernel/locking/qspinlock_stat.h
parent	45e898b735620f426eddf105fc886d2966593a58 (diff)
download	linux-1c4941fd53afb46ab15826628e4819866d008a28.tar.gz linux-1c4941fd53afb46ab15826628e4819866d008a28.tar.xz