-
Sam Tunnicliffe authored
Patch by Sam Tunnicliffe; reviewed by Benjamin Lerer for CASSANDRA-15812 Switches ValidationExecutor's work queue to LinkedBlockingQueue to avoid blocking AntiEntropyStage when the executor is saturated. This requires VE.corePoolSize to be set to concurrent_validations as now it will always prefer to queue requests rather than start new threads. This commit also adds a hard limit on concurrent_validations, as allowing an unbounded number of validations to run concurrently is never safe. This was always true, but setting a high value here is now more dangerous as it controls the number of core, not max, threads. This hard limit is linked to concurrent_compactors, so operators may set concurrent_validations between 1 and concurrent_compactors. The meaning of setting it < 1 has changed from "unbounded" to "whatever concurrent_compactors is set to". This safety valve can be overridden with a system property at startup and/or a JMX property. CASSANDRA-9292 removed the 1hr timeout on prepare messages, but this was inadvertently undone when CASSANDRA-13397 was committed. As nothing long running is done in the repair phase anymore, this timeout can safely be reduced. If using RepairCommandPoolFullStrategy.queue, the core pool size for repairCommandExecutor must be increased from the default value of 1 or else all concurrent tasks will be queued and no more threads created.
04533e6c