  • Avoid blocking AntiEntropyStage when submitting validation requests · 04533e6c
    Sam Tunnicliffe authored
    Patch by Sam Tunnicliffe; reviewed by Benjamin Lerer for CASSANDRA-15812
    
    Switches ValidationExecutor's work queue to a LinkedBlockingQueue to
    avoid blocking AntiEntropyStage when the executor is saturated. This
    requires ValidationExecutor's corePoolSize to be set to
    concurrent_validations, because the executor will now always prefer to
    queue requests rather than start new threads.
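
The queue-over-threads preference described above is standard java.util.concurrent.ThreadPoolExecutor behaviour and can be observed in isolation. The following is a minimal sketch, not code from this patch: the class name QueuePreferenceDemo, the method observe, and the pool sizes are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative only: shows that an executor backed by an unbounded
// LinkedBlockingQueue never grows beyond its core pool size.
public class QueuePreferenceDemo {

    /**
     * Submits `tasks` blocking tasks and returns {poolSize, queuedTasks}
     * as observed while every task is still pending.
     */
    public static int[] observe(int corePoolSize, int maxPoolSize, int tasks)
            throws InterruptedException {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                corePoolSize, maxPoolSize, 60, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    try {
                        release.await(); // hold the worker until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return new int[] { executor.getPoolSize(), executor.getQueue().size() };
        } finally {
            release.countDown();   // let all tasks finish
            executor.shutdown();
            executor.awaitTermination(10, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Core size 1: the offer to the unbounded queue always succeeds,
        // so maxPoolSize is never consulted.
        int[] small = observe(1, 8, 6);
        System.out.println("core=1: threads=" + small[0] + " queued=" + small[1]);

        // Core size 4: four threads start, the rest are queued.
        int[] large = observe(4, 8, 6);
        System.out.println("core=4: threads=" + large[0] + " queued=" + large[1]);
    }
}
```

With six tasks, the first call reports one thread and five queued tasks, and the second reports four threads and two queued tasks. The maximum pool size has no effect in either case, which is why the core pool size has to carry the intended concurrency.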
    
    This commit also adds a hard limit on concurrent_validations, as allowing
    an unbounded number of validations to run concurrently is never safe.
    This was always true, but setting a high value here is now more
    dangerous as it controls the number of core, not max, threads.
    This hard limit is linked to concurrent_compactors, so operators may
    set concurrent_validations between 1 and concurrent_compactors.
    The meaning of setting it < 1 has changed from "unbounded" to
    "whatever concurrent_compactors is set to".
    
    This safety valve can be overridden with a system property at startup
    and/or a JMX property.
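
The limit rules above can be summarised as a small function. This is a sketch under stated assumptions, not the code in this patch: the class ValidationLimit, the method effectiveConcurrentValidations, and the allowUnlimited flag (standing in for the system property or JMX override) are hypothetical names, and rejecting an out-of-range value with an exception is an assumption about how the hard limit is enforced.

```java
// Illustrative only: models the concurrent_validations rules described in
// the commit message. All names here are hypothetical.
public class ValidationLimit {

    /**
     * @param requested            the configured concurrent_validations value
     * @param concurrentCompactors the configured concurrent_compactors value
     * @param allowUnlimited       stand-in for the startup/JMX override
     * @return the number of validations allowed to run concurrently
     */
    public static int effectiveConcurrentValidations(int requested,
                                                     int concurrentCompactors,
                                                     boolean allowUnlimited) {
        // A value below 1 now means "use concurrent_compactors",
        // where it previously meant "unbounded".
        if (requested < 1) {
            return concurrentCompactors;
        }
        // Assumption: exceeding the limit without the override is rejected.
        if (requested > concurrentCompactors && !allowUnlimited) {
            throw new IllegalArgumentException(
                    "concurrent_validations (" + requested
                    + ") must not exceed concurrent_compactors ("
                    + concurrentCompactors + ") unless the override is set");
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(effectiveConcurrentValidations(0, 4, false));  // 4
        System.out.println(effectiveConcurrentValidations(2, 4, false));  // 2
        System.out.println(effectiveConcurrentValidations(16, 4, true));  // 16
    }
}
```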
    
    CASSANDRA-9292 removed the 1 hour timeout on prepare messages, but this
    was inadvertently undone when CASSANDRA-13397 was committed. As nothing
    long-running is done in the prepare phase anymore, this timeout can
    safely be reduced.
    
    If using RepairCommandPoolFullStrategy.queue, the core pool size
    for repairCommandExecutor must be increased from the default
    value of 1; otherwise all concurrent tasks will be queued and no
    additional threads will be created.