The general advice on the number of threads in a thread pool for CPU-intensive work is to have at most one thread per logical CPU. On the JVM it is common practice to use Runtime.getRuntime().availableProcessors(). This is also used in Scala execution contexts and task support to determine the default level of concurrency, see:
https://www.scala-lang.org/api/2.12.2/scala/collection/parallel/index.html#availableProcessors:Int
Scala parallel Tasks:
private[parallel] final class FutureTasks(executor: ExecutionContext) extends Tasks {
  //....
  def parallelismLevel = Runtime.getRuntime.availableProcessors
}
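For illustration, the conventional sizing I am asking about could be sketched roughly as follows. The names CpuBoundPool and poolSize are made up for this example and are not part of any library; the only platform query used is availableProcessors, which, as far as I can tell, reports a single count and does not distinguish between core types.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object CpuBoundPool {
  // Logical CPUs visible to the JVM, reported as one number
  // with no distinction between kinds of cores.
  def logicalCpus: Int = Runtime.getRuntime.availableProcessors

  // Hypothetical helper: at most one thread per logical CPU, never fewer than one.
  def poolSize(requested: Int, cpus: Int = logicalCpus): Int =
    math.max(1, math.min(requested, cpus))

  def main(args: Array[String]): Unit = {
    val threads = poolSize(Int.MaxValue)
    val executor = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(executor)
    try {
      // A few CPU-bound tasks, each summing a scaled range.
      val parts = (1 to 8).map { i =>
        Future((1L to 1000000L).foldLeft(0L)((acc, x) => acc + x * i))
      }
      val results = Await.result(Future.sequence(parts), 1.minute)
      println(s"threads=$threads total=${results.sum}")
    } finally executor.shutdown()
  }
}
```

With this sizing, a pool on a hybrid CPU gets one thread for every logical CPU regardless of core type, which is exactly the behaviour I am unsure about.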
Recently, some CPUs introduced the concept of efficient and performance cores. Does this change the advice in any way? Is it considered beneficial for performance to utilize all cores, including the efficient ones, when maximizing throughput of CPU-intensive tasks, or should the thread pool be limited to performance cores only? Are there any APIs which would allow the application to query heterogeneous CPU configurations like this?
I am mostly interested in the JVM, but native APIs for Windows, Linux or macOS would also be interesting. I am not interested in thread control as discussed in How to detect P/E-Core in Intel Alder Lake CPU?, only in general system capability information similar to availableProcessors, but including some details about non-uniform architectures.