
The general advice on the number of threads in a thread pool for CPU-intensive work is to have at most one thread per logical CPU. On the JVM it is common practice to use Runtime.getRuntime().availableProcessors(); this is also used in Scala execution contexts and task support to determine the default level of concurrency, see:

private[parallel] final class FutureTasks(executor: ExecutionContext) extends Tasks {
  //....
  def parallelismLevel = Runtime.getRuntime.availableProcessors
}

Recently some CPUs introduced a concept of efficient and performance cores. Does this change the advice in any way? Is it considered beneficial for performance to utilize all cores, including the efficient ones, when maximizing throughput of CPU-intensive tasks, or should the thread pool be limited to performance cores only? Are there any APIs which would allow the application to query heterogeneous CPU configurations like this?

I am mostly interested in the JVM, but native APIs for Windows, Linux or macOS could also be interesting. I am not interested in thread control like that discussed in How to detect P/E-Core in Intel Alder Lake CPU?, only in general system capability information similar to availableProcessors, but including some details about non-uniform architectures.

1 Answer


You have asked lots of questions, but you started with this:

The general advice on the number of threads in a thread pool for CPU-intensive work is to have at most one thread per logical CPU. On the JVM it is common practice to use Runtime.getRuntime().availableProcessors() ...

  1. The "general advice" you talk about is just a rule of thumb. Better advice is to tune the number of threads to optimize ... whatever it is you are trying to optimize. Note that the number of threads in the pool(s) will be just one of many things that could be tuned.

  2. I'm not convinced that it is "common practice" to use code like that. I suspect it is more common to specify the number of threads in a config file or (as with the common thread pool) to let the JVM decide, however it does so.
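To illustrate the "config first, CPU count as a fallback" approach from point 2, here is a minimal sketch. The class name PoolSizing, the methods resolveThreadCount and newWorkerPool, and the property name app.worker.threads are all made up for this example; only Runtime.availableProcessors(), System.getProperty and Executors.newFixedThreadPool are real JDK calls.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PoolSizing {
    // Hypothetical property name, chosen for this example only.
    static final String THREADS_PROPERTY = "app.worker.threads";

    /**
     * Returns the configured thread count if it parses to a positive integer,
     * otherwise falls back to the logical CPU count (at least 1).
     */
    static int resolveThreadCount(String configured, int availableProcessors) {
        if (configured != null) {
            try {
                int n = Integer.parseInt(configured.trim());
                if (n > 0) {
                    return n;
                }
            } catch (NumberFormatException ignored) {
                // Not a number: fall through to the default below.
            }
        }
        return Math.max(1, availableProcessors);
    }

    /** Builds a fixed pool sized from the property, or from the CPU count if unset. */
    static ExecutorService newWorkerPool() {
        int threads = resolveThreadCount(
                System.getProperty(THREADS_PROPERTY),
                Runtime.getRuntime().availableProcessors());
        return Executors.newFixedThreadPool(threads);
    }
}
```

With this sketch, starting the JVM with -Dapp.worker.threads=6 would give a six-thread pool without a code change. The common ForkJoinPool can be tuned in a similar way through the java.util.concurrent.ForkJoinPool.common.parallelism system property.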

Recently some CPUs introduced a concept of efficient and performance cores. Does this change the advice in any way?

Ummm ... you would have to consult the person or site that gave you the advice.

But in general, it wouldn't affect the advice that I would give.

As a general rule, there are many factors that can affect throughput of an application. For example, the nature of the application is just as important as the number of threads; e.g.

  • what do the Java threads do?
  • how do they interact with each other?
  • how do they interact with memory?
  • how do they interact with the OS and/or external services?

In reality, there are too many factors that are too difficult to quantify for any rule of thumb to work.

Is it considered beneficial for performance to utilize all cores, including the efficient ones, when maximizing throughput of CPU-intensive tasks, or should the thread pool be limited to performance cores only?

I don't know if anyone has tried to model this theoretically or measure this empirically for Java applications. But given the simplistic nature of the modelling that gives us "one thread per core", I doubt one could come up with either a theoretical or empirical model that would be predictive.
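Lacking a predictive model, the practical option is to measure your own workload on the target machine. The following is a rough sketch of such a probe, not a rigorous benchmark: ThreadCountProbe, spin and timeRun are hypothetical names, spin is a stand-in for a representative slice of the real work, and a harness such as JMH would give more trustworthy numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadCountProbe {
    /** Stand-in CPU-bound task (assumption: replace with real work). */
    static long spin(int iterations) {
        long acc = 1;
        for (int i = 0; i < iterations; i++) {
            acc = acc * 6364136223846793005L + 1442695040888963407L;
        }
        return acc;
    }

    /** Runs `tasks` copies of spin() on a pool of `threads` threads; returns elapsed nanoseconds. */
    static long timeRun(int threads, int tasks, int iterations) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> work = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                work.add(() -> spin(iterations));
            }
            long start = System.nanoTime();
            for (Future<Long> f : pool.invokeAll(work)) {
                f.get(); // rethrows any task failure
            }
            return System.nanoTime() - start;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int cpus = Runtime.getRuntime().availableProcessors();
        timeRun(cpus, 64, 1_000_000); // warm-up so spin() is likely JIT-compiled
        for (int threads = 1; threads <= cpus; threads++) {
            long nanos = timeRun(threads, 256, 1_000_000);
            System.out.printf("%d threads: %.1f ms%n", threads, nanos / 1e6);
        }
    }
}
```

The sketch deliberately submits many more tasks than threads. Idle threads pull the next task from the shared queue, so threads that happen to run on faster cores simply complete more tasks, which is one reason a fixed "P-cores only" limit is not obviously better than using every core.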

Are there any APIs which would allow the application to query about heterogeneous CPU configurations like this?

Not that I am aware of.

And I'm not sure that you would be able to make much use of the information anyway ... in a Java application.


1 Comment

Plus, it doesn't even make sense to talk about p- and e-cores in a general sense because their implementation is so different. Intel uses completely different cores with different µ-arch, different ISAs, and different features (e.g. p-cores don't support SMT), whereas AMD uses practically identical cores, just with different clock speeds, power targets, and cache sizes. All the latencies, issue widths, and features are identical between AMD's p- and e-cores. That means, for Intel's p- and e-cores, you even need different compiler settings and optimizations.
