When to Run Statistical Functions in Parallel

Why Run in Parallel?

The main reason to run statistical computations in parallel is to gain speed, meaning to reduce the execution time of your program or functions. Factors Affecting Speed discusses the main items affecting the speed of programs or functions. Factors Affecting Results discusses details that can cause a parallel run to give different results than a serial run.

Note

Some Statistics and Machine Learning Toolbox™ functions have built-in parallel computing capabilities. See Quick Start Parallel Computing for Statistics and Machine Learning Toolbox. You can also use any Statistics and Machine Learning Toolbox functions with Parallel Computing Toolbox™ functions such as parfor loops. To decide when to call functions in parallel, consider the factors affecting speed and results.
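
As a minimal sketch of the built-in route, the following requests parallel execution through the Options argument of bootstrp. The data and variable names are hypothetical and chosen only for illustration. If Parallel Computing Toolbox is not available, the call is expected to run serially.

```matlab
% Hypothetical data for illustration only.
rng(1);                         % fix the client seed so the data is repeatable
x = randn(100, 1);

% Request parallel execution through the function's Options argument.
opts = statset('UseParallel', true);

% Bootstrap the sample mean 500 times.
bootMeans = bootstrp(500, @mean, x, 'Options', opts);

fprintf('Bootstrap standard error of the mean: %.4f\n', std(bootMeans));
```

Whether this is faster than the serial call depends on the factors described in this topic; for a statistic as cheap as the mean, it often is not.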

Factors Affecting Speed

Some factors that can affect the speed of parallel processing are:

  • Parallel environment setup. It takes time to run parpool to begin computing in parallel. If your computation is fast, the setup time can exceed any time saved by computing in parallel.

  • Parallel overhead. There is overhead in communication and coordination when running in parallel. If function evaluations are fast, this overhead could be an appreciable part of the total computation time. Thus, solving a problem in parallel can be slower than solving the problem serially. For an example, see Improving Optimization Performance with Parallel Computing in MATLAB® Digest, March 2009.

  • No nested parfor loops. This is described in Working with parfor. parfor does not work in parallel when called from within another parfor loop. If you have programmed your custom functions to take advantage of parallel processing, the limitation of no nested parfor loops can cause a parallel function to run slower than expected.

  • When executing serially, parfor loops run slightly slower than for loops.

  • Passing parameters. Parameters are automatically passed to worker sessions during the execution of parallel computations. If there are many parameters, or they take a large amount of memory, passing parameters can slow the execution of your computation.

  • Contention for resources: network and computing. If the pool of workers has low bandwidth or high latency, parallel computation can be slow.
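
One way to check whether these factors outweigh the gains for a particular computation is to time it both ways. The following is a rough sketch with a deliberately cheap, hypothetical loop body; the names n, tSerial, and tParallel are illustrative only. If no pool is running, the first parfor may also include pool startup time, and without Parallel Computing Toolbox the parfor loop runs serially.

```matlab
% Sketch: time a cheap loop body serially and with parfor.
n = 200;

x = zeros(1, n);
tic
for i = 1:n
    x(i) = mean(randn(1, 100));     % fast function evaluation
end
tSerial = toc;

y = zeros(1, n);
tic
parfor i = 1:n
    y(i) = mean(randn(1, 100));     % same work, distributed to workers
end
tParallel = toc;

fprintf('Serial: %.4f s, parallel: %.4f s\n', tSerial, tParallel);
```

With a loop body this small, the parallel version can easily be the slower of the two. Running the timing a second time, after the pool exists, separates setup time from per-iteration overhead.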

Factors Affecting Results

Some factors can affect results when using parallel processing. You might need to adjust your code to run in parallel. For example, loop iterations must be independent of one another, and the workers must be able to access the variables. Some important factors are:

  • Persistent or global variables. If any functions use persistent or global variables, these variables can take different values on different worker processors. The body of a parfor loop cannot contain global or persistent variable declarations.

  • Accessing external files. The order of computations is not guaranteed during parallel processing, so external files can be accessed in unpredictable order, leading to unpredictable results. Furthermore, if multiple processors try to read an external file simultaneously, the file can become locked, leading to a read error and halting function execution.

  • Noncomputational functions, such as input, plot, and keyboard, can behave badly when used in your custom functions. Do not use these functions in a parfor loop, because a worker that is waiting for input can become nonresponsive.

  • parfor does not allow break or return statements.

  • The random numbers you use can affect the results of your computations. See Reproducibility in Parallel Statistical Computations.
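
To illustrate the last point, the sketch below passes a random stream that supports substreams to bootstrp and resets the stream between two runs. The data and variable names are hypothetical. Under these assumptions the two runs are intended to draw the same resamples, whether they execute in parallel or serially.

```matlab
% Hypothetical fixed data for illustration only.
x = (1:50)';

% A generator type that supports substreams.
s = RandStream('mlfg6331_64');
opts = statset('UseParallel', true, 'Streams', s, 'UseSubstreams', true);

m1 = bootstrp(200, @mean, x, 'Options', opts);

reset(s);                       % return the stream to its initial state
m2 = bootstrp(200, @mean, x, 'Options', opts);

sameResults = isequal(m1, m2);  % compare the two runs
```

Without the stream options, each worker draws from its own default stream, so repeated parallel runs generally differ from one another and from a serial run.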

For advice on converting loops to use parfor, see Parallel for-Loops (parfor) (Parallel Computing Toolbox).
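
As a small illustration of a loop shape that converts cleanly, the sketch below uses independent iterations, a sliced output array, and a reduction variable, with no break or return statements and no global or persistent variables. The data and the names colMedians and nPositive are hypothetical.

```matlab
% Hypothetical data: 20 samples of 1000 observations each, one per column.
rng(0);
data = randn(1000, 20);
nCols = size(data, 2);

colMedians = zeros(1, nCols);   % sliced output: iteration j writes only element j
nPositive = 0;                  % reduction variable: accumulated across iterations

parfor j = 1:nCols
    m = median(data(:, j));     % temporary variable, local to each iteration
    colMedians(j) = m;
    nPositive = nPositive + (m > 0);
end

fprintf('%d of %d column medians are positive.\n', nPositive, nCols);
```

Because no iteration reads a value written by another, the result does not depend on the order in which workers process the columns.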