Large Scale DSO Theory

The parametric analysis command in Desktop computes simulation results as a function of model parameters, such as geometry dimensions, material properties, and excitations. A parametric analysis is either performed on a local machine, where each variation is analyzed serially by a single engine, or distributed across machines through a DSO license. Desktop's DSO analysis runs multiple engines in parallel and therefore generates results in less time. In the Regular DSO algorithm, the parametric analysis job (Desktop) runs on the master node, which in turn launches one or more distributed-parallel engines on each machine allocated to the job. Desktop distributes the parametric variations among these engines, which run in parallel. As variations are solved, progress, messages, and variation results are sent back to Desktop, where they are persisted into the common results database on the master node, as illustrated below:

A single engine can now span multiple machines with auto multi-level DSO.
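The Regular DSO flow described above can be sketched in a few lines of Python. This is only an illustrative model, not Desktop's implementation: `solve_variation`, `run_regular_dso`, and `results_db` are hypothetical names, a thread pool stands in for the distributed-parallel engines, and a dictionary stands in for the common results database on the master node.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def solve_variation(variation):
    """Hypothetical stand-in for one engine solving one parametric variation."""
    return {"area": variation["width"] * variation["height"]}


def run_regular_dso(variations, num_engines):
    """Master-side loop: hand variations to engines, persist results centrally."""
    results_db = {}  # stand-in for the common results database on the master node
    with ThreadPoolExecutor(max_workers=num_engines) as engines:
        # Distribute the variations among the engines running in parallel.
        pending = {
            engines.submit(solve_variation, variation): index
            for index, variation in enumerate(variations)
        }
        for future in as_completed(pending):
            # Every solved variation is sent back to, and stored by, the master.
            results_db[pending[future]] = future.result()
    return results_db


# Six variations: three widths times two heights.
variations = [{"width": w, "height": h} for w in (1, 2, 3) for h in (10, 20)]
results = run_regular_dso(variations, num_engines=3)
```

Note that every result passes through the single loop in `run_regular_dso`, which mirrors the centralized role that Desktop plays in Regular DSO.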

Regular DSO Bottleneck

As the illustration above shows, Regular DSO's speedup is limited by the resources of the centralized Desktop process, which acts as a bottleneck. DSO has been observed to become unreliable beyond a certain point as the number of engines and the number of variations increase. The term 'large-scale parallel' defines this tipping point. For a given model, a 'large-scale parallel' job is one in which the number of distributed-parallel engines and the number of parametric variations are high enough that Regular DSO runs into centralized bottlenecks, resulting in progressively smaller speedups, unreliability, or both.
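A rough way to see why a centralized step caps the speedup is the simplified model below. It is an assumption made for illustration, not a measured property of Desktop: each variation is assumed to need `solve_time` seconds on an engine plus `master_time` seconds of serialized handling on the master node (for example, persisting the result), and both parameter names and values are hypothetical. The model covers only the diminishing speedup, not the unreliability mentioned above.

```python
def regular_dso_speedup(num_engines, solve_time, master_time):
    """Estimated speedup over a single engine under the assumed model.

    Solving is divided across engines, but the master-side work for each
    variation is assumed to remain serialized on the master node.
    """
    serial_per_variation = solve_time + master_time
    parallel_per_variation = solve_time / num_engines + master_time
    return serial_per_variation / parallel_per_variation


# Hypothetical timings: 60 s to solve a variation, 1 s of master-side handling.
for engines in (1, 10, 100, 1000):
    print(engines, round(regular_dso_speedup(engines, 60.0, 1.0), 1))
```

Under these assumptions the speedup can never exceed (solve_time + master_time) / master_time, which is 61 for the timings used here, no matter how many engines are added. Linear speedup therefore requires removing or distributing the serialized master-side work.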

With compute resources now economical and quick to provision, product designers have access to large compute clusters for running their simulations, and they are applying ever larger amounts of compute to simulation jobs in order to obtain results faster. Parametric DSO needs to meet this challenge and target linear speedup for 'large-scale parallel' jobs. The Large Scale DSO feature targets 100% reliability and linear speedup for large-scale parallel DSO jobs.

Key Algorithms/Concepts for Large Scale DSO

Redistribution Feature