Effectively targeting cloud resources using Hierarchical Computation Graphs for large-scale, high-fidelity simulation.
Hierarchical Computation Graphs
Sedaro presents scenario developers with a highly flexible interface for defining simulations and their agents, components, and variables, as well as the relationships and constraints between them. While this declarative information is formalized in the user-facing Sedaro Modeling Language (SedaroML), we translate it to an Intermediate Representation (IR) in order to capture the operational information required to simulate these models. This post will cover the layers of information that our IR represents and the ways it enables us to fully utilize cloud resources during simulation.
Sedaro's simulation platform aims to support a wide variety of engineering domains and to enable interactions between multiple simulation engines, a technique called co-simulation. Our runtime system must simulate both the low-level interactions of components within an agent and the high-level communications among the many agents in a scenario. To this end, our Intermediate Representation is based on computation graphs with explicit hierarchy and multiplexing, which we call Hierarchical Computation Graphs or HCGs.
HCGs are composed of three key elements: wires holding pieces of simulation state, children (which may represent either "primitive" functions for computing state or whole other HCGs), and links between them. I won't be getting into representation details here, but the interesting information is in the metadata attached to these elements throughout the hierarchy. Some of this information has an effect on the abstract simulation itself and will affect the data persisted to the Data Service and presented in the front end web interface. Other information is purely operational and used to control the concrete execution of the simulation. HCGs can be composed and combined algebraically, which lets us split up this metadata into various layers. By varying the purely operational layers, we can transparently experiment with different ways of running simulations to optimally target cloud computation resources without affecting the outputs of our simulations.
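To make the three elements concrete, here is a minimal Python sketch of how wires, children, and links might nest. It is an illustration under assumptions, not Sedaro's actual representation: the `HCG` class, its field names, and the `battery`/`satellite` example are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class HCG:
    """Hypothetical HCG node: a primitive (has `fn`) or a composite (has children)."""
    name: str
    fn: Optional[Callable[..., dict]] = None            # pure function, primitives only
    wires: set[str] = field(default_factory=set)        # named pieces of simulation state
    children: dict[str, "HCG"] = field(default_factory=dict)
    links: list[tuple[str, str]] = field(default_factory=list)  # (source, target) pairs
    metadata: dict = field(default_factory=dict)        # annotations attached by layers

    def is_primitive(self) -> bool:
        return self.fn is not None

    def depth(self) -> int:
        """Number of hierarchy levels at or below this node."""
        return 1 + max((c.depth() for c in self.children.values()), default=0)

    def primitives(self) -> list[str]:
        """Dotted paths of every primitive reachable from this node."""
        if self.is_primitive():
            return [self.name]
        return [f"{self.name}.{p}" for c in self.children.values() for p in c.primitives()]


def drain(charge: float, load: float) -> dict:
    """Toy pure function: compute the next battery charge."""
    return {"charge": charge - load}


battery = HCG("battery", fn=drain)
satellite = HCG(
    "satellite",
    wires={"charge", "load"},
    children={"battery": battery},
    links=[("load", "battery.load"), ("charge", "battery.charge"), ("battery.charge", "charge")],
    metadata={"role": "agent"},
)
scenario = HCG("scenario", children={"satellite": satellite})

print(scenario.depth(), scenario.primitives())
```

The point of the sketch is only that a child can be either a function or another graph, so the same structure describes a single component and a whole scenario.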
HCG Layers
Here are the main layers currently used to compose full simulations:
Structural Layer: This includes the hierarchical information explained above, containing named wires and child HCGs, the links between them, and choices of which wires are inputs and/or outputs. It also records which HCGs in the hierarchy represent "agents" (which must have 'position' and 'attitude' outputs) and which represent "engines" (which must have a 'time' output).
Semantic Layer: This layer gives semantics to the simulation by attaching types to wires and pure functions to "primitive" HCGs. The pure functions are called state managers or unit converters and they compute a primitive HCG's outputs from its inputs. The layer may also contain other information which affects the output of the simulation, such as time delays on links for avoiding dependency cycles and functors for wire values such as intervals or probability distributions.
Initialization Layer: The initial values of simulation variables can be given in their own layer, so that many otherwise-identical simulations with differing initial conditions can be easily built. This layer also contains a simulation-wide seed for deterministic random number generation, which is valuable for reproducible Monte-Carlo studies. Missing initial values are inferred and added to this layer.
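One way to picture layer composition is as a merge of per-element metadata, where each layer contributes different keys for the same element. The sketch below assumes layers are plain dictionaries keyed by a dotted element path; the `compose` function, the key names, and the use of `""` for simulation-wide settings are hypothetical choices made for illustration.

```python
import random


def compose(*layers: dict) -> dict:
    """Merge layers keyed by element path; conflicting values are an error."""
    merged: dict = {}
    for layer in layers:
        for path, meta in layer.items():
            slot = merged.setdefault(path, {})
            for key, value in meta.items():
                if key in slot and slot[key] != value:
                    raise ValueError(f"conflict at {path!r}: {key!r}")
                slot[key] = value
    return merged


# Shared layers, written once.
structural = {"sat.battery.charge": {"kind": "wire", "output": True}}
semantic = {"sat.battery.charge": {"type": "float"}}

# Two initialization layers: same seed, different initial conditions.
init_a = {"sat.battery.charge": {"initial": 0.8}, "": {"seed": 42}}
init_b = {"sat.battery.charge": {"initial": 0.2}, "": {"seed": 42}}

sim_a = compose(structural, semantic, init_a)
sim_b = compose(structural, semantic, init_b)


def noise(sim: dict, n: int = 3) -> list:
    """Draw reproducible samples from the simulation-wide seed."""
    rng = random.Random(sim[""]["seed"])
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


print(sim_a["sat.battery.charge"])
print(noise(sim_a) == noise(sim_b))  # same seed, same random stream
```

Swapping only the initialization layer yields an otherwise-identical simulation, and the shared seed keeps the random draws reproducible across runs.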
The remaining layers are purely operational and don't change simulation output:
Architectural Layer: Sedaro's simulations can be run in-cloud, on-premise, or hybrid, and this is the layer that specifies how to distribute components across available compute resources. It may, for example, explicitly assign compute instances for a child HCG, specify that an HCG should run in its own process or thread, or use an actor framework to abstract over these choices. Experimenting with different architectural layers can help manage the tradeoffs between parallelization and network overhead. This layer also specifies interfaces external to the simulation, such as external co-simulators or data targets to which state should be persisted. Networking may also be configured, such as enabling encryption.
Implementation Layer: Sedaro's runtime can run in either interpreted or compiled mode, which can be specified in this layer. Additional optimizations like inlining, constant folding, or transpilation to Rust may be specified. Our compiler is actually generated from our interpreter, which helps enforce that this layer has no effect on simulation outputs, but that's a topic for a future blog post.
Temporal Layer: Simulations may run at full speed, match real-world speed, or synchronize with real-world time, as selected in this layer.
Dev Layer: Development features may be enabled here, such as verbose logging and error reporting, profiling, and runtime type assertions.
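The property that operational layers leave outputs unchanged can be illustrated with a toy example: the same pure step function is run under two different execution strategies and produces identical results. This is only a sketch of the idea, with a hypothetical `architecture` setting and a made-up `step_agent` function standing in for real state managers.

```python
from concurrent.futures import ThreadPoolExecutor


def step_agent(state: dict) -> dict:
    """Pure function: advance one agent by a fixed 1.0 s step."""
    dt = 1.0
    return {
        "time": state["time"] + dt,
        "position": state["position"] + state["velocity"] * dt,
        "velocity": state["velocity"],
    }


def run(agents: list, steps: int, architecture: str) -> list:
    """Advance all agents; `architecture` only changes how the work is scheduled."""
    if architecture == "sequential":
        for _ in range(steps):
            agents = [step_agent(a) for a in agents]
    elif architecture == "threads":
        with ThreadPoolExecutor() as pool:
            for _ in range(steps):
                # pool.map preserves input order, so results line up with agents
                agents = list(pool.map(step_agent, agents))
    else:
        raise ValueError(f"unknown architecture: {architecture}")
    return agents


initial = [{"time": 0.0, "position": 0.0, "velocity": v} for v in (1.0, 2.0, 3.0)]

sequential = run(initial, steps=5, architecture="sequential")
threaded = run(initial, steps=5, architecture="threads")
print(sequential == threaded)
```

Because the step function is pure and ordering is preserved, the scheduling choice can be varied freely without touching the simulated state.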
Simulation at Scale
Our HCGs are built by combining our users' models and state with our modeling + simulation libraries of meta-models, state managers, units, and converters. By translating all aspects of our simulations into a simple homogeneous Intermediate Representation, we're able to perform common compiler build-time operations on the graphs. Validation steps include checking for state which is consumed but not produced, finding cycles which will cause deadlock, and type-checking. Optimization steps include inference of effective architectures, conversion of hub-and-spoke networks to point-to-point, static batching of messages inferred from multiplexers, and type inference.
As Sedaro branches out into new simulation domains and starts incorporating third-party modeling + simulation libraries, new execution strategies will need to be explored. However, the Hierarchical Computation Graphs should be flexible enough to represent these new domains and execution strategies with the addition of new metadata fields.
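Two of the validation steps, finding state that is consumed but never produced and finding dependency cycles, can be sketched on a flattened graph with Python's standard `graphlib` module. The `validate` function, the node table format, and the `delayed` parameter (modeling links whose time delay breaks a cycle) are assumptions made for this example rather than a description of Sedaro's build pipeline.

```python
from graphlib import CycleError, TopologicalSorter


def validate(nodes: dict, delayed: frozenset = frozenset()):
    """Check a flat graph given as name -> (consumed wires, produced wires).

    `delayed` holds (node, wire) pairs whose link carries a time delay, so the
    consumer reads the previous step's value and no ordering edge is needed.
    Returns (missing wires, evaluation order or None, cycle or None).
    """
    producer = {w: name for name, (_, outs) in nodes.items() for w in outs}
    consumed = {w for ins, _ in nodes.values() for w in ins}
    missing = sorted(consumed - producer.keys())

    # Each node depends on the producers of its non-delayed inputs.
    deps = {
        name: {producer[w] for w in ins if w in producer and (name, w) not in delayed}
        for name, (ins, _) in nodes.items()
    }
    try:
        return missing, list(TopologicalSorter(deps).static_order()), None
    except CycleError as err:
        return missing, None, err.args[1]  # args[1] is the list of nodes in the cycle


nodes = {
    "power": (["load"], ["bus_voltage"]),
    "thermal": (["bus_voltage", "sun_vector"], ["load"]),
}

print(validate(nodes))                                        # cycle, and sun_vector is missing
print(validate(nodes, delayed=frozenset({("power", "load")})))  # delay breaks the cycle
```

In the second call the delayed link removes the ordering constraint on `power`, so a valid evaluation order exists even though the two nodes still exchange state.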
Sedaro’s HCG architecture gives us a playground for endless experimentation, and we’ve already made massive strides in simulation scale, flexibility, and multi-physics complexity. I'm excited for the next new optimization, language, or hardware target that will take our simulations to the next level.