Architecture and Design¶

We’ll start with the C4 views:

Context
Container – this isn’t too interesting, but it can help to see this.
Components

This is a collection of various design notes describing some implementation details.

The code view is in the API Reference section.

Context¶

There are two distinct contexts for CEL Python:

The CLI – as a stand-alone application.
As an importable module to provide expressions to a DSL.

From the CLI, the celpy application has a number of use cases:

A shell script can use celpy as a command to replace other shell commands, including expr, test, and jq.
A person can run celpy interactively. This allows experimentation. It also supports exploring very complex JSON documents to understand their structure.

As a library, an application (for example, C7N) can import celpy to provide an expression feature for the DSL. This provides well-defined semantics and a widely used syntax for the expression language. Building a program is explicitly separated from executing it, so an expression can be cached and executed many times without the overhead of rebuilding the Lark parser or recompiling the expression.

Container¶

As a CLI, this is part of a shell script. It runs where the script runs.

As a library, this is imported into the application to extend the DSL.

There are no services offered or used.

Components¶

The Python code base has a number of modules.

__init__ – the celpy package as a whole.
__main__ – the main application used when running celpy.
celparser – a Facade for the Lark parser.
evaluation – a Facade for run-time evaluation.
celtypes – the underlying Python implementations of CEL data structures.
c7nlib – a collection of components that C7N can use to introduce CEL filters.
adapter – some JSON serialization components.

Here’s the conceptual organization:

@startuml
package celpy {
    component "~__init__" as init
    component "~__main__" as main
    component adapter
    component c7nlib
    component celparser
    component celtypes
    component evaluation
    component cel.lark
}
init --> celtypes
init --> adapter
init --> celparser
init --> evaluation
main --> init
main --> celparser
main --> adapter
main --> evaluation
adapter --> celtypes
c7nlib --> evaluation
c7nlib --> adapter
c7nlib --> celtypes
c7nlib --> init
celparser --> cel.lark
celparser --> lark
evaluation --> lark
evaluation --> celtypes
package lark {
}
@enduml

While there is a tangle of dependencies, there are three top-level “entry points” for celpy.

The __main__ module is the CLI application.
The c7nlib module exposes CEL functionality in a form usable by Cloud Custodian filter definitions. This library provides useful components to perform Custodian-related computations.
The __init__ module exposes the most useful parts of celpy for integration with another application.

Compile-Time¶

Here are the essential classes used to compile a CEL expression and prepare it for evaluation.

The fundamental sequence of operations is:

Create a celpy.Environment with any needed celpy.Annotation instances. For the most part, these are based on the overall application domain. Any type definitions should be subclasses of celpy.TypeType or a callable function defined by the celpy.CELFunction type.
Use the celpy.Environment to compile the CEL text to create a parse tree.
Use the celpy.Environment to create a celpy.Runner instance from the parse tree and any function definitions that override or extend the predefined CEL environment.
Evaluate the celpy.Runner with a celpy.Context. The celpy.Context provides specific values for variables required for evaluation. Generally, each variable should have a celpy.Annotation defined in the celpy.Environment.

The celpy.Runner can be evaluated with any number of distinct celpy.Context values. This amortizes the cost of compilation over multiple executions.

Evaluation-Time¶

Here are the classes used to evaluate a CEL expression.

The evaluation of the CEL expression is done via a celpy.Runner object. There are two celpy.Runner implementations.

The celpy.InterpretedRunner walks the AST, producing the final result: either a celpy.Value or a celpy.CELEvalError exception. This uses a celpy.evaluation.Activation to perform the evaluation.
The celpy.CompiledRunner transpiles the AST into a Python sequence of statements. The internal compile() creates a code object that can then be evaluated with a given celpy.evaluation.Activation. The internal exec() function performs the evaluation.

The subclasses of celpy.Runner are Adapter classes that provide a tidy interface to the somewhat more complex celpy.Evaluator or celpy.Transpiler objects. In the case of the celpy.InterpretedRunner, evaluation involves creating a celpy.evaluation.Activation and visiting the AST. The celpy.CompiledRunner, by contrast, must first visit the AST to create code. At evaluation time, it creates a celpy.evaluation.Activation and uses exec() to compute the final value.
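The compile-once, execute-many pattern behind the compiled runner can be illustrated with plain Python. This is a minimal sketch, not celpy's implementation: ToyCompiledRunner is a hypothetical name, the Python source text is supplied by hand rather than produced by a transpiler, and a plain dict stands in for the Activation.

```python
from typing import Any, Dict


class ToyCompiledRunner:
    """Hypothetical stand-in: holds a code object that is built only once."""

    def __init__(self, python_source: str) -> None:
        # Assumption: the "transpiled" text is given directly.
        # A real transpiler would create it by visiting the AST.
        self.code = compile(python_source, "<toy-cel>", "exec")

    def evaluate(self, activation: Dict[str, Any]) -> Any:
        # Each evaluation gets a fresh namespace seeded with variable values.
        namespace = dict(activation)
        exec(self.code, {}, namespace)
        return namespace["result"]


# One compilation, several evaluations with different variable values.
runner = ToyCompiledRunner("result = balance >= withdrawal")
outcomes = [
    runner.evaluate({"balance": 500, "withdrawal": 600}),
    runner.evaluate({"balance": 500, "withdrawal": 100}),
]
print(outcomes)
```

The cost of compile() is paid in the constructor, so repeated calls to evaluate() only pay for exec().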

The celpy.evaluation.Activation contains several things:

The Annotation definitions to provide type information for identifiers.
The CELFunction functions that extend or override the built-in functions.
The values for identifiers.

The celpy.evaluation.Activation is a kind of chainmap for name resolution. The chain has the following structure:

The end of the chain has the built-in defaults. (This is the bottom-most base definition.)
A layer on top of this can offer types and functions which are provided to integrate into the containing app or framework.
The next layer is the “current” activation when evaluating a given expression. For the CLI, this has the command-line variables. For other integrations, these are the input values.
A transient layer on top of this is used to create a local variable binding for the macro evaluations. These can be nested, and introduce the macro variable as a temporary annotation and value binding.
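The layering described above resembles Python's collections.ChainMap. The following is a rough sketch under that assumption; the layer contents, the all_macro() helper, and the variable names are hypothetical and are not taken from celpy.

```python
from collections import ChainMap

# Bottom-most layer: built-in defaults (hypothetical sample entries).
builtin_layer = {"size": len, "limit": 10}
# Integration layer: names supplied by the containing app or framework.
integration_layer = {"limit": 100}
# Current activation: the input values for this evaluation.
current_layer = {"items": [1, 2, 3]}

# ChainMap searches left to right, so the most specific layer comes first.
activation = ChainMap(current_layer, integration_layer, builtin_layer)


def all_macro(names, var, predicate, source):
    """Evaluate predicate for each item, binding var in a transient layer."""
    results = []
    for value in names[source]:
        nested = names.new_child({var: value})  # temporary macro binding
        results.append(predicate(nested))
    return all(results)


# Roughly: items.all(x, x < limit)
result = all_macro(activation, "x", lambda n: n["x"] < n["limit"], "items")
print(result, activation["limit"], "x" in activation)
```

The integration layer's limit shadows the built-in default, and the macro variable x never leaks into the enclosing activation because new_child() creates a separate mapping for each binding.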

CEL Types¶

There are ten extension types that wrap Python built-in types to provide the unique CEL semantics.

celtypes.TypeType is a supertype for CEL types.
celtypes.BoolType wraps int and creates additional type overload exceptions.
celtypes.BytesType wraps bytes; it handles conversion from celtypes.StringType.
celtypes.DoubleType wraps float and creates additional type overload exceptions.
celtypes.IntType wraps int and adds a 64-bit signed range constraint.
celtypes.UintType wraps int and adds a 64-bit unsigned range constraint.
celtypes.ListType wraps list and includes some type overload exceptions.
celtypes.MapType wraps dict and includes some type overload exceptions. Additionally, the MapKeyTypes type hint is the subset of types permitted as keys.
celtypes.StringType wraps str and includes some type overload exceptions.
celtypes.TimestampType wraps datetime.datetime and includes a number of conversions from datetime.datetime, int, and str values.
celtypes.DurationType wraps datetime.timedelta and includes a number of conversions from datetime.timedelta, int, and str values.

Additionally, a celtypes.NullType is defined, but does not seem to be needed. It has not been deleted yet, but it should be considered deprecated.
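As an illustration of the wrapping idea, here is a minimal sketch of an int subclass with a 64-bit signed range constraint. SketchIntType is a hypothetical name, and the choice of ValueError for out-of-range values is an assumption; the actual celtypes.IntType covers far more operators and conversions.

```python
class SketchIntType(int):
    """Hypothetical int wrapper that enforces a 64-bit signed range."""

    MIN = -(2**63)
    MAX = 2**63 - 1

    def __new__(cls, value: int) -> "SketchIntType":
        if not cls.MIN <= value <= cls.MAX:
            raise ValueError(f"{value} is outside the 64-bit signed range")
        return super().__new__(cls, value)

    def __add__(self, other: int) -> "SketchIntType":
        # Re-wrap the result so the range check also applies to arithmetic.
        return SketchIntType(int(self) + int(other))


small = SketchIntType(40) + SketchIntType(2)

try:
    SketchIntType(2**63 - 1) + SketchIntType(1)
    overflowed = False
except ValueError:
    overflowed = True

print(small, type(small).__name__, overflowed)
```

Without the __add__ override, the sum would silently fall back to a plain Python int with unbounded range, which is why a wrapper needs to re-wrap arithmetic results.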

Transpiler Missing Names¶

When member_dot is transpiled with a missing name, the problem is found at evaluation time via member.get('IDENT'). This raises a No Such Member in Mapping error.

The primary :: ident evaluation can result in one of the following conditions:

ident denotes a type definition. The value’s type is TypeType. The value is a type reference; for example, bool becomes celpy.celtypes.BoolType.

ident denotes a built-in function. The value’s type is CELFunction. The value is the Python function reference.

ident denotes an annotation, but the value’s type is neither TypeType nor CELFunction.

The transpiled value is f"activation.{ident}", assuming it will be a defined variable.

If, at exec() time, the name is not in the Activation with a value, a NameError exception will be raised that becomes a CELEvalError exception.
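The late detection of a missing name can be sketched as follows. Everything here is a hypothetical stand-in: SketchActivation, SketchEvalError, and run() are illustrative names, and raising NameError from __getattr__ is an assumption about how attribute-style lookup could surface the problem.

```python
from typing import Any, Dict


class SketchEvalError(Exception):
    """Hypothetical stand-in for CELEvalError."""


class SketchActivation:
    """Hypothetical stand-in: exposes variable values as attributes."""

    def __init__(self, values: Dict[str, Any]) -> None:
        self._values = values

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails.
        try:
            return self._values[name]
        except KeyError:
            raise NameError(name) from None


def run(ident: str, activation: SketchActivation) -> Any:
    # The transpiled form assumes the identifier is a defined variable.
    source = f"result = activation.{ident}"
    namespace: Dict[str, Any] = {"activation": activation}
    try:
        exec(source, {}, namespace)
    except NameError as ex:
        # The missing name is only discovered here, at exec() time.
        raise SketchEvalError(f"undefined name: {ident}") from ex
    return namespace["result"]


act = SketchActivation({"x": 42})
found = run("x", act)

try:
    run("y", act)
    missing_message = ""
except SketchEvalError as ex:
    missing_message = str(ex)

print(found, missing_message)
```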

The Member-Dot Production¶

Consider protobuf_message{field: 42}.field. This is parsed using the following productions.

member         : member_dot | member_dot_arg | member_item | member_object | primary
member_dot     : member "." IDENT
member_object  : member "{" [fieldinits] "}"

The member_object will be a primary which can be an ident. It MUST refer to the Annotation (not the value) because it has fieldinits. All other choices are (generally) values. They can be annotations, which means bool.type() works the same as type(bool).

Here’s the primary production, which defines the ident in the member production.

primary        : dot_ident_arg | dot_ident | ident_arg | ident
               | paren_expr | list_lit | map_lit | literal

The ident is not always transpiled as activation.{name}. Inside member_object, it’s activation.resolve_name({name}). Outside member_object, it can be activation.{name} because it’s a simple variable.

It may make sense to rename the Activation.resolve_name() method to Activation.get().

This, however, overloads the get() method. This has type hint consequences.

Important

The member can be any of a variety of objects:

NameContainer(Dict[str, Referent])
Activation
MapType(Dict[Value, Value])
MessageType(MapType)

All of these classes must define a get() method. The nuance is that NameContainer is also a Python dict, and there’s an overload issue between that get() and the other get() definitions.

The transpilation currently leverages a common method named get() for all of these types. This is a Pythonic approach, but the overload for the NameContainer (a Dict subclass) isn’t quite right: it doesn’t return a Referent, but the value from a Referent.

A slightly smarter approach is to define a get_value(member, 'name') function that uses a match/case structure to do the right thing for each type. The problem is, the result is a union of type, value, function, and any of these four containers!

Another possibility is to leverage the Annotations. They can provide the type information needed to discern which method, with its specific result type, applies.
