FHIRPath:CQL::XPath:XQuery
I’ve been reviewing the FHIRPath specification as well as the write-up here and I’m impressed with how concisely simple path traversal is expressible; the syntax is well-suited to the use case. However, as is inevitably the case with expression languages, I can easily see FHIRPath being expanded to cover more use cases, and there is already a good deal of overlap between what FHIRPath can express and what CQL can express. This got me wondering whether there is a way that we could combine the functionality to yield a single expression language that would be suitable for both sets of use cases.
I propose that with a few relatively minor changes to the syntax (and, to a lesser extent, the semantics) of CQL, FHIRPath could be defined as a strict subset of CQL, in the same way that XPath is a strict subset of XQuery. Implementations that only need to address the use cases covered by FHIRPath could stick to that subset and implement dynamic scripting evaluation engines for use in frontend applications or server implementation layers. Applications that need access to the full query functionality provided by CQL can provide complete implementations to make that possible.
CQL would benefit from the enhanced path traversal functionality, and rather than redefining full query capability within FHIRPath, applications that need those capabilities can use the CQL specification, providing access to a broad range of clinically relevant functionality, including:
- Clinical literals
- Complete Temporal logic
- Terminology
- Complete List/Table Algebra
In addition, there would be little if any impact on FHIRPath as it is currently specified. What I’m hoping for is a discussion of the basic approach and whether the community feels it is worth pursuing further by building examples and tooling.
Another advantage of specifying FHIRPath as a subset of CQL is that it reduces the number of competing standards in the space. If FHIRPath and CQL continue down two different paths, implementers who work with both will likely be frustrated by how similar, yet different, they are. On the other hand, if FHIRPath is a subset of CQL, implementers who already know CQL will automatically know FHIRPath, and implementers who already know FHIRPath will be well on their way to learning CQL. This seems a clear opportunity to reduce the unnecessary proliferation of standards.
What follows then is a description of the changes that would be required to CQL to enable the FHIRPath syntax to be defined as a strict subset of CQL.
CQL already provides the ability to define paths of any depth; however, it does not provide the ability to automatically traverse a list-valued path. This behavior could be relaxed in CQL to allow path traversal of lists, and as defined in FHIRPath, this would result in a list of elements.
So the required change is to add the ability to invoke a property on a list, and this is applied as the invocation of that property on each element in the list. If the element does not have that property, or the property has no value, it is excluded from the list.
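To make the proposed behavior concrete, here is a minimal sketch in Python rather than CQL. It assumes structured values are modeled as plain dicts and that list-valued properties are flattened into the result; `get_property` is a hypothetical helper for illustration only, not part of either specification.

```python
def get_property(value, name):
    """Resolve `name` on a single element or on each element of a list.

    Assumptions of this sketch: elements are plain dicts, missing or
    None-valued properties are excluded, and list-valued properties are
    flattened into the resulting list.
    """
    if isinstance(value, list):
        result = []
        for element in value:
            child = get_property(element, name)
            if child is None:
                continue  # element lacks the property, or it has no value
            if isinstance(child, list):
                result.extend(child)
            else:
                result.append(child)
        return result
    if isinstance(value, dict):
        return value.get(name)
    return None


# Hypothetical sample data, loosely shaped like a FHIR Patient.
patient = {
    "name": [
        {"family": "Chalmers", "given": ["Peter", "James"]},
        {"given": ["Jim"]},   # no family name at all
        {"family": None},     # family present but without a value
    ]
}

# Equivalent in spirit to Patient.name.family and Patient.name.given.
families = get_property(get_property(patient, "name"), "family")
givens = get_property(get_property(patient, "name"), "given")
```

Only the first name entry contributes to `families`; the other two are silently dropped, which is the exclusion rule described above.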
Note that there is a certain amount of type-safety lost in the implicit conversion of what appears to be a singular invocation to a list invocation. One proposal to address that concern is to introduce an optional syntax (together with a language directive like "strict property resolution") that would require that such traversal be explicitly invoked. For example, given a structured type A with a list-valued property B, a path through B could be expressed as:
A.B[].C
Although CQL provides the ability to construct a list of elements from each attribute of a structured value, doing so requires long-handing the full set of attributes. FHIRPath, on the other hand, allows axis specifications of "*" and "**" to construct a list of elements consisting of all children, and all children recursively. To support this fully within CQL, you would have to either introduce dynamic typing and run-time axis operators, or restrict the invocation to cases where the type is known (i.e. make it a compile-time error to invoke * or ** on a generic type). However, an implementation of FHIRPath that used a dynamic language as the engine would not need this, and a full CQL implementation could support axis invocation as a short-hand for a list selector expansion of the object:
Patient.*
Would be long-handed as:
((List<Object> { Patient.id, Patient.name, ... }) Q where Q is not null)
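For a dynamic-language engine, the two axes could be sketched roughly as follows. This is an illustration in Python under stated assumptions: elements are plain dicts, null children are dropped (mirroring the "where Q is not null" filter above), and list-valued children are flattened. The names `children` and `descendants` are hypothetical.

```python
def children(element):
    """Non-null direct child values of a dict-modeled element (the "*" axis)."""
    if not isinstance(element, dict):
        return []  # primitives have no children in this sketch
    result = []
    for value in element.values():
        if value is None:
            continue
        if isinstance(value, list):
            result.extend(v for v in value if v is not None)
        else:
            result.append(value)
    return result


def descendants(element):
    """All children, recursively (the "**" axis), in depth-first order."""
    result = []
    for child in children(element):
        result.append(child)
        result.extend(descendants(child))
    return result


# Hypothetical sample data.
patient = {"id": "1", "name": [{"family": "Chalmers"}], "birthDate": None}

direct = children(patient)
nested = descendants(patient)
```

Because the traversal inspects values at run time, no static type information is needed, which is why a dynamically typed engine would not require the compile-time restriction discussed above.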
FHIRPath defines a recursive path syntax, for example:
ObservationResult.code*
Should return a list of all elements named code, recursively. However, this seems to be ambiguous in the grammar with the multiplication operator, so this may be an item for further discussion.
FHIRPath allows for polymorphic element access, for example:
Observation.component.value[x]
Returns all elements whose names start with "value". This syntax would actually clash with CQL, although a potential workaround would be to specify "[x]" as a token in the grammar. This would allow it to be distinguished syntactically, with the only by-product being the somewhat surprising result that you could not invoke an indexer with an identifier named x.
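The selection behavior itself is simple to state. As a rough Python sketch, assuming dict-modeled elements and a plain name-prefix match (the helper `choice_values` and the sample data are hypothetical):

```python
def choice_values(elements, prefix):
    """Collect the non-null values of every property whose name starts with `prefix`."""
    result = []
    for element in elements:
        for name, value in element.items():
            if name.startswith(prefix) and value is not None:
                result.append(value)
    return result


# Hypothetical Observation components with differently typed values.
components = [
    {"code": "systolic", "valueQuantity": {"value": 120, "unit": "mmHg"}},
    {"code": "note", "valueString": "patient seated"},
    {"code": "pending"},  # no value[x] element at all
]

values = choice_values(components, "value")
```

The result mixes a structured quantity with a string, which illustrates why the typing of such an expression needs care in a statically typed language like CQL.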
FHIRPath defines a small set of primitive value types, and then defines mapping from the FHIR "primitive" types to these, allowing primitive expressions to be written without the need to invoke value selectors to obtain the actual values. Similarly, CQL defines a set of primitive types and relies on the specific model being used to provide the mapping to those base types, so the same behavior could be provided by a "FHIR" model description for CQL.
In addition, CQL would need a mechanism to implicitly convert collections of a single item to singletons.
FHIRPath defines constants using a "%". This symbol is not used within CQL and could be introduced as a way to define constants if necessary. However, CQL already has the ability to define general-purpose constants (they are just a special case of named expressions), so it is a point of discussion whether this would be required if FHIRPath were a subset of CQL.
FHIRPath defines contexts using a "$". Again, this symbol is not used within the CQL grammar and could be introduced as a way to define contexts if necessary. However, CQL already supports the ability to define arbitrary parameters, so whether this would be required if FHIRPath were a subset of CQL is another point for discussion.
FHIRPath supports the ability to invoke operations using "."-notation. The CQL grammar would have to be generalized slightly to allow "." invocation everywhere (it is currently only allowed after a single qualifier), but doing so would allow the operators defined in FHIRPath to be declared as functions in CQL. The operations would be short-hand for the equivalent operation. For example:
Patient.where(id = "1")
Would be short-hand for:
Patient P where P.id = '1'
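The same equivalence between a function-style invocation and a query can be sketched in Python, purely as an analogy. The function `where` below is a hypothetical stand-in for the FHIRPath operation, and the comprehension stands in for the CQL query form.

```python
def where(collection, predicate):
    """Function form: keep the items for which the predicate holds."""
    return [item for item in collection if predicate(item)]


# Hypothetical sample data.
patients = [{"id": "1"}, {"id": "2"}]

# Function-style invocation, analogous to Patient.where(id = "1").
by_function = where(patients, lambda p: p.get("id") == "1")

# Query-style expression, analogous to: Patient P where P.id = '1'.
by_query = [p for p in patients if p.get("id") == "1"]
```

Both forms produce the same list, which is the sense in which the function could be declared as short-hand for the query.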
FHIRPath defines all expressions to result in a collection, and further, that missing elements are never included as "nulls" in the resulting collections. Thus, the empty collection is effectively a "missing" attribute. This should be semantically equivalent to null-handling for missing information, with the important caveat that three-valued logic (3VL) semantics are not defined for the Boolean-valued operators. It seems to be the case that null-propagation is defined for the other operators (for example, 4 + [] = []), but because Boolean operations always interpret the empty sequence as false, 3VL semantics are not preserved. This could potentially have significant implications for query results, and should be investigated further.
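A small Python sketch may help show where the two approaches can diverge. It models null as `None` and a missing element as an empty list. The three-valued operators follow the standard Kleene definitions; the coercion in `empty_as_false` is an assumption about how "empty is false" would be applied before negation. All names here are hypothetical.

```python
def not_3vl(a):
    """Three-valued NOT: unknown (None) stays unknown."""
    return None if a is None else (not a)


def and_3vl(a, b):
    """Three-valued AND: false dominates, otherwise unknown propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def empty_as_false(collection):
    """Assumed coercion: an empty collection is false, a singleton is its own value."""
    return bool(collection) and bool(collection[0])


# A patient whose "deceased" flag is simply not recorded.
deceased_null = None   # null-based representation
deceased_empty = []    # collection-based representation

# Criterion: "not deceased". A query keeps a row only when the result is True.
selected_3vl = not_3vl(deceased_null) is True
selected_empty = (not empty_as_false(deceased_empty)) is True
```

Under three-valued logic the result is unknown and the patient is not selected; under the empty-as-false coercion the negation evaluates to true and the patient is selected. This is the kind of difference in query results that would need further investigation.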
Although there are still some issues that would need to be worked out, I don't think any of these changes present a significant challenge to CQL, and their inclusion expands the usefulness and applicability of CQL. FHIRPath and CQL are already both model-independent, define a very similar conceptual type-space, as well as compatible primitive types, and have largely compatible behavioral semantics. By expanding CQL to allow FHIRPath to be defined as a subset, CQL gains the path traversal functionality, and FHIRPath applications that need the additional functionality of a full query language can be extended into CQL without changing the content of the expressions; any FHIRPath expression would work in a full CQL implementation.