Bnfterms In Sarcasm

Dávid Németi edited this page Sep 5, 2013 · 4 revisions

In order to provide domain-grammar bindings, typesafe grammars and automatic unparsing, Sarcasm has some extra bnfterm types in addition to those in Irony (BnfTerm is a common base class for NonTerminal and Terminal in Irony). These bnfterm types should be used if one would like to use Sarcasm's features.

Each Sarcasm-bnfterm has both a typesafe and a typeless version. The typesafe versions are generic types that take the domain type as a generic parameter, while the typeless versions are non-generic types with a TL suffix. (E.g. BnfiTermRecord<TD> is the typesafe version and BnfiTermRecordTL is the typeless version; BnfiTermRecord is their base class and should not be used directly.)

For the sake of brevity we only discuss the typesafe versions here.

BnfiTermRecord

One of the most frequently used bnfterms in Sarcasm. Inside the grammar rule of a BnfiTermRecord you can use the powerful domain-grammar bindings (BindTo methods).

Restrictions

The domain type must have a public parameterless constructor. Bound fields must be public, and bound properties must have a public getter and a public setter.
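As a rough illustration of these restrictions, here is a minimal sketch of a domain type that would satisfy them, together with a small reflection check. The class names and the LooksBindable helper are hypothetical and are not part of Sarcasm; the check is also stricter than the rule above, since it inspects every public property rather than only the bound ones.

```csharp
using System;
using System.Reflection;

// Hypothetical domain types, named after the examples on this page.
public abstract class Expression { }
public abstract class Statement { }

public class While : Statement
{
    // No constructor is declared, so the compiler generates
    // a public parameterless one.
    public Expression Condition { get; set; }   // public getter and setter
    public Statement Body { get; set; }         // public getter and setter
}

public static class DomainTypeCheck
{
    // Returns true if the type has a public parameterless constructor and
    // every public instance property has a public getter and setter.
    public static bool LooksBindable(Type type)
    {
        if (type.GetConstructor(Type.EmptyTypes) == null)
            return false;

        foreach (PropertyInfo property in
                 type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (property.GetGetMethod() == null || property.GetSetMethod() == null)
                return false;
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(LooksBindable(typeof(While)));    // True
        Console.WriteLine(LooksBindable(typeof(string)));   // False: no parameterless constructor
    }
}
```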

Type Safety

In the domain-grammar bindings, the value type must match the member type (or be derived from it), and the declaring type must match the domain type of the bnfterm on the left side of the grammar rule.
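To make this rule more concrete, the following sketch shows how plain C# generics and expression lambdas can enforce it at compile time. It is not Sarcasm's implementation: Term<T>, Bind, and the domain classes are hypothetical names used only for illustration.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical domain types.
public class Expr { }
public class NumberLiteral : Expr { }
public class Stmt { }

public class While
{
    public Expr Condition { get; set; }
    public Stmt Body { get; set; }
}

// Hypothetical stand-in for a typesafe bnfterm with domain type T.
public class Term<T> { }

public static class BindingSketch
{
    // TValue must be TMember or derive from it, and the member must be
    // declared on TDeclaring, the domain type of the target term.
    // A mismatch is rejected by the compiler.
    public static string Bind<TDeclaring, TMember, TValue>(
        Term<TValue> value,
        Term<TDeclaring> target,
        Expression<Func<TDeclaring, TMember>> member)
        where TValue : TMember
    {
        var access = (MemberExpression)member.Body;
        return typeof(TDeclaring).Name + "." + access.Member.Name
            + " <- " + typeof(TValue).Name;
    }

    public static void Main()
    {
        var whileTerm = new Term<While>();
        var number = new Term<NumberLiteral>();

        // NumberLiteral derives from Expr, so this binding is accepted.
        Console.WriteLine(Bind(number, whileTerm, t => t.Condition));

        // Bind(number, whileTerm, t => t.Body);   // does not compile: NumberLiteral is not a Stmt
    }
}
```

The point of the sketch is only that a wrong binding fails when the grammar is compiled, not when it is run.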

Example

var While = new BnfiTermRecord<D.While>();

While.Rule =
    WHILE
    + LEFT_PAREN
    + Expression.BindTo(While, t => t.Condition)
    + RIGHT_PAREN
    + DO
    + Statement.BindTo(While, t => t.Body)
    ;

In this rule we bind the Expression bnfterm to the D.While.Condition property, and Statement to D.While.Body.

Example 2

var BinaryExpression = new BnfiTermRecord<DE.BinaryExpression>();

BinaryExpression.Rule =
    Expression.BindTo(BinaryExpression, t => t.Term1)
    + BinaryOperator.BindTo(BinaryExpression, t => t.Op)
    + Expression.BindTo(BinaryExpression, t => t.Term2)
    ;

In this rule we bind the first Expression bnfterm to the DE.BinaryExpression.Term1 property, BinaryOperator to DE.BinaryExpression.Op, and the second Expression to DE.BinaryExpression.Term2.

Example 3

Of course, you can use the "logical or" (|) operator if needed:

var If = new BnfiTermRecord<D.If>();

If.Rule =
    IF
    + LEFT_PAREN
    + Expression.BindTo(If, t => t.Condition)
    + RIGHT_PAREN
    + THEN
    + Statement.BindTo(If, t => t.Body)
    |
    IF
    + LEFT_PAREN
    + Expression.BindTo(If, t => t.Condition)
    + RIGHT_PAREN
    + THEN
    + Statement.BindTo(If, t => t.Body)
    + ELSE
    + Statement.BindTo(If, t => t.ElseBody)
    ;

BnfiTermChoice

This should be used when you just have to "fork" your grammar rule. This is the case e.g. when in your domain you have a common abstract base class and its derived classes, or an enum type with enum values.

Type Safety

The domain type of each bnfterm on the right side of the rule has to match the domain type of the BnfiTermChoice, or has to be a descendant of it.

Example

var BinaryOperator = new BnfiTermChoice<DE.BinaryOperator>();

BinaryOperator.Rule = ADD_OP | SUB_OP | MUL_OP | DIV_OP | POW_OP;

Example 2

var Expression = new BnfiTermChoice<DE.Expression>();

Expression.SetRuleOr(
    BinaryExpression,
    UnaryExpression,
    NumberLiteral,
    LEFT_PAREN + Expression + RIGHT_PAREN
    );

Note that, due to C# language rules, you unfortunately cannot use the "logical or" operator in this typesafe case, so you have to use the SetRuleOr method instead.
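The page does not show how SetRuleOr is declared. As an assumption-laden sketch, the following shows one way a method (unlike a chain of operators) can enforce the type-safety rule for choices: a covariant interface lets terms with derived domain types be passed where the base domain type is expected. ITerm<T>, Term<T>, and Choice<T> are hypothetical types, not Sarcasm's.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical domain types.
public abstract class Expression { }
public class BinaryExpression : Expression { }
public class NumberLiteral : Expression { }

// Hypothetical covariant view of a typesafe bnfterm.
public interface ITerm<out T>
{
    string Name { get; }
}

public class Term<T> : ITerm<T>
{
    public Term(string name) { Name = name; }
    public string Name { get; private set; }
}

public class Choice<T> : Term<T>
{
    private readonly List<string> alternatives = new List<string>();

    public Choice(string name) : base(name) { }

    public IList<string> Alternatives { get { return alternatives; } }

    // Each alternative must be an ITerm<T>. Thanks to covariance, a term
    // whose domain type derives from T is accepted as well.
    public void SetRuleOr(params ITerm<T>[] terms)
    {
        alternatives.Clear();
        foreach (ITerm<T> term in terms)
            alternatives.Add(term.Name);
    }
}

public static class ChoiceSketch
{
    public static Choice<Expression> Build()
    {
        var binary = new Term<BinaryExpression>("BinaryExpression");
        var number = new Term<NumberLiteral>("NumberLiteral");
        var expression = new Choice<Expression>("Expression");

        expression.SetRuleOr(binary, number);
        // expression.SetRuleOr(new Term<string>("oops"));   // does not compile
        return expression;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", Build().Alternatives));
    }
}
```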

BnfiTermCollection

This should be used when you have to deal with a list of items.

Restrictions

The domain type (the type of the collection) must have a public parameterless constructor and must implement the ICollection<T> interface, where T is the domain type of the items in the collection. There are no restrictions on the domain type of the items.
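These two restrictions correspond directly to C# generic constraints. The sketch below is only an illustration of that correspondence: the Collect helper, Statement, and StatementList are hypothetical names, and nothing here is taken from Sarcasm's source.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical item domain type.
public class Statement
{
    public string Text { get; set; }
}

// A custom collection type: Collection<T> implements ICollection<T>, and
// the class gets an implicit public parameterless constructor.
public class StatementList : Collection<Statement> { }

public static class CollectionSketch
{
    // The constraints mirror the restrictions: the collection type must
    // implement ICollection<TItem> and be constructible without arguments.
    public static TCollection Collect<TCollection, TItem>(IEnumerable<TItem> items)
        where TCollection : ICollection<TItem>, new()
    {
        var collection = new TCollection();
        foreach (TItem item in items)
            collection.Add(item);
        return collection;
    }

    public static void Main()
    {
        var parsed = new[]
        {
            new Statement { Text = "a = 1" },
            new Statement { Text = "b = 2" }
        };

        List<Statement> list = Collect<List<Statement>, Statement>(parsed);
        StatementList custom = Collect<StatementList, Statement>(parsed);

        Console.WriteLine(list.Count + " " + custom.Count);   // 2 2
    }
}
```

A type such as Statement[] would be rejected by these constraints, because arrays have no parameterless constructor.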

Type Safety

It transforms typesafe items into a typesafe collection.

Example

var Statements = new BnfiTermCollection<List<D.Statement>, D.Statement>();

// use one of the following:
Statements.Rule = Statement.PlusList();   // one or more statements
Statements.Rule = Statement.StarList();   // zero or more statements

Example 2

var Arguments = new BnfiTermCollection<List<D.Argument>, D.Argument>();

// use one of the following:
Arguments.Rule = Argument.PlusList(COMMA);   // one or more arguments separated by commas
Arguments.Rule = Argument.StarList(COMMA);   // zero or more arguments separated by commas

Example 3

Most of the time you will not want to define a separate BnfiTermCollection variable. You do not have to, and you can handle collections in a more concise way:

FunctionCall.Rule =
    FunctionReference.BindTo(FunctionCall, t => t.FunctionReference)
    + LEFT_PAREN
    + Argument.StarList(COMMA).BindTo(FunctionCall, t => t.Arguments)
    + RIGHT_PAREN
    ;

Note that we did not have to deal with domain types here; the StarList/PlusList methods take care of it. The domain type of Argument is D.Argument, so the item type is already known, and if no generic type parameters are specified, StarList/PlusList create a bnfterm of type BnfiTermCollection<List<TItem>, TItem> by default.

BnfiTermKeyTerm

This is just a simple keyterm terminal type derived from Irony's KeyTerm. (It has to be used instead of Irony's KeyTerm because of the way Sarcasm's type system is implemented.)

Example

BnfiTermKeyTerm __ADD_OP = ToTerm("+");

BnfiTermConversion

BnfiTermConversion is used in the following situations:

  • when you have to introduce a value for a Sarcasm-keyterm
  • when you have to introduce an Irony-bnfterm (typically a terminal (literal, identifier)) into Sarcasm's world
  • when you have to convert a Sarcasm-bnfterm into another Sarcasm-bnfterm, which cannot be solved by using any other bnfterm types.

Type Safety

When introducing a typeless bnfterm into Sarcasm's world as a typesafe bnfterm with domain type T, the conversion function converts from object to T. When converting a typesafe bnfterm with domain type T1 to a typesafe bnfterm with domain type T2, the conversion function converts from T1 to T2.

Example

Introduce a specific domain value for a specific keyterm:

BnfiTermKeyTerm __ADD_OP = ToTerm("+");
BnfiTermConversion<DE.BinaryOperator> ADD_OP = __ADD_OP.IntroValue(DE.BinaryOperator.Add);

Of course, you do not have to define a separate variable for the actual keyterm, so you can write this:

BnfiTermConversion<DE.BinaryOperator> ADD_OP = ToTerm("+").IntroValue(DE.BinaryOperator.Add);

There is a shortcut for this, so you can write this too:

BnfiTermConversion<DE.BinaryOperator> ADD_OP = TerminalFactoryS.CreateKeyTerm("+", DE.BinaryOperator.Add);

Example 2

Introduce an Irony-terminal into Sarcasm's world:

BnfiTermConversion<string> IDENTIFIER = new IdentifierTerminal(name).IntroValue<string>(
    (context, parseNode) => (string)parseNode.FindToken().Value,
    IdentityFunction,
    astForChild: false
    );

There is a shortcut for this, so you can write this too:

BnfiTermConversion<string> IDENTIFIER = new IdentifierTerminal(name).IntroIdentifier();

Or this:

BnfiTermConversion<string> IDENTIFIER = TerminalFactoryS.CreateIdentifier();

Example 3

Convert a Sarcasm-bnfterm into another Sarcasm-bnfterm:

var NameRef = new BnfiTermConversion<NameRef>();

NameRef.Rule = IDENTIFIER.ConvertValue(_identifier => new NameRef(_identifier), _nameRef => _nameRef.Value);

We have to use BnfiTermConversion here instead of BnfiTermRecord, because NameRef does not have a public parameterless constructor. Note that we also specified the inverse conversion for the unparser. If you do not want to specify the inverse conversion, you have to write this:

NameRef.Rule = IDENTIFIER.ConvertValue(_identifier => new NameRef(_identifier), NoUnparseByInverse<NameRef, string>());

Example 4

Convert a Sarcasm-bnfterm into another Sarcasm-bnfterm:

var NamespaceName = new BnfiTermConversion<NameRef>();

NamespaceName.Rule =
    IDENTIFIER
    .PlusList(DOT)
    .ConvertValue(
        _identifiers => new NameRef(string.Join(DOT.Text, _identifiers)),
        _nameRef => _nameRef.Value.Split(new string[] { DOT.Text }, StringSplitOptions.None)
    );

We have to use BnfiTermConversion here instead of BnfiTermRecord, because NameRef should not store a list of identifiers (a list of strings) but a single string with dot characters as separators. Note that we also specified the inverse conversion for the unparser. If you do not want to specify the inverse conversion, you have to write this:

NamespaceName.Rule =
    IDENTIFIER
    .PlusList(DOT)
    .ConvertValue(
        _identifiers => new NameRef(string.Join(DOT.Text, _identifiers)),
        NoUnparseByInverse<NameRef, IEnumerable<string>>()
    );

Or you can omit the unparse-by-inverse parameter:

NamespaceName.Rule =
    IDENTIFIER
    .PlusList(DOT)
    .ConvertValue(_identifiers => new NameRef(string.Join(DOT.Text, _identifiers)));

but in this case you will get a compiler warning, unless you disable it with a #pragma (it is best to put this once, at the beginning of your grammar):

#pragma warning disable 618
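The conversion and its inverse from Example 4 can be tried in isolation, without any grammar. The sketch below assumes a NameRef shape that matches what the examples use (a constructor taking a string and a Value member); the Dot constant stands in for DOT.Text, and the ConversionSketch class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of NameRef; the page only shows its constructor and Value.
public class NameRef
{
    public NameRef(string value) { Value = value; }
    public string Value { get; private set; }
}

public static class ConversionSketch
{
    private const string Dot = ".";   // stands in for DOT.Text

    // Forward conversion, used when building the AST.
    public static NameRef ToNameRef(IEnumerable<string> identifiers)
    {
        return new NameRef(string.Join(Dot, identifiers));
    }

    // Inverse conversion, used when unparsing.
    public static IEnumerable<string> ToIdentifiers(NameRef nameRef)
    {
        return nameRef.Value.Split(new string[] { Dot }, StringSplitOptions.None);
    }

    public static void Main()
    {
        var identifiers = new List<string> { "System", "Collections", "Generic" };

        NameRef nameRef = ToNameRef(identifiers);
        Console.WriteLine(nameRef.Value);                                        // System.Collections.Generic
        Console.WriteLine(ToIdentifiers(nameRef).SequenceEqual(identifiers));    // True
    }
}
```

The round trip only works because an identifier cannot itself contain the separator; that is what makes Split a true inverse of Join here.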

QRef and QVal

These are not types, but methods with BnfiTermConversion as their return type.

QRef and QVal (for reference domain types and value domain types, respectively) are Sarcasm's equivalents of the Q method in Irony. Q means that its argument bnfterm is optional (the same as the ? operator in regular expressions).

Example

If you want to be more concise when defining the grammar rule for your if statement, you can write:

var If = new BnfiTermRecord<D.If>();

If.Rule =
    IF
    + LEFT_PAREN
    + Expression.BindTo(If, t => t.Condition)
    + RIGHT_PAREN
    + THEN
    + Statement.BindTo(If, t => t.Body)
    + (ELSE + Statement).QRef().BindTo(If, t => t.ElseBody)
    ;

BnfiTermConstant

This is the equivalent of Irony's ConstantTerminal. It represents a set of constants as a set of (text, value) pairs.

Type Safety

The type of the constant values should match the domain type of the BnfiTermConstant.

Example

var BOOL_CONSTANT = new BnfiTermConstant<bool>()
{
    { "True", true },
    { "False", false }
};
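The { text, value } syntax in this example is an ordinary C# collection initializer: the compiler accepts it for any type that implements IEnumerable and has a matching Add method. The sketch below demonstrates the mechanism with a hypothetical ConstantTable<T> class; it is not how BnfiTermConstant is implemented, only how such an initializer can be made to compile and how it ties the value type to T.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical typed table of (text, value) pairs.
public class ConstantTable<T> : IEnumerable<KeyValuePair<string, T>>
{
    private readonly Dictionary<string, T> constants = new Dictionary<string, T>();

    // Called once for each { text, value } entry of the collection initializer.
    public void Add(string text, T value)
    {
        constants.Add(text, value);
    }

    public bool TryGetValue(string text, out T value)
    {
        return constants.TryGetValue(text, out value);
    }

    public IEnumerator<KeyValuePair<string, T>> GetEnumerator()
    {
        return constants.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class ConstantSketch
{
    public static ConstantTable<bool> Build()
    {
        return new ConstantTable<bool>
        {
            { "True", true },
            { "False", false }
            // { "Maybe", 1 }   // does not compile: int is not convertible to bool
        };
    }

    public static void Main()
    {
        bool value;
        bool found = Build().TryGetValue("False", out value);
        Console.WriteLine(found + " " + value);   // True False
    }
}
```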

BnfiTermCopy

It has two roles. The first role is when you want an extra bnfterm in your grammar purely for the sake of abstraction, or to distinguish it from the original bnfterm because you want to refer to it in the unparser's formatter. The second role is when you want to copy an already bound BnfiTermRecord with domain type TBase into a BnfiTermRecord with domain type TDescendant, where TBase is a base class of TDescendant.

Example

First role:

var Key = new BnfiTermCopy<string>();

Key.Rule = STRING;

Example 2

Second role: let's say that instead of this if representation in the domain:

public class If : Statement
{
    public DE.Expression Condition { get; set; }
    public Statement Body { get; set; }
    [Optional]
    public Statement ElseBody { get; set; }
}

we have this one:

public class If : Statement
{
    public DE.Expression Condition { get; set; }
    public Statement Body { get; set; }
}

public class IfElse : If
{
    public Statement ElseBody { get; set; }
}

Then instead of this grammar rule:

var If = new BnfiTermRecord<D.If>();

If.Rule =
    IF
    + LEFT_PAREN
    + Expression.BindTo(If, t => t.Condition)
    + RIGHT_PAREN
    + THEN
    + Statement.BindTo(If, t => t.Body)
    + (ELSE + Statement).QRef().BindTo(If, t => t.ElseBody)
    ;

we have these:

var If = new BnfiTermRecord<D.If>();
var IfElse = new BnfiTermRecord<D.IfElse>();

If.Rule =
    IF
    + LEFT_PAREN
    + Expression.BindTo(If, t => t.Condition)
    + RIGHT_PAREN
    + THEN
    + Statement.BindTo(If, t => t.Body)
    ;

IfElse.Rule =
    If.Copy(IfElse)
    + ELSE
    + Statement.BindTo(IfElse, t => t.ElseBody)
    ;

BnfiTermNoAst

It is used when we want to use a bnfterm in a grammar rule, but we want to produce no AST for it. To tell Sarcasm's type system that it is okay to use it, you should wrap it in a BnfiTermNoAst.

Note that you usually do not need to use this type of bnfterm.

Example

If, for some reason, you cannot use Sarcasm's BnfiTermKeyTerm:

var Return = new BnfiTermRecord<D.Return>();
BnfiTermKeyTerm RETURN = TerminalFactoryS.CreateKeyTerm("return");

Return.Rule =
    RETURN
    + Expression.BindTo(Return, t => t.Value)
    ;

but you have an Irony KeyTerm instead, then you can write this:

var Return = new BnfiTermRecord<D.Return>();
KeyTerm RETURN = ToTerm("return");

Return.Rule =
    RETURN.NoAst()
    + Expression.BindTo(Return, t => t.Value)
    ;

Example 2

If you want to throw away the AST of some bnfterms, and you just need the bnfterms to be there in your grammar rule for proper parsing, then you can write this:

var Return = new BnfiTermRecord<D.Return>();
BnfiTermKeyTerm RETURN = TerminalFactoryS.CreateKeyTerm("return");

Return.Rule =
    RETURN
    + Expression.BindTo(Return, t => t.Value)
    + Expression.NoAst(() => new DE.NumberLiteral(3.14))    // parse an expression but abandon it
    ;

You can parse e.g. "return 1 2", and the resulting AST will be only D.Return(DE.NumberLiteral(value: 1)), while expression 2 will be abandoned. After unparsing D.Return(DE.NumberLiteral(value: 1)) you will get the string "return 1 3.14".

Note that because there is no AST for the parsed, abandoned expression, you need to provide some kind of "default value" for the unparser. Here, it is a number literal with the value 3.14.

If you do not want to deal with the unparser, you can write this:

var Return = new BnfiTermRecord<D.Return>();
BnfiTermKeyTerm RETURN = TerminalFactoryS.CreateKeyTerm("return");

Return.Rule =
    RETURN
    + Expression.BindTo(Return, t => t.Value)
    + Expression.NoAst(NoUnparseByInverseCreatorFromNoAst<DE.Expression>())     // parse an expression but abandon it
    ;

Or you can omit the unparse-by-inverse parameter:

Return.Rule =
    RETURN
    + Expression.BindTo(Return, t => t.Value)
    + Expression.NoAst()     // parse an expression but abandon it
    ;

but in this case you will get a warning by the compiler, unless you disable it with a #pragma:

#pragma warning disable 618

This concludes the discussion of the grammar and AST building. You can find a complete, working example in the MiniPL project. It contains a MiniPL domain and two grammars for it: GrammarC and GrammarP, which are a C-like and a Pascal-like grammar for the MiniPL domain, respectively.

Another important topic is how to handle references. If you are interested, continue with Reference Handling.