parameterized macros #111

Open
gertjanvannoord opened this issue Sep 14, 2012 · 4 comments

@gertjanvannoord
Member

Consider these macros which will find all words that can occur as the OBJ1 of the verb DRINKEN:

obj1_drinken_lexical = """
( @rel="obj1" and
@word and
../node[@rel="hd" and
@lemma="drinken"]
)"""

obj1_drinken_phrase = """
( @rel="hd" and
../@rel="obj1" and
../../node[@rel="hd" and
@lemma="drinken"]
)"""

obj1_drinken_lexical_nonlocal = """
( (@cat or @word) and
%i% = //node[@rel="obj1" and
../node[@rel="hd" and
@lemma="drinken"]]/%i%
)"""

obj1_drinken_phrase_nonlocal = """
( @rel="hd" and
../%i% = //node[@rel="obj1" and
../node[@rel="hd" and
@lemma="drinken"]]/%i%
)"""

obj1_drinken = """
( %obj1_drinken_lexical%
or %obj1_drinken_phrase%
or %obj1_drinken_lexical_nonlocal%
or %obj1_drinken_phrase_nonlocal%
)
"""

If we want to do the same thing for "eten", we need another page of macros. I want to
be able to say

%dependent("obj1","drinken")%

and then define

dependent(Rel,Head) = """
( dependent_lexical(%Rel%,%Head%)
or dependent_phrase(%Rel%,%Head%)
or ....

etc

"""

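A minimal sketch of what the requested parameterization would buy, written as plain Python functions rather than in the macro language itself. The function names mirror the proposed `dependent`, `dependent_lexical` and `dependent_phrase`; the bodies are the `drinken` macros above with the relation and lemma turned into arguments. The two non-local variants are left out because they rely on the `%i%` macro, which is not defined in this thread.

```python
def dependent_lexical(rel: str, head: str) -> str:
    # A word that bears relation `rel` and is a sister of a head with lemma `head`.
    return (f'( @rel="{rel}" and @word and '
            f'../node[@rel="hd" and @lemma="{head}"] )')


def dependent_phrase(rel: str, head: str) -> str:
    # The head word of a phrase that bears relation `rel` to a head with lemma `head`.
    return (f'( @rel="hd" and ../@rel="{rel}" and '
            f'../../node[@rel="hd" and @lemma="{head}"] )')


def dependent(rel: str, head: str) -> str:
    # Disjunction of the local variants only (the non-local ones are omitted here).
    parts = [dependent_lexical(rel, head), dependent_phrase(rel, head)]
    return "( " + " or ".join(parts) + " )"


if __name__ == "__main__":
    # The same definitions now serve both verbs without another page of macros.
    print(dependent("obj1", "drinken"))
    print(dependent("obj1", "eten"))
```

With this shape, switching from "drinken" to "eten" is a change of argument and no longer a copy of every macro.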
@danieldk
Member

I think we need to parse to a proper AST to be able to do this. Since playing with and testing ASTs in C++ is a real drag, I did a quick write-up in Haskell of a proposed AST format and of how variable substitution and macro calls could be applied:

https://gist.github.com/3742284

@jelmervdl any comments?

@danieldk
Member

BTW, a sample invocation:

% ghci -Wall -XOverloadedStrings test.hs 
*Main> applyMacro testMacros testMacro2 ["20"]
Just [StringChunk "bar",StringChunk "foo",StringChunk "20"]
*Main> callMacro testMacros testMacro2 ["20"]
Just "barfoo20"
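For readers who do not have the gist at hand, here is a rough Python analogue of the idea. It is a sketch under assumptions, not a port: the node types `Param` and `Call`, the `Macro` record, and the definitions in `TEST_MACROS` are all invented so that the results mirror the session above. Only `StringChunk` and the two function names come from the output shown there.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StringChunk:
    text: str          # literal query text


@dataclass(frozen=True)
class Param:
    name: str          # reference to a macro parameter (hypothetical node type)


@dataclass(frozen=True)
class Call:
    macro: str         # name of the macro being called (hypothetical node type)
    args: tuple = ()   # each argument is a literal str or a Param


@dataclass(frozen=True)
class Macro:
    params: tuple      # parameter names
    body: tuple        # sequence of StringChunk / Param / Call nodes


def apply_macro(macros: Dict[str, Macro], name: str,
                args: List[str]) -> Optional[List[StringChunk]]:
    """Expand a macro to a flat list of StringChunks, or None on failure."""
    macro = macros.get(name)
    if macro is None or len(macro.params) != len(args):
        return None
    env = dict(zip(macro.params, args))
    chunks: List[StringChunk] = []
    for node in macro.body:
        if isinstance(node, StringChunk):
            chunks.append(node)
        elif isinstance(node, Param):
            if node.name not in env:
                return None
            chunks.append(StringChunk(env[node.name]))
        else:  # Call: resolve its arguments, then expand recursively
            call_args = []
            for arg in node.args:
                if isinstance(arg, Param):
                    if arg.name not in env:
                        return None
                    call_args.append(env[arg.name])
                else:
                    call_args.append(arg)
            expanded = apply_macro(macros, node.macro, call_args)
            if expanded is None:
                return None
            chunks.extend(expanded)
    return chunks


def call_macro(macros: Dict[str, Macro], name: str,
               args: List[str]) -> Optional[str]:
    """Expand a macro and concatenate the resulting chunks."""
    chunks = apply_macro(macros, name, args)
    return None if chunks is None else "".join(c.text for c in chunks)


# Invented definitions, chosen only to reproduce the results shown above.
TEST_MACROS = {
    "testMacro1": Macro(params=(), body=(StringChunk("foo"),)),
    "testMacro2": Macro(params=("x",),
                        body=(StringChunk("bar"), Call("testMacro1"), Param("x"))),
}

if __name__ == "__main__":
    print(call_macro(TEST_MACROS, "testMacro2", ["20"]))  # barfoo20
```

This sketch does not guard against macros that call themselves, so a recursive definition would not terminate.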

@jelmervdl
Member

Looks ok to me. I don't really have anything to add.

@danieldk
Member

I thought a bit more about this, and there are some annoyances in implementing it:

  • We currently use a finite state automaton for parsing macros. Since calls can be nested, we need counting (e.g. to match parentheses), which a finite state automaton cannot do. We could do this in the actions, but that's nasty.
  • So we basically need a separate lexer and parser. The problem in lexing is that XPath fragments must be returned as tokens distinct from the tokens that belong to our macro processing. In other words, the macro tokens have to be strings that are unlikely to occur in XPath. This gets ugly quite quickly.
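To make the counting point in the first bullet concrete, here is a small Python sketch. The helper names `matching_paren` and `split_args` are made up for illustration and are not part of the existing code. Both need a depth counter to cope with nested calls such as `a(b(x),c(x))`, and that counter is exactly the state a finite state automaton lacks. The sketch deliberately ignores quoted strings and treats every parenthesis as macro syntax, which runs straight into the lexing problem from the second bullet, since XPath uses parentheses too.

```python
def matching_paren(text: str, open_index: int) -> int:
    """Return the index of the ')' that closes the '(' at open_index."""
    if text[open_index] != "(":
        raise ValueError("open_index must point at '('")
    depth = 0
    for i in range(open_index, len(text)):
        if text[i] == "(":
            depth += 1
        elif text[i] == ")":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced parentheses")


def split_args(arg_text: str) -> list:
    """Split an argument list on commas that are not inside nested calls."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(arg_text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(arg_text[start:i].strip())
            start = i + 1
    args.append(arg_text[start:].strip())
    return args


if __name__ == "__main__":
    call = "a(b(x),c(x))"
    end = matching_paren(call, 1)
    print(end)                        # 11
    print(split_args(call[2:end]))    # ['b(x)', 'c(x)']
```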

One solution that simplifies things a bit is to move macro invocations out of the query, using string interpolation à la Python. E.g.:

a(x) = """//node[%s and %s]""" % b(x), c(x)

Another solution is string concatenation:

a(x) = """//node[""" + b(x) + """ and """ + c(x) + """]"""

Though I am still wondering whether there is an off-the-shelf solution that we could use...
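For comparison, this is how the interpolation and concatenation variants above look in actual Python. The bodies of `b` and `c` are placeholders invented for the example, since the thread does not define them. Note that real Python needs the interpolated values wrapped in a tuple.

```python
def b(x: str) -> str:
    # Placeholder body, assumed for illustration only.
    return f'@lemma="{x}"'


def c(x: str) -> str:
    # Placeholder body, assumed for illustration only.
    return f'@root="{x}"'


def a_interpolated(x: str) -> str:
    # String interpolation: the macro calls sit outside the query text.
    return "//node[%s and %s]" % (b(x), c(x))


def a_concatenated(x: str) -> str:
    # String concatenation: query fragments and calls alternate.
    return "//node[" + b(x) + " and " + c(x) + "]"


if __name__ == "__main__":
    print(a_interpolated("drinken"))
    # //node[@lemma="drinken" and @root="drinken"]
    print(a_interpolated("drinken") == a_concatenated("drinken"))  # True
```

In both forms the query text stays an opaque string, so the macro parser never has to tokenize XPath itself.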

@jelmervdl reopened this Feb 23, 2013