Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experiment with getting @!group directive working for C parser. #1238

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

thomthom
Copy link
Contributor

@thomthom thomthom commented Apr 9, 2019

Description

This is experimenting into allowing @!group directive tags to be picked up by CParser. (#1237) This is a kludgy approach - but I wanted to get the conversation going as I'm not sure exactly where the correct fix should be applied.

From debugging it appear that CParser ignore all comments withing consume_body_statements - this prevents the directives from being activated for the scope of the function body that defines the class/method along with it's methods and constants.

In my experimental exploring I found that I could wrangle it to pick up the directives by capturing any comments starting with @! and assuming they are directive tags. Doing so ensured the group got applied to the objects within YARD::Registry.

The rest of the test suite also passes - but as you can see, I'm really forcing things here. If you have time to provide some insights to how this can be more appropriately addressed I'd appreciate it.

I stepped through both the C parser and Ruby parser comparing their logic - but I struggled to find a common pattern for the parsing that I could apply to backport to the CParser. The main thing I noticed was that the Ruby parser picked up the directive comments and registered them via register_docstring before the constants were processed. That's what this experiment applies.

Completed Tasks

  • I have read the Contributing Guide.
  • The pull request is complete (implemented / written).
  • Git commits have been cleaned up (squash WIP / revert commits).
  • I wrote tests and ran bundle exec rake locally (if code is attached to PR).

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) to 93.693% when pulling b02a7e9 on thomthom:dev-c-directives into 12f56cf on lsegal:master.

@coveralls
Copy link

coveralls commented Apr 10, 2019

Coverage Status

Coverage decreased (-1.2%) to 92.615% when pulling 7bb63be on thomthom:dev-c-directives into 890d4ad on lsegal:master.

@thomthom
Copy link
Contributor Author

@lsegal - any chance you are able to shed some light on the differences between the Ruby and the C parser? Something that let me avoid the kludge I currently got?

@lsegal
Copy link
Owner

lsegal commented Jul 29, 2019

@thomthom I think the implementation seems kludgey because adding to the OverrideCommentHandler is mixing multiple concerns together. If you had a separate handler just for directives it would probably look a lot simpler:

class YARD::Handlers::C::DirectiveParserHandler < YARD::Handlers::C::Base
  handles(/./)
  statement_class Comment

  process do
    Docstring.parser.parse(statement.source, namespace, self)
  end
end

I believe this works.

@thomthom
Copy link
Contributor Author

thomthom commented Sep 9, 2019

I had another look at this again. Reverted my hacks, kept the spec. What I'm seeing is that the DirectiveParserHandler is executed after YARD::Handlers::C::ConstantHandler so that the group directive have yet to be attached to the extra_data.group.

I added similar spec to the Ruby parser, trying to see how they behave differently, but so far failing.

Any suggestions to how to ensure the @!group directive is processed before the constant handler?
(My latest experiment is here: https://github.com/thomthom/yard/tree/dev-c-directices-handler)

@thomthom
Copy link
Contributor Author

thomthom commented Sep 9, 2019

Can this relate to differences in how Ruby and C is parsed? Ruby appear to traverse via the AST tree, where as C appear to be parsed by regex extraction, which makes things not to be processes in the order they appear in the source?

@thomthom
Copy link
Contributor Author

Here are my observations:

SourceParser#parse
  ...
  @parser.parse
  post_process

After @parser.parse is done @parser.enumerator (@statements) contains six items; one TopLevelStatement and five Comments. The TopLevelStatement contains four BodyStatement.

@parser.enumerator: (statements)
  YARD::Parser::C::ToplevelStatement
    @block:
      YARD::Parser::C::BodyStatement - rb_cExample = rb_define_class("Example", rb_cObject);
      YARD::Parser::C::BodyStatement - rb_define_const(rb_cExample, "FOOBAR", INT2NUM(1));
      YARD::Parser::C::BodyStatement - rb_define_const(rb_cExample, "FOOBIZ", INT2NUM(2));
      YARD::Parser::C::BodyStatement - rb_define_const(rb_cExample, "HELLO", INT2NUM(3));
  YARD::Parser::C::Comment - @!group Foos
  YARD::Parser::C::Comment - 1: Foobar description.
  YARD::Parser::C::Comment - 2: Foobiz description.
  YARD::Parser::C::Comment - @!endgroup
  YARD::Parser::C::Comment - 3: Hello description.

Even though the comments are in the same scope as the body statements the parser extracts them such that they comments end up in the outer "scope". It looks like the logic is that the comments are bounded in post processing.

But that means the @!group directive isn't being applied correctly.

With my kludge in consume_body_statements where I capture the directive it ends up looking like this:

@parser.enumerator: (statements)
  YARD::Parser::C::ToplevelStatement
    @block:
      YARD::Parser::C::BodyStatement - rb_cExample = rb_define_class("Example", rb_cObject);
      YARD::Parser::C::Comment - @!group Foos
      YARD::Parser::C::BodyStatement - rb_define_const(rb_cExample, "FOOBAR", INT2NUM(1));
      YARD::Parser::C::BodyStatement - rb_define_const(rb_cExample, "FOOBIZ", INT2NUM(2));
      YARD::Parser::C::Comment - @!endgroup
      YARD::Parser::C::BodyStatement - rb_define_const(rb_cExample, "HELLO", INT2NUM(3));
  YARD::Parser::C::Comment - @!group Foos
  YARD::Parser::C::Comment - 1: Foobar description.
  YARD::Parser::C::Comment - 2: Foobiz description.
  YARD::Parser::C::Comment - @!endgroup
  YARD::Parser::C::Comment - 3: Hello description.

Getting the directives processed within the same scope as the body statements appear to be important for this state to be turned on and off properly.

Trying to figure out a way that is less hacky than my proof of concept. But it seems it require some more fundamental change to the C parser. Unless I'm missing some other mechanism I ca use?

@thomthom
Copy link
Contributor Author

I haven't been able to figure out a solution that avoids copying the @!group back into the ToplevelStatement node. To avoid that the C parser might have to change significantly.

Do you have any further suggestions I can try?

@thomthom
Copy link
Contributor Author

thomthom commented Jun 3, 2020

@lsegal - any chance you have time to consider questions in earlier comment? #1238 (comment)

The current way C source is parsed and information stores doesn't seem to fit with the actual scope of the sources.

@lsegal lsegal closed this Jun 20, 2020
@flavorjones
Copy link

@lsegal It looks like the rename of the primary branch to main has auto-closed this and nine other PRs that were based off of primary branch under the previous name.

@lsegal
Copy link
Owner

lsegal commented Jun 20, 2020

Oh good catch. Let's reopen them

@lsegal lsegal reopened this Jun 20, 2020
@lsegal lsegal changed the base branch from master to main June 20, 2020 18:12
@thomthom
Copy link
Contributor Author

I'm having another look at this. @lsegal do you have a spare moment to look at my earlier question in this comment: #1238 (comment) ? I'd help me in what direction to attempt next.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants