new RFC: short error identifiers #33

Open

gasche
Member

@gasche gasche commented Sep 22, 2022

This RFC is more about a tooling change than a language change, but the RFC mechanism of agreeing on a spec before implementing feels appropriate here as well.

The idea is to include a stable error identifier in every error message generated by the OCaml compiler, for example to make it easier to search online for help with that error. (We learned of this idea from the Rust compiler, rustc.)

Rendered version of the RFC: https://github.com/gasche/RFCs/blob/error-identifiers/rfcs/error-identifiers.md

Current behavior:

# let x = 2 + "foo";;
Error: This expression has type string but an expression was expected of type
         int

Proposed behavior:

# let x = 2 + "foo";;
Error[tyco007]: This expression has type string but an expression
  was expected of type int
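
A minimal sketch of how such a header could be produced, assuming an identifier is a group prefix plus a per-group number. The type `error_id` and the functions `string_of_id` and `format_error` are hypothetical names chosen for illustration; they are not part of the compiler.

```ocaml
(* Hypothetical sketch: none of these names exist in the OCaml compiler. *)

(* An identifier pairs a short group prefix with a per-group number. *)
type error_id = { group : string; number : int }

(* Render the identifier as in the proposed output, e.g. "tyco007". *)
let string_of_id { group; number } = Printf.sprintf "%s%03d" group number

(* Prefix a message with "Error[<id>]: ", or with plain "Error: " when the
   error has no identifier yet. *)
let format_error ?id message =
  match id with
  | None -> Printf.sprintf "Error: %s" message
  | Some id -> Printf.sprintf "Error[%s]: %s" (string_of_id id) message

let () =
  let id = { group = "tyco"; number = 7 } in
  print_endline
    (format_error ~id
       "This expression has type string but an expression\n\
       \  was expected of type int")
```

Keeping the identifier optional would let errors migrate to identifiers one at a time.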

Co-Authored-By: Florian Angeletti <[email protected]>
@mimoo

mimoo commented Sep 22, 2022

I like it. I like that Rust also gives you a link that you can directly click on to get more information.

(Why the group prefix then? Actually it simplifies the implementation
to give per-module identifiers instead of trying to pick global
numbers and avoiding conflicts between two errors trying to use the
same number. It might also help users identify classes of errors.)
Contributor

I'm not really convinced by the format nor the rationale.

I think the fact that it's an abbreviation that maps to the module is likely to be interesting only to compiler devs. And the day your code moves to another module, what do you do with the uid? Is it that complicated to have a single module Error_id where you define errors and from which you draw?

Why not keep it to EXXX/WXXX, short and actually tells the user whether that's an error or a warning.

Member

The module name in the example was mostly a proxy for the class of errors. For instance, extending the examples

  • synt : syntax errors
  • tyco : core language type errors
  • tymo : module languages
  • tymi : module inclusion errors
  • tyex : type expression errors
  • link : linking error
  • cpmo : module compilation error

If we ever change the location/classification of an error, it seems fine to change the identifiers of the errors.

One advantage that I see with those identifiers is that thematically adjacent errors will share similar identifiers even after many renamings and moves (which don't happen that often). By contrast, with a dense set of numerical identifiers, a single split of an error might move two sibling errors far away from each other.
I kind of agree that the "themes" themselves might be only partially understandable at first glance for non-compiler developers, but I think that the similarity of the errors will make sense for users.

Defining all error datatypes in a single module does not really work that well with the current architecture: error constructors often carry complex payloads that are strongly tied to the concepts of the modules that raise them.
We could define the identifier part centrally, and dispatch the error constructors themselves to those identifiers, but that creates two distinct sources of truth and probably makes a coherency-checking tool more of a necessity.

Overall, we could definitely have a single set of numerical identifiers, but the idea of having groups of errors seemed slightly attractive both in terms of user experience and compiler development. But it could well be that my vision of user experience is far too biased in this specific instance.
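
To make the trade-off concrete, here is a rough sketch of per-group numbering together with the kind of coherency check mentioned above. The `group` type mirrors the list of prefixes; the entry lists and the `duplicates` function are invented for illustration and do not reflect how the compiler is organised.

```ocaml
(* Hypothetical sketch: the groups mirror the list above; the entries and
   the checking function are made up for illustration. *)

type group = Synt | Tyco | Tymo | Tymi | Tyex | Link | Cpmo

let prefix = function
  | Synt -> "synt" | Tyco -> "tyco" | Tymo -> "tymo" | Tymi -> "tymi"
  | Tyex -> "tyex" | Link -> "link" | Cpmo -> "cpmo"

(* Each part of the compiler would declare its own (group, number, summary)
   entries; numbers only need to be unique within a group. *)
let core_type_errors = [ (Tyco, 7, "expression type mismatch") ]
let inclusion_errors = [ (Tymi, 7, "signature mismatch") ]

(* Coherency check: return the identifiers that are declared more than once. *)
let duplicates entries =
  let ids =
    List.map (fun (g, n, _) -> Printf.sprintf "%s%03d" (prefix g) n) entries
  in
  let rec go acc = function
    | a :: (b :: _ as rest) ->
      go (if a = b && not (List.mem a acc) then a :: acc else acc) rest
    | _ -> List.rev acc
  in
  go [] (List.sort String.compare ids)

let () =
  (* Same number in two different groups: no conflict. *)
  assert (duplicates (core_type_errors @ inclusion_errors) = [])
```

With per-group numbers, two independent changes only collide when they add an error to the same group, which is the situation such a check would report.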

Copy link

My 2 cents

synt : syntax errors
tyco : core language type errors
tymo : module languages
tymi : module inclusion errors
tyex : type expression errors
link : linking error
cpmo : module compilation error

This map of knowledge only makes sense when you look at it in its entirety. When those abbreviations appear on your console out of context, they don't give any valuable information.

For example, between the first time "tyex" appears in your errors and the Nth time, when you realise it has some meaning (and that other codes denote other errors), it is already too late: by then you understand the map as a whole and no longer need the context.

I like the proposal: it adds searchability to errors, and given the many places to ask for and receive help online, that's huge. I would either make it indexable by a number or give the entire context in the error, for example SYNTAX_ERROR_X instead of synt.

PS: Sorry if I jumped into an issue where no one asked for my input.

Member Author

@davesnx this is a reasonable point, but we also want to balance it against the size of the identifier, which should not become too long: it is less interesting to humans than the actual human-readable explanation of the error, and yet it is displayed first. We could decide to move the error identifier to the end of the error message, which would un-constrain this aspect of the design space.

@dbuenzli my gut feeling is that having per-class numbers, instead of whole-compiler numbers, is going to make our life easier. For example, it is going to greatly decrease the number of situations where two independent compiler changes conflict (logically and/or textually) because they want to introduce the same error number.

@nojb

nojb commented Sep 23, 2022

Why not keep it to EXXX/WXXX, short and actually tells the user whether that's an error or a warning.

This was also my first thought. Nice and simple.

Rust also gives you a link that you can directly click on to get more information.

This also sounds like a good idea (though it is orthogonal to the issue of error numbers).

@Octachron
Member

Having links to examples and explanations in the manual is part of the longer plan. In particular, it would be easier to write examples and explanations for error messages once we are able to explicitly refer to an error.

@dbuenzli
Contributor

I don't mind having a short code for classifications of errors; it can certainly help to give context. But it should be one that makes sense to end users (which I would roughly say is: syntax, type, link), and it seems to me that it could fit in one or two chars, e.g. EsXXX, EtXXX, ElXXX.

What I don't understand is that you seem ok with changing the error code. That defeats the whole idea of having stable unique identifiers for errors. Once used, an error code should never change and never be reused.

Also to devise the scheme I think it would be useful to give an idea of the number of codes we are actually talking about.

@yallop
Member

yallop commented Sep 23, 2022

I like the idea, but I think it'd be better with mnemonic names, such as EExpectedType rather than tyco007.

We recently (ocaml/ocaml#9657) added the ability to write -w +unused-value-declaration rather than -w +32 for warning flags; it'd be good to have a similarly human-friendly design for error codes from the outset.

@dbuenzli
Contributor

I like @yallop's suggestion. Mnemonic names allow for quick error characterization without having to parse the whole error message or maintain arbitrary numbers in one's brain. It certainly makes it easier for people helping other people (including oneself :-).

If that is done, I would just suggest keeping the warning naming convention rather than adding a new one, i.e. expected-type rather than EExpectedType.
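
As a rough illustration of mnemonic names coexisting with short codes, the sketch below keeps a small table and looks entries up in both directions. The pairing of `expected-type` with `tyco007` is an assumption made here, and `unbound-value` / `tyco012` are invented entries; none of this is an agreed design.

```ocaml
(* Hypothetical table: the pairing of "expected-type" with "tyco007" is an
   assumption made for this sketch, and "unbound-value" / "tyco012" is
   invented. *)
let mnemonics = [ ("expected-type", "tyco007"); ("unbound-value", "tyco012") ]

(* Look up the short code behind a mnemonic name. *)
let code_of_mnemonic name = List.assoc_opt name mnemonics

(* Look up the mnemonic name behind a short code. *)
let mnemonic_of_code code =
  List.find_map (fun (m, c) -> if c = code then Some m else None) mnemonics

(* Build a header that shows both when a code is known,
   e.g. "Error[expected-type, tyco007]". *)
let header name =
  match code_of_mnemonic name with
  | Some code -> Printf.sprintf "Error[%s, %s]" name code
  | None -> Printf.sprintf "Error[%s]" name
```

This mirrors how warnings can be named either by number or by mnemonic, as in the `-w +32` versus `-w +unused-value-declaration` example mentioned earlier in the thread.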

@xavierleroy

I am reminded of IBM's xlc compiler, which adds codes like CCN3243 on every error and warning. Then there's a 388-page PDF document listing all codes with additional explanations: https://www-40.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R3gc147305/$file/cbcdg01_v2r3.pdf . These are not good memories... it feels like being in the movie Brazil.

What's wrong with just googling the full textual error message? It's not like we change them often.

@xavierleroy

I have another idea! If the error codes are 256-bit wide, they could be addresses of smart contracts in a friendly blockchain. Each contract would, then, print the detailed explanation of the error. For a fee, of course. Voilà! self-financing for the OCaml project!

@Octachron
Member

@dbuenzli The set of errors might evolve, for instance one error (Tyco001) might be split into two more specific errors (Tyco015 and Tyco016). In this case, the old error identifier should not be reused, and its documentation might point to the two new suberrors. But yes, I agree that the meaning of attributed identifiers should be stable.
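
The splitting scenario can be sketched as an append-only table in which a retired identifier records its successors. The `status` type, the table contents and the two functions are hypothetical; only the identifiers Tyco001, Tyco015 and Tyco016 come from the example above.

```ocaml
(* Hypothetical sketch of an append-only identifier table. *)

type status =
  | Active of string            (* short description of the error *)
  | Split_into of string list   (* retired; documentation points here *)

(* Entries are only ever appended or retired, never removed or reused. *)
let table =
  [ ("Tyco001", Split_into [ "Tyco015"; "Tyco016" ]);
    ("Tyco015", Active "first, more specific error");
    ("Tyco016", Active "second, more specific error") ]

(* Follow retirements until only active identifiers remain. This assumes the
   table has no cycles, which holds if successors are always newer entries. *)
let rec resolve id =
  match List.assoc_opt id table with
  | None -> []
  | Some (Active _) -> [ id ]
  | Some (Split_into ids) -> List.concat_map resolve ids

(* A new identifier is acceptable only if it has never been used before. *)
let can_allocate id = not (List.mem_assoc id table)
```

Here `resolve "Tyco001"` yields the two successor identifiers, while `can_allocate "Tyco001"` stays false forever, so a retired identifier keeps its meaning.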

@xavierleroy Identifying the skeleton part of the error message one should google is not completely straightforward. Even when using git grep on the compiler source, I sometimes have to try a few guesses before locating an error constructor from a testsuite failure, for instance.

If we go toward a human-readable identifier, it will give users a clear title for their error messages and avoid the dystopian feeling.

@dbuenzli
Contributor

What's wrong with just googling the full textual error message?

This doesn't work so well because error messages are context dependent (also somehow google search with arbitrary text is much less effective than it used to be). This means you have to trim them of your specifics to search them. A well defined token makes the whole process easier.

@gasche
Member Author

gasche commented Sep 23, 2022 via email

@Octachron
Member

For another data point, it seems that Haskell is also moving in this direction: https://errors.haskell.org .

@goldfirere
Contributor

I've been heavily involved with the Haskell errors process. I can share a few notes from our thoughts. (These are more our thoughts than our experiences, as it's all too fresh to have useful user feedback, say.)

  • We settled on opaque numbers rather than natural-language identifiers, though there was some debate. Here are some of the reasons why:

    • Error numbers are shorter.
    • Error numbers are easier to detect in a string. (All our error numbers start with a GHC prefix.) This allows tooling to detect the error code and render it as a link, say.
    • If an error name indicates that it's, say, a type-checker error, that fact may change in a future release. For example, GHC has a name-resolution pass separate from type-checking. An out-of-scope variable used to be a name-resolution error, but it since became a type-checker error (because this allows GHC to provide type information with the out-of-scope error). If we indicated the component of the compiler in the error code, either the code would have had to change or become wrong.
    • We allocate our identifiers at random, not sequentially. This provides a level of abstraction over when the error was introduced and what component it came from. It would be awkward, in two years, to have errors 01-87, and then 345, 389, and 402 all come from the type-checker. Users would know that those last errors are somehow different than the earlier ones.

    As I said above, there was some debate within Haskell about this decision. While I agree with where Haskell settled here, I think it's quite reasonable to go in the other direction.

  • We wanted to expand this facility beyond just GHC, and so we prefix all our error codes with "GHC", in the hopes that other tools can join the fun with other prefixes. The Haskell Foundation is tasked with controlling the prefix namespace, but this is a very light duty. It's just that some group needs to be in charge of the namespace.

  • One explicit goal of the prefix is to make error messages identify their source. This is sometimes hard, today. For example, is an error message coming from the ocaml compiler? or from opam? or from dune? or from merlin? I don't yet have enough experience with OCaml to know whether this is a problem, but it is a real problem in the Haskell world.

  • All of the infrastructure behind the website is public (https://github.com/haskellfoundation/error-message-index). Sharing efforts here between our communities seems like an easy win.
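
The point above about prefixed codes being easy to detect can be illustrated with a small scanner. This is only a sketch: the assumed code shape (prefix, a dash, then digits) and the function name `find_code` are assumptions, not a description of any existing tool.

```ocaml
(* Find the first occurrence of "<prefix>-<digits>" in [s], if any.
   The exact code shape (prefix, dash, digits) is an assumption. *)
let find_code ~prefix s =
  let tag = prefix ^ "-" in
  let n = String.length s and k = String.length tag in
  let is_digit c = c >= '0' && c <= '9' in
  (* Index just past the run of digits starting at [i]. *)
  let rec digits_end i =
    if i < n && is_digit s.[i] then digits_end (i + 1) else i
  in
  let rec scan i =
    if i + k > n then None
    else if String.sub s i k = tag then
      let j = digits_end (i + k) in
      (* Require at least one digit after the dash. *)
      if j > i + k then Some (String.sub s i (j - i)) else scan (i + 1)
    else scan (i + 1)
  in
  scan 0
```

A tool could call `find_code ~prefix:"GHC"` on each line of output and turn any match into a link, without having to understand the rest of the message.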

@nyambura00

After a little wandering, #Outreachy contribution phase led me here.

I just got exposed to OCaml and I certainly love its speed and expressiveness. I've played around with the basic features and there is certainly a need for simpler error identification.

I think Rust's approach to error identification is pretty impressive (one of the major reasons I stuck around), something that we could perhaps borrow if the compiler allows.

The identification structure could perhaps include a severity level enum (Error/Warning/Help, etc.), a code (prefix + int), a concise message clause, and perhaps a diagnostic window showing the relevant primary and secondary spans.

@mobileink

You could do worse than follow the example of IBM - the mainframe stuff, not the C compilers.

reason codes

messages and codes

z/OS MVS System Messages

@gasche
Member Author

gasche commented Dec 23, 2022

For the record: @Octachron and myself intended this RFC to be the basis for an Outreachy internship, but we ended up without an intern to work on this. If someone is interested in contributing in this area, feel free to let us know -- no one is actively working on it but @Octachron is still interested in supervising.

@gasche
Member Author

gasche commented Feb 11, 2023

People have (rightly) suggested that the error identifier could/should be a clickable link going to a webpage with documentation about the error. I wasn't sure how to include clickable links in terminal output. The following page has good information on this:

https://gist.github.com/egmontkob/eb114294efbcd5adb1944c9f3cb5feda

(This is related to work on structured error output: many people consume error messages not through a terminal but rather through Merlin/LSP and their editor integration, and for those we probably need structured errors to enable proper hyperlinking.)
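
For reference, the escape sequence described in that gist (OSC 8) can be emitted with a few lines of OCaml. The sketch below wraps the identifier in such a hyperlink; the `doc_url` scheme and the example.org address are placeholders invented here, since no documentation URL has been decided.

```ocaml
(* OSC 8 hyperlink: ESC ] 8 ; ; URL ESC \  TEXT  ESC ] 8 ; ; ESC \
   Terminals without OSC 8 support are expected to show only TEXT. *)
let hyperlink ~url text =
  Printf.sprintf "\027]8;;%s\027\\%s\027]8;;\027\\" url text

(* Hypothetical documentation URL scheme, invented for this sketch. *)
let doc_url id = "https://example.org/errors/" ^ id

(* Make only the identifier clickable, leaving the visible text unchanged. *)
let error_header id =
  Printf.sprintf "Error[%s]" (hyperlink ~url:(doc_url id) id)

let () = print_endline (error_header "tyco007" ^ ": ...")
```

The visible output stays `Error[tyco007]: ...`, so the identifier remains the information and the URL is carried invisibly.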

@yawaramin

That gist is an interesting overview but perhaps we are overthinking things here? As far as I can tell nowadays most terminals support either clicking or right-clicking on plain links that start with https:// or other schemes to open them in the default browser.

@dra27
Member

dra27 commented Feb 21, 2023

Displaying URLs all the time adds noise, though - the error identifier is the information, not the URL to find out more. I played around with this very briefly in opam ages ago (see ocaml/opam#4568), but it wasn't as compelling there because we do also want to display the URL in opam show.

@yawaramin

yawaramin commented Feb 21, 2023

To me, the 'URL to find out more' is the most important part of the workflow when you show the user an error message and short identifier. You just got an error (or otherwise) message and an ID, what are you going to do with it? You can Google it, if it has enough SEO juice you might get lucky and find it. Or you can search for it on the OCaml Discuss, or website and again rely on the vagaries of search engines.

If we want to make the 'find out more' part less frustrating and more reliable, we have to think about the pages that actually provide the detailed information about the error IDs. What are their URLs? What are their contents? How are they maintained? How do we ensure people can find them easily?

The last question is what I am trying to answer with e.g. ocaml/ocaml.org#916 . If we directly print the hyperlinks, then it's super easy to find out more. E.g.,

utop # 1+true;;
Error: This expression has type bool but an expression was expected of type int
🔗 https://ocaml.org/t/tyco007

Which would just 301 redirect to the actual page where the error message is explained.

@mobileink

mobileink commented Feb 21, 2023

To me, the 'URL to find out more' is the most important part of the workflow when you show the user an error message and short identifier.

Who is going to maintain those links? I think it is virtually guaranteed that they will break and then you end up with an even worse situation, dangling URLs.

The compiler should not depend on external resources it does not control. Using unique identifiers (and a consistent formally defined syntax for messages) offloads the task of illuminating the error meaning to third-party tools, which is where it belongs, IMO. In other words, the compiler should not be in the business of designing workflows; rather it should provide the resources that tooling can use to design them.

@Octachron
Member

In terms of dangling pointers, an interesting data point is that some error messages already point towards sections of the manual.

However, since the manual is part of the compiler source tree, the consistency of those links is already checked in the compiler CI. This consistency check seems to work empirically, since I didn't have to remind people to update those links the last time new sections were added to the manual.

I could potentially extend this test to cover the error IDs if the error explanations were in the manual. At the same time, if the error message explanations are in a sufficiently discoverable section of the manual, do we still need URL links?

@yawaramin

@mobileink yes these are all great questions. How to ensure that links to pages which explain the error IDs continue to work in the future? I understand that historically the compiler hasn't done this and has left it up to tooling creators. I am saying that if we treat it as a holistic issue of user experience, it would be a big win for users if error messages pointed to instantly accessible pages for detailed information.

@Octachron pointing to links published from sources in the compiler source tree is a fine relatively safe alternative, in my opinion. E.g.

utop # 1+true;;
Error: This expression has type bool but an expression was expected of type int
🔗 https://v2.ocaml.org/releases/5.0/htmlman/coreexamples.html#s:basics

It's not as short, but the concision is not the point, the ease of access is. The most important point is that the links not break, and for that some coordination would be required with the ocaml.org repo.

If links are totally unacceptable, then at least we can point to specific sections of the manual using some recognizable citation format. We would probably need to add appendices to the manual for error messages, warnings, etc.

@davesnx

davesnx commented Feb 21, 2023

I have seen two big projects from the frontend/JavaScript world use this approach: React and Next.js. They use the same technique: a shortened link that points to a URL, shown when an error/warning happens. Maintaining a list of errors that points to a website isn't that big of a deal, as long as the list is append-only and any deletion adds a redirect. It might be helpful to have a shortener that allows this.

To add more weight to this: it's very useful not only for getting more information, but also for sharing the error publicly while helping others or asking for help.

@gasche
Member Author

gasche commented Feb 21, 2023

My impression is that "showing the error code, with the appropriate escape codes to have a clickable link" and "showing a URL" are two reasonable approaches with pros and cons, that are generally okay. From the point of view of users and projects, they are in fact rather close. (The URL idea adds slightly more convenience and a new piece to maintain.) Maybe we could postpone the final discussion on that particular aspect of the design until we have had more discussions on the rest and, in particular, some visibility on who may implement the RFC?

@yawaramin

Sure. If we postpone the URL/clickable link idea, then to my eyes there are two main points left:

  1. What messages will be assigned IDs and what will the IDs be
  2. Where will the detailed info corresponding to each ID be published

Of course this also leaves all the implementation details and actual people to be decided.

@mobileink

mobileink commented Feb 21, 2023 via email

@yawaramin

Could you give an example of what you mean by 'formal syntax of messages'? E.g. the PR description shows a possible message:

# let x = 2 + "foo";;
Error[tyco007]: This expression has type string but an expression
  was expected of type int

Are you thinking of a (BNF?) syntax that just formalizes the above message, or something different?

@smorimoto
Member

I honestly don't like methods that rely on search engines, or anything that is useless in an offline environment. So have you thought about an offline-first, online-ready hybrid reference using the concepts that have worked well in odoc? I even thought we could take advantage of the offline side and work with LSP and others to do something interesting fast. However, this is just an idea.

@mobileink

Could you give an example of what you mean by 'formal syntax of messages'? E.g. the PR description shows a possible message:

# let x = 2 + "foo";;
Error[tyco007]: This expression has type string but an expression
  was expected of type int

Are you thinking of a (BNF?) syntax that just formalizes the above message, or something different?

I don't have anything specific in mind. It just has to be formally specified, to ensure that all messages are easily and unambiguously parseable, so tools don't have to guess.
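
To illustrate what an unambiguously parseable header buys tooling, here is a sketch of a parser for one possible grammar: `("Error" | "Warning") [ "[" id "]" ] ": " text`. That grammar is an assumption made for this example, and the `severity` and `header` types and the `parse_header` function are hypothetical; nothing here has been specified by the RFC.

```ocaml
(* Hypothetical grammar for the first line of a message:
     ("Error" | "Warning") [ "[" id "]" ] ": " text *)

type severity = Sev_error | Sev_warning

type header = { severity : severity; id : string option; text : string }

(* [drop_prefix p s] is [Some rest] when [s] is [p ^ rest]. *)
let drop_prefix p s =
  let lp = String.length p and ls = String.length s in
  if ls >= lp && String.sub s 0 lp = p then Some (String.sub s lp (ls - lp))
  else None

let parse_header line =
  let severity_and_rest =
    match drop_prefix "Error" line, drop_prefix "Warning" line with
    | Some rest, _ -> Some (Sev_error, rest)
    | None, Some rest -> Some (Sev_warning, rest)
    | None, None -> None
  in
  match severity_and_rest with
  | None -> None
  | Some (severity, rest) ->
    (* Split off the optional "[id]" part; an empty id is rejected. *)
    let id_and_rest =
      if rest <> "" && rest.[0] = '[' then
        (match String.index_opt rest ']' with
         | Some close when close > 1 ->
           let id = String.sub rest 1 (close - 1) in
           let after =
             String.sub rest (close + 1) (String.length rest - close - 1)
           in
           Some (Some id, after)
         | _ -> None)
      else Some (None, rest)
    in
    (* Whatever remains must start with ": ". *)
    (match id_and_rest with
     | None -> None
     | Some (id, after) ->
       Option.map (fun text -> { severity; id; text }) (drop_prefix ": " after))
```

With a fixed grammar like this, a tool either gets a structured header or a clear `None`, and never has to guess where the identifier ends.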

@rgrinberg
Member

Apologies if this has been mentioned; I haven't read the entire thread. If the manual is installed as manpages along with the compiler, what about just printing a man command that will jump to the error in the manual?

@yawaramin

yawaramin commented May 20, 2023

That's not a bad idea but:

  • Will man pages work in a Windows installation of OCaml? I don't think so unless we are using WSL?
  • Can we point to specific sections inside a single man page? Maybe if we organize e.g. each separate error message into a separate submodule so e.g. man Errors.Type_error. But this seems pretty clunky.

EDIT: on second thought, it seems kinda clever:

(* stdlib/errors.mli *)

module Type_error : sig end
(** Type mismatch error,... *)

module Syntax_error : sig end
(** Usually because an OCaml keyword was used as an identifier,... *)

We could get the compiler to check the error identifiers are unique. And we could just tell people to look up the documentation for Errors.Bla_bla in whatever form they prefer.

E.g.

# let x = 2 + "foo";;
Stdlib.Errors.Type_error: This expression has type string but an expression
  was expected of type int
