Parse File ... not found messages #297

Draft
wants to merge 6 commits into
base: main
from

9999years
Member

@9999years 9999years commented Jun 17, 2024

This will help us keep the module set in sync more reliably.

Depends on:

@9999years 9999years added the patch Bug fixes or non-functional changes label Jun 17, 2024

linear bot commented Jun 17, 2024

@9999years 9999years changed the base branch from main to rebeccat/refactor-module-set June 21, 2024 20:14
@9999years 9999years force-pushed the rebeccat/dux-2341-parse-module-not-found-messages branch from 7a91ede to 2bb67b8 Compare June 21, 2024 20:15
9999years added a commit that referenced this pull request Jun 21, 2024
This is useful for line-oriented parsers that may consume input with no
trailing newline character. While this is a [violation of the POSIX
spec][posix], VS Code [does it by default][vscode].

[posix]: https://stackoverflow.com/a/729795
[vscode]:
https://stackoverflow.com/questions/44704968/visual-studio-code-insert-newline-at-the-end-of-files

Split off from #297
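
As a rough illustration of the idea, here is a minimal standard-library sketch of a line splitter that treats end of input as a valid line terminator. The helper name `next_line` is hypothetical and is not taken from the commit; it only shows the behavior described above.

```rust
/// Split one line off the front of `input`, returning `(line, rest)`.
///
/// A line ends at `\n` or at the end of the input, so a final line with no
/// trailing newline is still returned. Returns `None` only for empty input.
fn next_line(input: &str) -> Option<(&str, &str)> {
    if input.is_empty() {
        return None;
    }
    match input.split_once('\n') {
        // Ordinary case: the line is terminated by a newline.
        Some((line, rest)) => Some((line, rest)),
        // No newline left: the remaining input is the last line.
        None => Some((input, "")),
    }
}
```

Calling `next_line` repeatedly on the returned `rest` walks the input one line at a time and stops with `None` once everything has been consumed.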
9999years added a commit that referenced this pull request Jun 21, 2024
Previously, a `ModuleSet` wrapped a `HashMap<NormalPath, TargetKind>`.
This had a number of undesirable consequences:
* The data about a module's name (as its path) and how it was loaded (as
  its `TargetKind`) were split from each other and difficult to
  reference.
* The module's import name wasn't explicitly stored anywhere, so we
  needed to convert between paths and dotted names when those were
  needed, which required hitting the disk.
* There wasn't a type for the module's import name, so when we (e.g.)
  `:unadd`ed modules we needed to format them as strings.

Now, a `ModuleSet` wraps a `HashSet<LoadedModule>`.

* A `LoadedModule` wraps a path but optionally contains the module's
  dotted name, if the module is loaded by name (and needs to be referred
  to by name to avoid the "module defined in multiple files" error).
* The `LoadedModule` `Display` instance formats the module's import name
  correctly (with a dotted name if needed) and avoids hitting the disk
  or any string processing.
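
A minimal sketch of the shape described above, under stated assumptions: the field names are hypothetical, `PathBuf` stands in for `NormalPath`, and the real types in this PR may differ.

```rust
use std::collections::HashSet;
use std::fmt;
use std::path::PathBuf;

/// Hypothetical stand-in for the `LoadedModule` described above.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct LoadedModule {
    /// Path to the module's source file (assumed; the original uses `NormalPath`).
    path: PathBuf,
    /// Dotted name, present only when the module was loaded by name.
    name: Option<String>,
}

impl fmt::Display for LoadedModule {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.name {
            // Loaded by name: refer to the module by its dotted name.
            Some(name) => write!(f, "{name}"),
            // Loaded by path: refer to the module by its path.
            None => write!(f, "{}", self.path.display()),
        }
    }
}

/// Hypothetical stand-in for `ModuleSet`: a thin wrapper around a `HashSet`.
#[derive(Debug, Default)]
struct ModuleSet {
    modules: HashSet<LoadedModule>,
}

impl ModuleSet {
    /// Insert a module, returning `true` if it was not already present.
    fn insert(&mut self, module: LoadedModule) -> bool {
        self.modules.insert(module)
    }
}
```

Keeping the path and the optional name in one value means the formatting decision lives in a single `Display` implementation and needs no filesystem access.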
@9999years 9999years force-pushed the rebeccat/refactor-module-set branch from a93cc3f to ffde017 Compare June 21, 2024 20:58
GHC output contains quoted fragments:

    Module graph contains a cycle:
            module ‘C’ (./C.hs)
            imports module ‘A’ (A.hs)
      which imports module ‘B’ (./B.hs)
      which imports module ‘C’ (./C.hs)

When Unicode output is not available, the Unicode quotes are replaced
with GNU-style ASCII quotes:

    module `C' (./C.hs)

However, when the quoted text starts or ends with a single quote, ASCII
quotes are omitted. This leads to ambiguous output:

    A   → `A'
    A'  → A'
    `A' → `A'
    'A  → 'A
    'A' → 'A'

Correctly parsing this is challenging.
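
To make the ambiguity concrete, here is a small heuristic sketch (not the parser from this PR; the function name `unquote` is hypothetical). It strips one layer of quotes when both delimiters are present and otherwise returns the text unchanged.

```rust
/// Strip one layer of GHC-style quotes from `s`, if present.
///
/// This is a heuristic: the ASCII form `` `A' `` may be either a quoted `A`
/// or the unquoted name `` `A' ``, and this function always assumes the former.
fn unquote(s: &str) -> &str {
    // Unicode quotes: ‘text’
    if let Some(inner) = s.strip_prefix('‘').and_then(|rest| rest.strip_suffix('’')) {
        return inner;
    }
    // GNU-style ASCII quotes: `text'
    if let Some(inner) = s.strip_prefix('`').and_then(|rest| rest.strip_suffix('\'')) {
        return inner;
    }
    // No recognizable quotes (for example `A'` or `'A'`): return as-is.
    s
}
```

The last three rows of the table above pass through unchanged, while the first and third rows both produce `A`, which is exactly the case that cannot be resolved from the output alone.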

This probably increases the amount of backtracking and lookahead
required for these parsers. Not sure if that's significant or relevant.

Haskell source paths, as GHC understands them, are remarkably
permissive: they must end with one of the source extensions (now more
accurately listed here, with references to the upstream GHC code), but
can otherwise contain quirks up to and including multiple extensions,
whitespace, and newlines.

GHCi is actually even more lenient than this in what it accepts: it'll
automatically append `.hs` and `.lhs` to paths you give it and check if
those exist. Fortunately, `:show targets` and diagnostics print the
resolved source paths:

```text
ghci> :add src/MyLib
[1 of 1] Compiling MyLib            ( src/MyLib.hs, interpreted )

ghci> :show targets
src/MyLib.hs

ghci> :add src/Foo
target ‘src/Foo’ is not a module name or a source file

ghci> :add src/MyLib.lhs
File src/MyLib.lhs not found

ghci> :add "src/ Foo.hs"
File src/ Foo.hs not found

ghci> :add "src\n/Foo.hs"
File src
/Foo.hs not found
```
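
Based on the transcript above, a minimal sketch of extracting the path from a single `File ... not found` message might look like the following. The function name `parse_file_not_found` is hypothetical, and the sketch assumes the message has already been isolated from the surrounding GHCi output, which is the harder part when paths may contain newlines.

```rust
/// Extract the path from a complete `File <path> not found` message.
///
/// The path is whatever lies between the fixed prefix and suffix, so it may
/// contain spaces and newlines. Returns `None` for any other message.
fn parse_file_not_found(message: &str) -> Option<&str> {
    message
        // Tolerate a trailing newline after the message.
        .trim_end_matches('\n')
        .strip_prefix("File ")?
        .strip_suffix(" not found")
}
```

Anchoring on both the prefix and the suffix, rather than splitting on whitespace, is what lets paths such as `src/ Foo.hs` come through intact.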
@9999years 9999years force-pushed the rebeccat/dux-2341-parse-module-not-found-messages branch from 2bb67b8 to 492bfa1 Compare June 21, 2024 21:05
Base automatically changed from rebeccat/refactor-module-set to main June 21, 2024 22:54