Parser Improvement: Support ” along with " #1074

Open
hellovai opened this issue Oct 21, 2024 · 0 comments · May be fixed by #1249

@hellovai

LLMs will sometimes output something like this:

Sure! Here is a made-up JSON blob that matches the schema you provided:
\```
{
  "prop1": "example",
  "prop2": {
    "prop1": "value1",
    "prop2": "value2”,
    "inner": {
      "prop2": 42,
      "prop3": 3.14,
    }
  }
}
\```

where the blob would parse if the closing ” on "value2” were a straight ". We can likely correct for this.

revidious added a commit to revidious/baml that referenced this issue Dec 16, 2024
LLMs sometimes output JSON with curly quotes (U+201C/U+201D) instead of straight quotes (U+0022). This change adds automatic normalization of these quotes during parsing, making the JSON parser more robust to LLM output variations.

Changes:
- Add normalize_quotes() function to convert curly quotes to straight quotes
- Apply normalization before all parsing attempts (serde, markdown, multi-json)
- Preserve original string for error messages and output

Fixes BoundaryML#1074
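
The normalization described in this commit message could be sketched roughly as follows. This is an illustration in Python, not the code from the referenced commit: only the name normalize_quotes() comes from the message above, while parse_lenient() and the use of the standard json module are assumptions made for the example.

```python
import json

# Curly double quotes that LLMs sometimes emit in place of U+0022.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"'})


def normalize_quotes(text: str) -> str:
    """Replace curly double quotes (U+201C, U+201D) with straight quotes."""
    return text.translate(CURLY_TO_STRAIGHT)


def parse_lenient(raw: str):
    """Hypothetical helper: parse the normalized text, but keep the
    original string for the error message, as the commit describes."""
    try:
        return json.loads(normalize_quotes(raw))
    except json.JSONDecodeError as err:
        raise ValueError(f"could not parse: {raw!r}") from err


# A reduced version of the blob from the issue, with a curly closing quote.
broken = '{"prop1": "value1", "prop2": "value2\u201d}'
print(parse_lenient(broken))
```

One caveat with a blanket replacement like this: it also rewrites curly quotes that legitimately appear inside string values, which can change the content or leave an unescaped straight quote in the middle of a string. A more careful version might only normalize after a first parse attempt on the original text fails.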