Fix yyjson parity with JSONSerialization#328
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates the yyjson-backed YYJSONParser to better handle real-world tokenizer JSON files that previously loaded via Foundation-based parsing, and adds tests to lock in behavior (including edge cases like trailing commas and NULs in strings).
Changes:
- Relax yyjson read options to allow trailing commas, BOM prefixes, and comments.
- Fix string/key decoding to preserve embedded NUL (
\u0000) by decoding using yyjson’s reported byte length. - Add expanded unit tests, including a new Foundation-vs-YYJSON comparison suite.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| Sources/Hub/YYJSONParser.swift | Enables permissive yyjson read flags and fixes string/key decoding to preserve embedded NUL bytes. |
| Tests/TokenizersTests/YYJSONParserTests.swift | Adds coverage for trailing commas/BOM/comments and introduces a Foundation comparison test suite. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Sources/Hub/YYJSONParser.swift
Outdated
| YYJSON_READ_ALLOW_INF_AND_NAN | ||
| | YYJSON_READ_ALLOW_TRAILING_COMMAS | ||
| | YYJSON_READ_ALLOW_BOM | ||
| | YYJSON_READ_ALLOW_COMMENTS |
There was a problem hiding this comment.
Do we have any examples in the wild of config files with comments? Maybe it's overly defensive, but I'd prefer to keep this narrowly tailored.
There was a problem hiding this comment.
Removed this and the related tests with 63e0d32. Happy to revert/reapply that in a follow-up as needed.
mattt
left a comment
There was a problem hiding this comment.
@ZachNagengast Thank you for this!
LGTM after removing allowance for comments (which aren't supported by JSONSerialization).
|
Good catch, appreciated the review! |
We noticed a breaking change from #304 with one of our tokenizers that had trailing commas in it.
yyjson read failed (code 7) at position 118: trailing comma is not allowed.Luckily yyjson has a flag for this (expanding on @ronaldmannak's work in #324) so it can be brought back into parity with the previous JSONSerialization, but it requires a new release because protocolization was decided against.
This PR adds flags to reduce the validation error checks down to a level that JSONSerialization was allowing before.
Along with this, added some comprehensive tests to make sure we stay in parity with JSONSerialization in the future.
Through writing these tests, I also found another breaking change regarding NUL characters where
String(cString: "before\u0000after")would stop parsing after the NUL part, returningbeforeonly instead ofbeforeafter, but this was on the swift side not related to yyjson.