Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some serialization cleanup, optimization, fixes #1864

Closed
wants to merge 3 commits into from

Conversation

colinator27
Copy link
Member

@colinator27 colinator27 commented Aug 12, 2024

Description

  • Removed more cases of #if DEBUG, to ensure consistent behavior on Release builds.
  • Added more safety checks to binary readers, and to deserialization in general (preventing unintended negative lengths/jumps that could lead to infinite loops).
  • Fixed EmSize on fonts (when in float form) being accidentally always parsed as 0, due to a typo.
  • Completely refactored code instruction deserialization, removing many calls and I/O operations.
  • Commented and reformatted code instruction deserialization/serialization.
  • Fixed QOI padding being added when saving, when none seems to have ever actually existed.

Caveats

Has potential to break some instruction parsing if I typo'd somewhere, but I did test it on multiple games (including Undertale 1.00, a bytecode 14 game), and it seems to work fine. If it does break, it'll be pretty obvious, as re-saving a data.win file will show differences in the code.

Notes

Probably too risky to merge into stable, although in the event that we make a minor release, we may want to backport the EmSize typo fix, as that is pretty diabolical. It affects the Deltarune Chapter 1 & 2 demo, version 1.15 (which I tested this PR on).

- Removed more cases of `#if DEBUG`, to ensure consistent behavior on Release builds.
- Added more safety checks to binary readers, and to deserialization in general (preventing unintended negative lengths/jumps that could lead to infinite loops).
- Fixed `EmSize` on fonts (when in float form) being accidentally always parsed as 0, due to a typo.
- Completely refactored code instruction deserialization, removing many calls and I/O operations
- Commented and reformatted code instruction deserialization/serialization
@colinator27 colinator27 requested a review from Miepee August 12, 2024 20:59
Copy link

github-actions bot commented Aug 12, 2024

Seems like it was never there
@BenjaminUrquhart
Copy link
Member

The EmSize typo prevents the Xbox version of Undertale from saving 1 to 1. Might be worth its own small PR.

@Miepee
Copy link
Contributor

Miepee commented Aug 18, 2024

the emsize fix should be removed from here now

Copy link
Contributor

@Miepee Miepee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only reviewed the changes in UndertaleCode so you have something to work with.
Please split the PR up into the following

  • #if debug changes
  • code commenting in UTcode serialization
  • code refactor in UTcode deserialization
  • binary reader checks
  • QOI padding fix.

@@ -21,10 +21,8 @@ public long Position
get => Stream.Position;
set
{
#if DEBUG
if (value > Length)
if (value < 0 || value > Length)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, can be changed into a pattern

@@ -86,10 +86,8 @@ public long AbsPosition
{
if (isUsingBufferReader)
{
#if DEBUG
if (value > Length)
if (value < 0 || value > Length)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see above

@@ -110,6 +110,8 @@ Opcode.PushBltn or Opcode.PushI
_ => throw new IOException("Unknown opcode " + op.ToString().ToUpper(CultureInfo.InvariantCulture)),
};
}

// Converts from bytecode 14 instruction opcodes to modern opcodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proper docstring please

else
{
// Bytecode 15 and above encode a comparison kind outside of the opcode
writer.Write((byte)ComparisonKind);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mental note: check where comparisonKind is all used, and potentially make it a byte enum
unsure if thats addressed by underanalyzer.


case InstructionType.PopInstruction:
{
if (Type1 == DataType.Int16)
{
// Special scenario - the swap instruction
// TODO: Figure out the proper syntax, see #129
// Special scenario - the swap instruction (see #129)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally we'd have a proper wiki for this than linking to issues lol

@@ -382,22 +384,39 @@ public Reference<T> GetReference<T>(bool allowResolve = false) where T : class,
/// <inheritdoc />
public void Serialize(UndertaleWriter writer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a more general note, but with how many small writing operations are done, i wonder if it makes sense to buffer them somewhere and then write them later all at once.

{
throw new IOException($"Invalid padding in {kind.ToString().ToUpper(CultureInfo.InvariantCulture)}");
}
if (instructionType == InstructionType.SingleTypeInstruction && type2 != 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newline in between please

}
}

// Cheeck for "and.b.b" or "or.b.b", which imply the code was compiled without short-circuiting
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Cheeck for "and.b.b" or "or.b.b", which imply the code was compiled without short-circuiting
// Check for "and.b.b" or "or.b.b", which imply the code was compiled without short-circuiting

if (JumpOffset == -1048576) // magic? encoded in little endian as 00 00 F0, which is like below
JumpOffsetPopenvExitMagic = true;
reader.Position++;
// Bytecode 14 has slightly simplified parsing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Bytecode 14 has slightly simplified parsing
// Bytecode 14 has slightly different parsing

// TODO: Figure out the proper syntax, see #129
SwapExtra = (ushort)TypeInst;
TypeInst = 0;
// Special scenario - the swap instruction (see #129)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

@colinator27 colinator27 deleted the serialization-cleanup branch August 23, 2024 18:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants