fix(serializer): preserve Decimal values and non-primitive dict keys#1707
fix(serializer): preserve Decimal values and non-primitive dict keys#1707i-anubhav-anand wants to merge 2 commits into
Conversation
EventSerializer silently dropped data in two cases:
- decimal.Decimal had no branch, so it fell through to the "<Decimal>"
type fallback and the value was lost. Route it through float() so it is
serialized as a number (and Decimal("NaN")/Decimal("Infinity") reuse the
existing special-value handlers).
- A dict whose key serializes to a container/object (e.g. tuple, set,
custom object) raised inside the dict comprehension / json encoding,
which the except clause turned into "<not serializable ...>" for the
WHOLE dict. Stringify such keys via str(key) so one non-primitive key
no longer discards every value in the dict.
Adds unit tests covering both (fail before, pass after). Existing
datetime/uuid/enum/int key behavior is unchanged.
| if isinstance(obj, decimal.Decimal): | ||
| return self.default(float(obj)) |
There was a problem hiding this comment.
Decimal precision silently lost on float conversion
float() can only represent ~15–17 significant digits, so any Decimal with higher precision is silently rounded. For example, Decimal("1.0000000000000001") becomes 1.0 and Decimal("123456789012345678") would gain rounding error. Since Decimal is often chosen specifically to avoid float imprecision (financial, scientific), users may find their high-precision values quietly corrupted in trace data. A comment documenting this trade-off would help, or alternatively the value could be converted to str instead, preserving the exact decimal representation while still avoiding the opaque "<Decimal>" fallback.
Prompt To Fix With AI
This is a comment left during a code review.
Path: langfuse/_utils/serializer.py
Line: 79-80
Comment:
**Decimal precision silently lost on float conversion**
`float()` can only represent ~15–17 significant digits, so any `Decimal` with higher precision is silently rounded. For example, `Decimal("1.0000000000000001")` becomes `1.0` and `Decimal("123456789012345678")` would gain rounding error. Since `Decimal` is often chosen specifically to avoid float imprecision (financial, scientific), users may find their high-precision values quietly corrupted in trace data. A comment documenting this trade-off would help, or alternatively the value could be converted to `str` instead, preserving the exact decimal representation while still avoiding the opaque `"<Decimal>"` fallback.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Addressed in the latest commit — switched to str(obj) so the exact Decimal value is preserved (this also avoids the OverflowError that float() raises on very large Decimals). NaN/Infinity still render as "NaN"/"Infinity". Tests updated to assert exact string output incl. high-precision and >JS-safe-int cases.
Address review feedback: float(Decimal) silently rounds high-precision values and can OverflowError on very large ones. Use str() to preserve the exact value (NaN/Infinity still render as "NaN"/"Infinity"/"-Infinity", matching the float handling). Update tests to assert exact string output incl. high-precision and >JS-safe-int cases.
Two cases where
EventSerializersilently dropped data in serialized trace input/output:1.
decimal.Decimalvalues are lostThere is no
Decimalbranch in_default_inner, so aDecimalfalls through to the"<{type}>"fallback:Decimalis common in financial/tool outputs. Fix serializes it viastr()so the exact value is preserved (float()would silently round high-precision values and can overflow on very large ones);Decimal("NaN")/Decimal("Infinity")still render as"NaN"/"Infinity".2. A non-primitive dict key discards the whole dict
The dict branch did
{self.default(k): self.default(v) ...}. When a key serializes to a container/object (tuple, set, custom object), the comprehension/JSON encoding raises and theexceptturns the entire dict into an error string:Fix stringifies such keys via
str(key)so a single non-primitive key no longer drops every value.Tests
Adds 4 unit tests in
tests/unit/test_serializer.py(verified fail-before / pass-after). Existingdatetime/uuid/enum/intkey behavior is unchanged (existing parametrized test still passes).ruff check/ruff formatclean.Type of change
Greptile Summary
This PR fixes two silent data-loss bugs in
EventSerializer:decimal.Decimalvalues were falling through to the"<Decimal>"opaque fallback, and a dict with any non-primitive key would discard the entire dict instead of just stringifying that key.Decimalis now routed throughfloat()before the existingNaN/Infinityhandlers, so it serializes as a JSON number. SpecialDecimalvalues (NaN,Infinity,-Infinity) are handled correctly. The precision trade-off (float has ~15–17 significant digits) is an intentional design choice but is undocumented.str(k)for non-scalars (tuples, sets, custom objects), so a single exotic key no longer discards every value in the dict.Confidence Score: 4/5
Safe to merge — both fixes address real silent-data-loss regressions, the logic is correct, and the new tests exercise the changed paths including edge cases.
The Decimal-to-float conversion works correctly for normal and special values, but silently narrows high-precision Decimal values to float precision without any documentation or warning. For a tracing library this is unlikely to cause problems in practice, but users who pass high-precision financial Decimals would see quietly rounded values in their trace data.
The Decimal precision trade-off in langfuse/_utils/serializer.py (lines 79–80) is the only area worth a second look; tests/unit/test_serializer.py is straightforward.
Prompt To Fix All With AI
Reviews (1): Last reviewed commit: "fix(serializer): preserve Decimal values..." | Re-trigger Greptile