gh-150771: Fix email serialization for shift_jis and euc-jp #151120
Open
bhuvi27 wants to merge 6 commits into
5928e5b to f071b32
Convert surrogate-escaped payloads through the input charset before encoding to iso-2022-jp, fixing UnicodeEncodeError when printing messages created with set_content().
f071b32 to 6795f58
…/euc-jp Encode the payload with the charset output mapping (iso-2022-jp) when set_content is called with shift_jis or euc-jp, instead of patching serialization in body_encode and set_payload. Reverts those changes.
Use plain backticks for set_content() instead of a broken :func: target.
serhiy-storchaka
approved these changes
Jun 14, 2026
serhiy-storchaka
left a comment
Member
LGTM. 👍
Do you think an assertion for bytes(m) would be useful?
self.assertEqual(m['Content-Type'], 'text/plain; charset="iso-2022-jp"')
self.assertEqual(m.get_payload(decode=True), content.encode('iso-2022-jp'))
self.assertEqual(m.get_content(), content)
self.assertEqual(str(m), textwrap.dedent("""\
Member
Maybe add also assertions for bytes(m) similar to test_set_text_charset_cp949.
Fixes #150771
Creating a message with set_content(..., charset='shift_jis') or charset='euc-jp' raised UnicodeEncodeError on str(m) because the payload was encoded with the input charset while the Content-Type uses the output charset (iso-2022-jp).
Use Charset.output_charset in set_text_content so the payload and Content-Type agree from the start.
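The input/output charset mismatch described above can be illustrated with a minimal sketch. It only uses `email.charset.Charset` and its `output_charset` attribute; the helper name `encode_with_output_charset` is hypothetical and is not the code changed in this PR.

```python
from email.charset import Charset


def encode_with_output_charset(text, charset_name):
    """Hypothetical helper: encode text with the charset's output mapping.

    This is an illustration only, not the actual set_text_content code.
    """
    cs = Charset(charset_name)
    # For shift_jis and euc-jp the email package maps output to iso-2022-jp;
    # for charsets without a conversion, output_charset equals input_charset.
    out = cs.output_charset
    return out, text.encode(out)


content = "こんにちは"
for name in ("shift_jis", "euc-jp"):
    out, payload = encode_with_output_charset(content, name)
    # The payload bytes now match the charset that Content-Type would declare.
    print(name, "->", out, payload.decode(out) == content)
```

Encoding the same text with the input charset instead (for example `content.encode("shift_jis")`) produces different bytes, which is the disagreement between payload and Content-Type that the description refers to.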