GH-49740: [C++][Python] Fix casts to view types leaving null variadic buffers #50166
Open
fenfeng9 wants to merge 5 commits into
pitrou
reviewed
Jun 15, 2026
        util::ToInlineBinaryView("hello"),
        util::ToInlineBinaryView("world"),
    }),
    Raises(StatusCode::Invalid));
Member
Can we test the error message somehow?
    nullptr}));

    struct ArrowArray c_export;
    ASSERT_RAISES(Invalid, ExportArray(*arr, &c_export));
Member
Can we test the error message too?
Rationale for this change
Casting to binary_view or string_view could leave a null variadic buffer slot when all values were inline. This could happen for casts from binary, large_binary, string, large_string, and fixed_size_binary.

The C Data Interface exporter reads every variadic buffer to get its size. Because of that, exporting such an array could crash, for example through PyArrow _export_to_c.

Validation also passed for these arrays. For all-inline view arrays, validation never needed to read an out-of-line data buffer.
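To make the failure mode concrete, here is a minimal stdlib-only sketch of the view layout. It is a toy model, not Arrow's implementation: `encode_view`, `is_inline`, and `variadic_buffer_sizes` are hypothetical names introduced for illustration. It relies only on the documented layout, in which a view is 16 bytes and values of up to 12 bytes are stored inline.

```python
import struct

INLINE_LIMIT = 12  # view layout: values of up to 12 bytes are stored inline


def encode_view(value: bytes, buffer_index: int = 0, offset: int = 0) -> bytes:
    """Encode one 16-byte view (toy model, hypothetical helper)."""
    if len(value) <= INLINE_LIMIT:
        # int32 length followed by the value, zero-padded to 12 bytes
        return struct.pack("<i12s", len(value), value)
    # int32 length, 4-byte prefix, data buffer index, offset in that buffer
    return struct.pack("<i4sii", len(value), value[:4], buffer_index, offset)


def is_inline(view: bytes) -> bool:
    """A view is inline when its length field fits the inline limit."""
    (length,) = struct.unpack_from("<i", view)
    return length <= INLINE_LIMIT


def variadic_buffer_sizes(data_buffers):
    """Mimic an exporter that reads the size of every variadic buffer."""
    return [len(buf) for buf in data_buffers]


views = [encode_view(b"hello"), encode_view(b"world")]
# Every value is inline, so no view refers to a data buffer at all.
all_inline = all(is_inline(v) for v in views)
```

In this model, `variadic_buffer_sizes([])` is harmless, while `variadic_buffer_sizes([None])` raises a `TypeError`. The C++ analogue of that second call is a null pointer dereference, which matches the crash described above. Validation never notices the problem because no inline view is ever resolved against a data buffer.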
What changes are included in this PR?
This PR fixes the cast kernels so all-inline view arrays do not keep a null variadic buffer slot.
It also makes validation reject null variadic buffer slots, and makes C Data export return an error instead of crashing.
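The validation and export behaviour described above can be sketched in the same toy style. This is an illustration under stated assumptions, not the PR's code: `InvalidArrayError`, `validate_variadic_buffers`, `export_variadic_buffer_sizes`, and the error message text are all hypothetical.

```python
from typing import List, Optional, Sequence


class InvalidArrayError(ValueError):
    """Hypothetical stand-in for an Arrow Invalid status."""


def validate_variadic_buffers(data_buffers: Sequence[Optional[bytes]]) -> None:
    """Reject any null slot, even if no view ever refers to it."""
    for i, buf in enumerate(data_buffers):
        if buf is None:
            # Message text is an assumption made for this sketch.
            raise InvalidArrayError(f"variadic buffer {i} is null")


def export_variadic_buffer_sizes(
    data_buffers: Sequence[Optional[bytes]],
) -> List[int]:
    """Check the slots first so a null slot becomes an error, not a crash."""
    validate_variadic_buffers(data_buffers)
    return [len(buf) for buf in data_buffers]
```

With this shape, a regression test can assert on the message as well as the error type, for example by catching `InvalidArrayError` and checking that `"is null"` appears in `str(exc)`.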
C++ and Python regression tests cover the cast, validation, and export paths.
Are these changes tested?
Yes.
Are there any user-facing changes?
No.
This PR contains a "Critical Fix". Exporting an all-inline view array through the C Data Interface could crash the process while using only public APIs.
_export_to_c segmentation fault for binary_view array #49740