[core] Rewrite oss tryToWriteAtomic using atomic putObject API (#8226) #8228
MaxLinyun wants to merge 1 commit into
18e4e42 to a9c77ce
@JingsongLi Hi, could you help review this PR?
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(bytes.length);
metadata.setHeader("x-oss-forbid-overwrite", "true");
The direct SDK upload no longer carries fs.oss.server-side-encryption-algorithm into ObjectMetadata. The Hadoop OSS write path that this replaces sets ObjectMetadata#setServerSideEncryption before putObject; with catalogs configured for OSS SSE this can either violate bucket policies that require encrypted uploads or create unencrypted metadata/version files. Could we set the same SSE header from the configured Hadoop options/store before calling putObject?
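A minimal sketch of the change requested above, using plain Java maps as stand-ins for the Hadoop configuration and for ObjectMetadata so that it runs without the OSS SDK. The config key fs.oss.server-side-encryption-algorithm and the x-oss-forbid-overwrite header come from this PR; the class and method names are hypothetical, and x-oss-server-side-encryption is assumed to be the header that ObjectMetadata#setServerSideEncryption writes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: builds the headers for a put-if-absent upload. */
class PutHeadersSketch {
    // Config key named in the review comment.
    static final String SSE_KEY = "fs.oss.server-side-encryption-algorithm";
    // Assumption: header name behind ObjectMetadata#setServerSideEncryption.
    static final String SSE_HEADER = "x-oss-server-side-encryption";

    static Map<String, String> buildHeaders(Map<String, String> conf, byte[] bytes) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Length", String.valueOf(bytes.length));
        // Ask the service to reject the upload if the object already exists.
        headers.put("x-oss-forbid-overwrite", "true");
        // Carry the configured SSE algorithm over, if one is set.
        String sse = conf.get(SSE_KEY);
        if (sse != null && !sse.trim().isEmpty()) {
            headers.put(SSE_HEADER, sse.trim());
        }
        return headers;
    }
}
```

With the real SDK, the last step would presumably become a call to ObjectMetadata#setServerSideEncryption with the configured value before putObject.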
@JingsongLi Thanks for taking the time to review this PR. I've made changes following your review suggestions. Could you please take another look when you have time?
e6367d2 to fb2d5c8
…of the OSS SDK (apache#8226)
fb2d5c8 to 13b6f1a
Purpose
closes #8226
Paimon's existing OSSFileIO inherits from HadoopCompliantFileIO, and its file operations are implemented on top of Hadoop's AliyunOSSFileSystem. On object storage, the default implementation of tryToWriteAtomic writes a temporary file and then renames it. However, a rename on OSS is essentially a copy followed by a delete, so it is not atomic.
This PR rewrites tryToWriteAtomic to call the conditional write API (put-if-absent) of the OSS SDK directly, which gives atomic 'write if not exists' semantics without relying on external locks.
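The intended semantics can be sketched with an in-memory stand-in for the bucket. This is not the code in this PR: FakeBucket and AtomicWriteSketch are hypothetical names, and the IllegalStateException stands in for whatever error the service returns when the object already exists.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory stand-in for a bucket; not the OSS SDK. */
class FakeBucket {
    private final ConcurrentMap<String, byte[]> objects = new ConcurrentHashMap<>();

    /** Mimics a put with x-oss-forbid-overwrite=true: fails if the key exists. */
    void putIfAbsent(String key, byte[] bytes) {
        if (objects.putIfAbsent(key, bytes.clone()) != null) {
            throw new IllegalStateException("object already exists: " + key);
        }
    }

    byte[] get(String key) {
        return objects.get(key);
    }
}

class AtomicWriteSketch {
    /** Returns true if this call created the object, false if it already existed. */
    static boolean tryToWriteAtomic(FakeBucket bucket, String path, String content) {
        try {
            // Single conditional put: no temporary file and no rename.
            bucket.putIfAbsent(path, content.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IllegalStateException alreadyExists) {
            // Another writer got there first; the existing object is left untouched.
            return false;
        }
    }
}
```

The point of the design is that existence check and write happen in one request on the server side, so two concurrent writers cannot both succeed.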
Tests
Since I cannot put an OSS AK/SK into a test case, I don't know how to write a test for this change.
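One possible way to test without credentials, sketched under assumptions: put the conditional put behind a small interface and race several writers against an in-memory fake. ConditionalPut, InMemoryConditionalPut and AtomicWriteRaceTest are hypothetical names and are not part of this PR or of Paimon; the fake only checks the calling logic, not the behavior of the real service.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical seam: the only operation tryToWriteAtomic needs from the store. */
interface ConditionalPut {
    /** Returns true if the object was created, false if it already existed. */
    boolean putIfAbsent(String key, byte[] bytes);
}

/** In-memory fake used instead of a real bucket, so no AK/SK is required. */
class InMemoryConditionalPut implements ConditionalPut {
    final ConcurrentHashMap<String, byte[]> objects = new ConcurrentHashMap<>();

    @Override
    public boolean putIfAbsent(String key, byte[] bytes) {
        return objects.putIfAbsent(key, bytes) == null;
    }
}

class AtomicWriteRaceTest {
    /** Races the given number of writers on one key and returns how many succeeded. */
    static int countWinners(ConditionalPut client, String key, int writers) {
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < writers; i++) {
                final byte[] payload = ("writer-" + i).getBytes(StandardCharsets.UTF_8);
                tasks.add(() -> client.putIfAbsent(key, payload));
            }
            int winners = 0;
            // invokeAll blocks until every writer has finished.
            for (Future<Boolean> result : pool.invokeAll(tasks)) {
                if (result.get()) {
                    winners++;
                }
            }
            return winners;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A unit test would then assert that countWinners returns 1 for any number of writers; a test against a real bucket would still need credentials supplied from outside the repository.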