[core] Support lightweight sequence initialization for write-only primary-key tables #8256

Open
Aitozi wants to merge 1 commit into apache:master from Aitozi:mwj/write-only-pk-light-meta

Conversation

Aitozi commented Jun 16, 2026
Contributor

Purpose

Primary-key table writers currently need to scan existing metadata to initialize the max sequence number. For write-only workloads this can be heavier than necessary because the writer only needs a safe starting sequence number.

What changed

This PR adds a write.sequence-number-init-mode option with two modes:

  • scan: keep the existing behavior and scan restored files for the max sequence number.
  • snapshot: persist the max sequence number in snapshot properties and use it to initialize later write-only writers.
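
As a rough illustration of how the new option might be set together with write-only mode, here is a minimal Java sketch. The class and method names are hypothetical and do not come from the PR diff; write-only is an existing Paimon table option, and write.sequence-number-init-mode is the option proposed in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of the PR: builds table options for a write-only job. */
final class WriteOnlyOptionsSketch {

    static Map<String, String> snapshotInitOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        // Existing Paimon table option that marks the job as write-only.
        options.put("write-only", "true");
        // Option proposed in this PR; the other value, "scan", keeps the existing behavior.
        options.put("write.sequence-number-init-mode", "snapshot");
        return options;
    }

    public static void main(String[] args) {
        // Print the options so they can be copied into a table definition.
        snapshotInitOptions().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```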

For write-only primary-key tables in snapshot mode, the writer can skip loading previous files once the latest snapshot carries the max sequence property. If the latest snapshot does not have the property yet, the writer scans once to bootstrap the snapshot property safely.
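
The decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: the class, enum, and method names are hypothetical, the snapshot properties are modeled as a plain string map, and the file scan is stood in for by a LongSupplier. Only the property key is taken from the PR description.

```java
import java.util.Collections;
import java.util.Map;
import java.util.OptionalLong;
import java.util.function.LongSupplier;

/** Hypothetical sketch of writer-side initialization; names are illustrative only. */
final class SequenceInitSketch {

    enum InitMode { SCAN, SNAPSHOT }

    // Property key quoted in the PR description.
    static final String MAX_SEQ_KEY = "sequence.generation.max-sequence-number";

    /** Returns the persisted max sequence number, or empty if the snapshot lacks the property. */
    static OptionalLong persistedMax(Map<String, String> latestSnapshotProperties) {
        String value = latestSnapshotProperties.get(MAX_SEQ_KEY);
        return value == null ? OptionalLong.empty() : OptionalLong.of(Long.parseLong(value));
    }

    /** Picks the starting max sequence number; falls back to scanning when the shortcut is unsafe. */
    static long initialMaxSequence(
            InitMode mode,
            boolean writeOnly,
            Map<String, String> latestSnapshotProperties,
            LongSupplier scanRestoredFiles) {
        if (mode == InitMode.SNAPSHOT && writeOnly) {
            OptionalLong persisted = persistedMax(latestSnapshotProperties);
            if (persisted.isPresent()) {
                // Shortcut: no need to load previous files.
                return persisted.getAsLong();
            }
        }
        // Scan mode, a non-write-only writer, or a snapshot without the property yet.
        return scanRestoredFiles.getAsLong();
    }

    public static void main(String[] args) {
        Map<String, String> withProperty = Collections.singletonMap(MAX_SEQ_KEY, "41");
        Map<String, String> withoutProperty = Collections.emptyMap();
        LongSupplier scan = () -> 7L; // stand-in for scanning restored file metadata

        System.out.println(initialMaxSequence(InitMode.SNAPSHOT, true, withProperty, scan));    // 41
        System.out.println(initialMaxSequence(InitMode.SNAPSHOT, true, withoutProperty, scan)); // 7
        System.out.println(initialMaxSequence(InitMode.SCAN, true, withProperty, scan));        // 7
    }
}
```

The fallback branch is what keeps the bootstrap case safe in this sketch: the scan is skipped only when the property is actually present.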

The snapshot property key remains sequence.generation.max-sequence-number for compatibility with existing metadata.
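
On the commit side, carrying the value forward might look like the sketch below. It assumes the persisted value never decreases (the new value is the larger of the previous value and the largest sequence number in the commit) and that a one-time scan supplies the baseline when the previous snapshot lacks the property. Both are assumptions drawn from the description above, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical sketch of commit-side persistence; not the PR's actual implementation. */
final class MaxSequencePropertySketch {

    // Property key quoted in the PR description.
    static final String MAX_SEQ_KEY = "sequence.generation.max-sequence-number";

    /** Builds the next snapshot's properties, carrying the max sequence number forward. */
    static Map<String, String> withMaxSequence(
            Map<String, String> previousProperties,
            long maxSequenceInCommit,
            LongSupplier bootstrapScan) {
        String previous = previousProperties.get(MAX_SEQ_KEY);
        // Assumption: without a persisted value, one scan of existing metadata provides the baseline.
        long baseline = previous == null ? bootstrapScan.getAsLong() : Long.parseLong(previous);

        Map<String, String> next = new HashMap<>(previousProperties);
        next.put(MAX_SEQ_KEY, Long.toString(Math.max(baseline, maxSequenceInCommit)));
        return next;
    }

    public static void main(String[] args) {
        Map<String, String> previous = new HashMap<>();
        previous.put(MAX_SEQ_KEY, "41");
        // The commit contains records up to sequence number 50, so 50 is persisted.
        System.out.println(withMaxSequence(previous, 50L, () -> 0L).get(MAX_SEQ_KEY)); // 50
    }
}
```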

Tests

  • git diff --check
  • /Users/bytedance/work/software/maven/bin/mvn -s ~/.m2/apache-community.xml -o -pl paimon-core -am -Pfast-build -DfailIfNoTests=false -Dtest=KeyValueFileStoreWriteTest test
  • /Users/bytedance/work/software/maven/bin/mvn -s ~/.m2/apache-community.xml -o -pl paimon-flink/paimon-flink-common -am -Pfast-build -DfailIfNoTests=false -Dtest=PrimaryKeyFileStoreTableITCase#testWriteOnlySnapshotSequenceNumberInitOverwritePreviousValue test
  • /Users/bytedance/work/software/maven/bin/mvn -s ~/.m2/apache-community.xml package -Pgenerate-docs -pl paimon-docs -nsu -DskipTests -am: failed before reaching paimon-docs, at paimon-spark-common_2.12, with existing API mismatches (readProtectionTagName, vectorSearchDistributeEnabled, and scanPlanAutoTagTimeRetained not found).

Aitozi force-pushed the mwj/write-only-pk-light-meta branch from 2d89e5b to 4d57ea7 on June 16, 2026 15:46
Aitozi marked this pull request as ready for review on June 16, 2026 16:17

@JingsongLi

Contributor

@Aitozi What do you think about #7832? cc @JunRuiLee These two PRs seem to be addressing the same issue.

Aitozi commented Jun 17, 2026
Contributor Author

@JingsongLi They are related only in that both touch sequence metadata, but they solve different problems. #7832 changes merge ordering semantics by using snapshot id as the ordering sequence. This PR does not change ordering semantics; it only optimizes how a write-only primary-key writer initializes the next sequence number, by persisting the global max sequence number in snapshot properties and avoiding scanning existing file metadata after bootstrap.

@JingsongLi

Contributor

> @JingsongLi They are related only in that both touch sequence metadata, but they solve different problems. #7832 changes merge ordering semantics by using snapshot id as the ordering sequence. This PR does not change ordering semantics; it only optimizes how a write-only primary-key writer initializes the next sequence number, by persisting the global max sequence number in snapshot properties and avoiding scanning existing file metadata after bootstrap.

This is super confusing from the config option name. We may need a better name, for example, commit.record-max-sequence, something like this, you can use AI to find a new one.

@JunRuiLee

Contributor

Thanks for cc'ing me. I checked the diff and agree with @Aitozi that this does not overlap with #7832: #7832 changes merge ordering semantics, while this PR persists and restores the max generated sequence number to avoid scanning existing file metadata for write-only PK writers.

Aitozi force-pushed the mwj/write-only-pk-light-meta branch from 4d57ea7 to 09da5d1 on June 18, 2026 13:33
Aitozi changed the title from "[core] Support snapshot sequence for write-only primary-key tables" to "[core] Support lightweight sequence initialization for write-only primary-key tables" on Jun 18, 2026

Aitozi commented Jun 18, 2026
Contributor Author

> > @JingsongLi They are related only in that both touch sequence metadata, but they solve different problems. #7832 changes merge ordering semantics by using snapshot id as the ordering sequence. This PR does not change ordering semantics; it only optimizes how a write-only primary-key writer initializes the next sequence number, by persisting the global max sequence number in snapshot properties and avoiding scanning existing file metadata after bootstrap.
>
> This is super confusing from the config option name. We may need a better name, for example, commit.record-max-sequence, something like this, you can use AI to find a new one.

Updated to write.sequence-number-init-mode @JingsongLi

@JingsongLi

Contributor

Reviewed the updated lightweight sequence initialization path. The snapshot mode now bootstraps from existing manifest metadata when the latest snapshot does not yet carry the max-sequence property, persists the max across later commits, and the writer only skips previous-file scans for write-only PK tables once that snapshot property is available. The fallback keeps upgrade/config-switch cases safe, and the added tests cover both bootstrap and restore behavior. LGTM.

3 participants