Disable file creation when Partition does not exist #20133 #20151

Merged: electrum merged 1 commit into trinodb:master from willzgw:fix_issue_20133 on Mar 11, 2024
@willzgw (Contributor) commented Dec 18, 2023

Description

  • [HUDI] Disable file creation when Partition does not exist (Fix issue 20133)

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

# Section
* Fix Hudi connector creating files when a partition does not exist. ({issue}`20133`)

@cla-bot Bot added the cla-signed label Dec 18, 2023
@github-actions Bot added the hudi label (Hudi connector) Dec 18, 2023
@willzgw requested a review from codope December 18, 2023 03:36

@Akanksha-kedia

@willzgw
Disabling file creation when a partition does not exist is a good option, but I wanted to understand why there is a mismatch between HMS (metadata) and HDFS (actual data). As soon as HDFS is updated, the metadata should also be in sync.

@github-actions

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions Bot added the stale label Jan 10, 2024

@mosabua (Member) commented Jan 11, 2024

What do you think, @codope? Could you review or chime in?

Also cc @findepi @brandylove

@github-actions Bot removed the stale label Jan 11, 2024

@codesorcery (Contributor)

We also need this merged, since otherwise reading Hudi tables with empty partitions fails when they are read from a read-only volume. The query then ends up in a stuck state, as described in #19506.

@github-actions

This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua

@github-actions Bot added the stale label Feb 13, 2024

@github-actions Bot commented Mar 6, 2024

Closing this pull request, as it has been stale for six weeks. Feel free to re-open at any time.

@github-actions Bot closed this Mar 6, 2024
@codesorcery reopened this Mar 7, 2024

@codesorcery (Contributor)

This PR has been open for almost 4 months. What steps remain to get it merged?

We are currently forced to maintain our own fork to apply this fix, but would prefer to go back to directly using the upstream releases.

Tagging everyone the github-actions bot tagged 3 weeks ago: @bitsondatadev @colebow @mosabua

if (fileIterator.hasNext()) {
    return fileIterator;
}
try (OutputStream ignored = metaClient.getFileSystem().newOutputFile(partitionLocation).create()) {

@codope (Contributor) commented Mar 7, 2024

I am not sure why this file was being created in the first place. The Hudi connector does not support writes. In my opinion, the fix is valid: simply return an empty iterator when the partition does not exist.
@anusudarsan @homar Do you recall any specific scenario why this was added?
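
For illustration only, the behavior described in this comment can be sketched as follows. This is not the actual Trino code: the class `PartitionListing`, the method `listFiles`, and the in-memory `Map` standing in for the file system are hypothetical names chosen for this example.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Hudi connector's real classes.
final class PartitionListing
{
    private PartitionListing() {}

    // "storage" maps a partition location to the files stored under it.
    // A missing key models a partition directory that does not exist.
    static Iterator<String> listFiles(Map<String, List<String>> storage, String partitionLocation)
    {
        List<String> files = storage.get(partitionLocation);
        if (files == null || files.isEmpty()) {
            // Read path: never write to storage, just report "no files".
            return Collections.emptyIterator();
        }
        return files.iterator();
    }
}
```

In this sketch, listing a missing partition yields an iterator whose `hasNext()` is false and leaves `storage` unchanged, so the same read path would also work when the underlying volume is read-only.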

@bitsondatadev (Member)

@dain would you mind taking a look?

@github-actions Bot removed the stale label Mar 7, 2024

@dirksan28

Hi there, I'm desperately waiting for this fix. Is there anything one could do to help get it merged?

@codope (Contributor) commented Mar 11, 2024

@dain @electrum @mosabua Please review.

@wendigo (Contributor) left a comment

Can we have a test for that?

@electrum (Member)

I don't think a test is required, as this is extra code which shouldn't exist. It seems strange to have a test for "make sure code doesn't randomly create some file".

@electrum merged commit 9ea89c8 into trinodb:master Mar 11, 2024
@github-actions Bot added this to the 441 milestone Mar 11, 2024
Labels: cla-signed, hudi (Hudi connector)

10 participants