Remove Dv2 Null checks #31149

Merged: Edward Gao (edgao) merged 22 commits into master from evan/remove-null-checks on Oct 10, 2023

@evantahler Evan Tahler (evantahler) commented Oct 6, 2023

This PR removes null-PK checks from the destinations. The sources really shouldn't be emitting records with null PKs, but if they do, we should accept them. Yes, this may lead to strange deduplication behavior, but that has been the case for months now. This makes syncs more likely to succeed. Detecting records that have null primary keys should be the job of the platform (#31186).

This is also a performance improvement as we remove a raw table scan looking for null-PK records.

We will do another soft-reset to remove the NOT NULL constraints on PK columns in the final tables, which may have been added while #30779 was active (rolled back by #31082).
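
To make the "strange deduplication behavior" concrete, here is a minimal sketch. It assumes deduplication keeps the latest record per primary-key value; the helper `dedupe_latest` and the sample records are hypothetical and are not the connector's actual code. Because every null key lands in the same group, unrelated null-PK records collapse into a single row:

```python
from __future__ import annotations

from typing import Any, Iterable


def dedupe_latest(records: Iterable[dict[str, Any]], pk: str, cursor: str) -> list[dict[str, Any]]:
    """Keep the record with the highest cursor value for each primary-key value.

    Hypothetical helper: a simplified stand-in for PK-based deduplication,
    not the actual destination logic.
    """
    latest: dict[Any, dict[str, Any]] = {}
    for record in records:
        key = record.get(pk)  # None when the source emitted a null PK
        current = latest.get(key)
        if current is None or record[cursor] > current[cursor]:
            latest[key] = record
    return list(latest.values())


records = [
    {"id": 1, "updated_at": 1, "name": "alice"},
    {"id": 1, "updated_at": 2, "name": "alice v2"},
    {"id": None, "updated_at": 1, "name": "bob"},    # null PK
    {"id": None, "updated_at": 2, "name": "carol"},  # null PK, unrelated to bob
]

deduped = dedupe_latest(records, pk="id", cursor="updated_at")
for row in deduped:
    print(row)
```

In this sketch "bob" is silently dropped in favor of "carol", even though the two records are unrelated. The sync still succeeds, which is the trade-off the description accepts.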

@evantahler Evan Tahler (evantahler) changed the title from "Evan/remove null checks" to "Remove Dv2 Null checks" Oct 9, 2023
@octavia-squidington-iii Octavia Squidington III (octavia-squidington-iii) added the area/documentation label (Improvements or additions to documentation) Oct 9, 2023
@evantahler Evan Tahler (evantahler) marked this pull request as ready for review October 9, 2023 18:11
@evantahler Evan Tahler (evantahler) requested a review from a team as a code owner October 9, 2023 18:11
@edgao Edward Gao (edgao) commented

will do a fuller review later, but can you write tests for bq+snowflake to test the migration e2e? a la https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/destination-snowflake/src/test-integration/java/io/airbyte/integrations/destination/snowflake/typing_deduping/AbstractSnowflakeTypingDedupingTest.java#L160

(i.e. run a sync on a specific version, then run a sync on the dev version)

I think that's more powerful than doing it in the sqlgenerator tests - one of the rare instances where I actually prefer dockerized tests
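
The shape of such a migration test (sync on a pinned version, then sync on the dev version) can be sketched as follows. This is a toy illustration that uses an in-memory SQLite database rather than BigQuery or Snowflake. `sync_old_version`, `sync_dev_version`, and the table layout are hypothetical stand-ins, and the soft reset is assumed here to mean rebuilding the final table from the raw table.

```python
import sqlite3


def sync_old_version(conn, records):
    """Stand-in for the pinned connector version: final table has a NOT NULL PK."""
    conn.execute("CREATE TABLE IF NOT EXISTS raw (id1 INTEGER, name TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS final (id1 INTEGER NOT NULL, name TEXT)")
    conn.executemany("INSERT INTO raw VALUES (?, ?)", records)
    conn.executemany("INSERT INTO final VALUES (?, ?)", records)


def sync_dev_version(conn, records):
    """Stand-in for the dev version: soft reset drops the constraint, then loads."""
    conn.executemany("INSERT INTO raw VALUES (?, ?)", records)
    # Assumed soft reset: rebuild the final table from raw without NOT NULL.
    conn.execute("DROP TABLE final")
    conn.execute("CREATE TABLE final (id1 INTEGER, name TEXT)")
    conn.execute("INSERT INTO final SELECT id1, name FROM raw")


conn = sqlite3.connect(":memory:")

# Step 1: sync on the old version, which creates the constrained final table.
sync_old_version(conn, [(1, "alice")])

# The old schema rejects a null PK.
try:
    conn.execute("INSERT INTO final VALUES (NULL, 'bob')")
    old_rejected_null = False
except sqlite3.IntegrityError:
    old_rejected_null = True

# Step 2: sync on the dev version, which must accept the null-PK record.
sync_dev_version(conn, [(None, "bob")])
final_rows = conn.execute("SELECT id1, name FROM final ORDER BY name").fetchall()
print(old_rejected_null, final_rows)
```

The real test would run the two connector images against the actual warehouse; the sketch only shows the two assertions of interest: the old schema rejects null PKs, and the dev version accepts them after the reset.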

Evan Tahler (evantahler) commented Oct 9, 2023

I added DAT/integration tests in 289bd75, but they are failing because the first sync with the older connector version is not throwing like it should. I don't think I'm creating the catalog properly to include PKs for id1 and id2. Help please! In the logs I'm seeing "ID1" NUMBER, not "ID1" NUMBER NOT NULL as expected.

Edit: figured it out!

@edgao Edward Gao (edgao) left a comment

🚀

Had some comments around code cleanliness, lmk if you want me to just push those myself + get this merged

Comment thread docs/integrations/destinations/bigquery.md Outdated
Comment thread docs/integrations/destinations/snowflake.md Outdated
@airbyte-oss-build-runner

destination-bigquery test report (commit 7a909ccba2) - ✅

⏲️ Total pipeline duration: 09mn02s

Step Result
Build connector tar
Build destination-bigquery docker image for platform(s) linux/x86_64
Java Connector Unit Tests
Java Connector Integration Tests
Validate metadata for destination-bigquery
Connector version semver check
Connector version increment check
QA checks

Please note that tests are only run on PR ready for review. Please set your PR to draft mode to not flood the CI engine and upstream service on following commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool using the following command:

airbyte-ci connectors --name=destination-bigquery test
@airbyte-oss-build-runner

destination-snowflake test report (commit 7a909ccba2) - ✅

⏲️ Total pipeline duration: 11mn24s

Step Result
Build connector tar
Build destination-snowflake docker image for platform(s) linux/x86_64
Java Connector Unit Tests
Java Connector Integration Tests
Validate metadata for destination-snowflake
Connector version semver check
Connector version increment check
QA checks

You can run the same pipeline locally on this branch with the airbyte-ci tool using the following command:

airbyte-ci connectors --name=destination-snowflake test
@edgao Edward Gao (edgao) merged commit 898846d into master Oct 10, 2023
@edgao Edward Gao (edgao) deleted the evan/remove-null-checks branch October 10, 2023 19:36
Aries Gunawan (ariesgun) pushed a commit to ariesgun/airbyte that referenced this pull request Oct 23, 2023
Co-authored-by: evantahler <evantahler@users.noreply.github.com>
Co-authored-by: Edward Gao <edward.gao@airbyte.io>