This is an English translation of the following article: https://dev.classmethod.jp/articles/modern-data-stack-info-summary-20250625/
Hi, this is Sagara.
As a consultant in the Modern Data Stack field, I keep up with the vast amount of information that is released daily.
In this article, I'll summarize the Modern Data Stack-related news that has caught my eye over the past couple of weeks.
Please note: This article does not cover all the latest updates for the products mentioned. It only includes information that I found interesting based on **my personal selection and perspective**.
Modern Data Stack in General
Perspectives on the Snowflake and Databricks Summits
The CEO of Orchestra has published an article summarizing his views on the recent summits held by Snowflake and Databricks.
The article points out that Snowflake is emphasizing "ease of use" to bring more users into its ecosystem. New features that symbolize this strategy include Snowflake Intelligence and Cortex Agents for natural language data exploration, Snowflake Openflow for simplifying data ingestion, and native support for dbt Projects.
By contrast, the author argues that while Snowflake pursues an "open ecosystem" to appeal to a broad audience, Databricks is expanding into more specialized use cases for its existing technical user base. He concludes that as Postgres becomes crucial as a "memory" for AI agents, both companies are changing the shape of their competition by targeting different personas.
https://dataopsleadership.substack.com/p/the-snowflake-and-databricks-chess
Data Extract/Load
Airbyte
Airbyte's Latest Version "1.7" Released
Airbyte has released its latest version, 1.7.
I felt the biggest update was the new support for transferring files and their metadata. (Currently, this is limited to certain connectors like Zendesk Support and only S3 as a destination.)
https://github.com/airbytehq/airbyte/releases/tag/v1.7.0
https://docs.airbyte.com/release_notes/v-1.7
Omnata
A New Version with a Revamped UI is Coming Soon
An Omnata product update article has been published, mentioning that a version with a new UI will be released soon.
It seems this new UI will utilize Streamlit's Custom Components.
https://omnata.com/blog-detail/omnata-sync-product-updates-june-2025
https://www.youtube.com/watch?v=PpI-bJUvoqI&list=TLGGue03jRN4cAoyMzA2MjAyNQ&t=14s
Data Warehouse/Data Lakehouse
Snowflake
Recap Articles on Snowflake Summit 2025
Snowflake Summit 2025 has concluded, and several recap and review articles have been posted about the announcements.
https://medium.com/snowflake/snowflake-summit-2025-builders-recap-bddf3d5b7de3
https://hex.tech/blog/snowflake-summit-2025-recap/
https://www.lightdash.com/blogpost/snowflake-summit-2025
https://www.selectstar.com/resources/snowflake-summit-2025
"UNION BY NAME" Released for Uniting Tables with Matching Column Names
"UNION BY NAME" has been released, allowing SQL UNION operations to be performed based on matching column names rather than column order.
I tried it out myself, so I hope the following article is also helpful.
https://dev.classmethod.jp/articles/snowflake-try-union-by-name/
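As a rough illustration of the difference from a positional UNION, here is a minimal Python sketch that emulates name-based matching over lists of dicts. It is not Snowflake code. The function `union_by_name` and the sample tables are hypothetical, and the sketch assumes that a column missing from one side is filled with NULL and that duplicates are removed unless ALL is specified.

```python
from typing import Any

Row = dict[str, Any]


def union_by_name(left: list[Row], right: list[Row], *, distinct: bool = True) -> list[Row]:
    """Combine two row sets by column name rather than column position."""
    # Output columns: the left side's columns in order, then right-only columns.
    columns: list[str] = []
    for row in left + right:
        for name in row:
            if name not in columns:
                columns.append(name)

    combined: list[Row] = []
    seen: set[tuple[Any, ...]] = set()
    for row in left + right:
        # Assumption: a column absent from one side becomes None (SQL NULL).
        aligned = {name: row.get(name) for name in columns}
        key = tuple(aligned[name] for name in columns)
        if distinct and key in seen:
            continue  # plain UNION semantics: drop duplicate rows
        seen.add(key)
        combined.append(aligned)
    return combined


# Hypothetical tables: same columns in a different order, plus one extra column.
orders_2024 = [{"id": 1, "amount": 100}]
orders_2025 = [{"amount": 250, "id": 2, "channel": "web"}]

result = union_by_name(orders_2024, orders_2025)
for row in result:
    print(row)
```

A positional UNION would have paired `id` with `amount` here, which is exactly the kind of silent mismatch that matching by name avoids.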
Latest Snowflake Articles from SELECT
SELECT, known for its highly informative articles on Snowflake, has published two new blog posts.
- On Utilizing Tags
https://select.dev/posts/snowflake-object-tags-guide
- On CI/CD and DevOps in Snowflake (Part 1 only)
https://select.dev/posts/ci-cd-and-devops-in-snowflake-part-1
Databricks
DATA + AI SUMMIT 2025 Was Held
Databricks' largest annual event, the DATA + AI SUMMIT 2025, took place.
https://www.databricks.com/dataaisummit
Here are the announcements that particularly caught my attention:
- Lakebase
  - A fully managed PostgreSQL database integrated with the Lakehouse.
  - Lakebase leverages an architecture that separates compute and storage, supporting low latency (< 10 ms) and high-concurrency transactions (> 10k qps).
  - Features data branching, similar to Git branches, using technology from the acquired company Neon (Reference: X post).
  - Lakebase can be synchronized with tables managed by Unity Catalog.
  - Official Blog: Announcing Lakebase Public Preview
  - Official Blog: What Is a Lakebase?
  - Official Documentation
- Databricks Free Edition Now Available
- Agent Bricks
  - A feature for building high-quality, domain-specific agents by describing tasks.
  - Supports automatic creation of evaluation benchmarks and auto-optimization. (According to this X post, it seems to automatically record and evaluate responses to user requests.)
  - Official Blog: Introducing Agent Bricks: Auto-Optimized Agents Using Your Data
- General Availability of Databricks Apps
- Release of MLflow 3.0
- Mosaic AI Gateway is Generally Available
- MCP Support
  - Managed MCP servers that can access Unity Catalog.
  - Custom MCP servers that allow users to host any MCP Server as a Databricks app.
  - Official Documentation
  - Official Blog: Announcing managed MCP servers with Unity Catalog and Mosaic AI Integration
- Announcing Apache Iceberg Support in Databricks (Public Preview)
  - Create Iceberg tables managed by Unity Catalog using Databricks or external engines.
  - Implementation of the Iceberg REST Catalog API in Unity Catalog.
  - Official Blog: Announcing full Apache Iceberg™ support in Databricks
- Delta Sharing & Marketplace Updates
  - Cross-platform delivery to Iceberg-compatible engines including Snowflake, and the ability to share Iceberg tables via Delta Sharing (Private Preview).
  - Simplified network configuration for external sharing with the Delta Sharing Network Gateway (Private Preview).
  - Zero-copy access to data from partner products like SAP Business Data Cloud (SAP Business Data Cloud integration coming soon).
  - Official Blog: What’s New with Data Sharing and Collaboration - Summer 2025
- Databricks Lakeflow is Generally Available, and Lakeflow Designer for no-code pipeline implementation announced
  - Lakeflow Connect: Connectors to external services.
  - Lakeflow Declarative Pipelines: Data pipeline development feature using Spark's Declarative Pipelines, compatible with DLT. It also has a dedicated IDE. (The DLT documentation has become the documentation for this feature, so it might be a rebranding of DLT.)
  - Lakeflow Jobs: Job orchestration feature (formerly Databricks Workflows).
  - Lakeflow Designer: A no-code pipeline builder with drag-and-drop and natural language support. The output is generated as Lakeflow Declarative Pipelines code. (Lakeflow Designer is expected to enter private preview within a few months.)
  - Official Blog: Announcing the General Availability of Databricks Lakeflow
  - Official Blog: Announcing Lakeflow Designer: No-Code ETL, Powered by the Databricks Intelligence Platform
- Databricks SQL
  - Since 2022, updates have resulted in a 5x performance improvement on actual customer workloads, and the latest release automatically provides a 25% performance boost at no extra cost. These updates are automatically rolled out to serverless SQL warehouses.
  - Official Blog: Databricks SQL accelerates customer workloads by 5x in just three years
- AI/BI Genie is Generally Available
  - AI/BI Genie is a feature that allows users to ask questions about their data in natural language to gain insights.
  - Official Blog: AI/BI Genie is now Generally Available
- Databricks One (expected to be widely available in public beta this summer)
  - Designed for business users, this feature provides natural language access to AI/BI dashboards, AI/BI Genie, and Databricks apps from a single screen.
  - Official Blog: Introducing Databricks One
- New Unity Catalog Features
  - Full support for the Iceberg REST Catalog API, enabling external engines to read (GA) and write (Public Preview) to Iceberg tables managed by Unity Catalog.
  - With Unity Catalog Metrics, once a metric is created, it can be extended not only within Databricks but also to BI tools like Tableau, Hex, Sigma, ThoughtSpot, and Omni, and observability tools like Anomalo and Monte Carlo (currently in Public Preview, GA later this summer).
  - Data quality monitoring features are also provided (Beta).
  - Official Blog: What’s new with Databricks Unity Catalog at Data + AI Summit 2025
MotherDuck/DuckDB
Apache Arrow Flight SQL
MotherDuck's blog has published an article on the performance challenges of using DuckDB via REST APIs or JDBC, along with a solution.
The article identifies the overhead of JSON serialization and row-oriented protocols as issues and introduces "Apache Arrow Flight SQL," which transfers columnar data (Apache Arrow) directly over gRPC, as a solution.
It also features implementation examples of open-source servers "Hatch" and "GizmoSQL" that enable this protocol in DuckDB, making it a valuable resource for those considering optimizing their data delivery architecture.
https://motherduck.com/blog/flight-sql-vs-rest-vs-jdbc/
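To get a feel for the overhead the article describes, the following Python sketch compares a row-oriented JSON payload with a simple column-oriented binary layout using only the standard library. It does not use Arrow or Flight SQL. The data and the packed layout are assumptions made for illustration, and Arrow's actual format additionally carries schema metadata, validity bitmaps, and alignment padding.

```python
import json
import struct

# Hypothetical result set: 1,000 rows with two numeric columns.
rows = [{"order_id": i, "amount": i * 1.5} for i in range(1000)]

# Row-oriented JSON, as a typical REST endpoint might return it:
# every row repeats the column names and encodes numbers as text.
json_payload = json.dumps(rows).encode("utf-8")

# Column-oriented binary: each column is one contiguous, fixed-width buffer
# (little-endian int64 for order_id, float64 for amount).
order_ids = struct.pack(f"<{len(rows)}q", *(r["order_id"] for r in rows))
amounts = struct.pack(f"<{len(rows)}d", *(r["amount"] for r in rows))
columnar_payload = order_ids + amounts

# Reading one column back needs no per-row parsing of keys or text.
decoded_amounts = struct.unpack(f"<{len(rows)}d", amounts)

print(f"row-oriented JSON: {len(json_payload):>6} bytes")
print(f"columnar binary:   {len(columnar_payload):>6} bytes")
```

The gap comes from the repeated keys and text-encoded numbers on the JSON side. On top of that, a client receiving the columnar buffers can hand them to an analytical engine without converting row by row.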
Data Transform
dbt
dbt Core 1.10 Released
The latest version of dbt Core, 1.10, has been released.
Key updates include the `--sample` flag, Hybrid Projects for seamlessly uploading dbt Core artifacts to dbt Cloud, and moving `freshness` checks under the `config:` block.
https://github.com/dbt-labs/dbt-core/releases/tag/v1.10.0
https://docs.getdbt.com/docs/dbt-versions/core-upgrade/upgrading-to-v1.10
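The freshness change is essentially a restructuring of the source definition. The Python sketch below shows the shape of that move on a dict such as a YAML parser would produce. The helper `move_freshness_under_config` and the source named `raw_orders` are hypothetical, this is not a dbt API, and the upgrade guide linked above remains the reference for the exact YAML layout.

```python
import copy
from typing import Any


def move_freshness_under_config(source: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a source definition with `freshness` nested under `config`."""
    migrated = copy.deepcopy(source)
    if "freshness" in migrated:
        # setdefault keeps any config entries that already exist.
        config = migrated.setdefault("config", {})
        # A value already placed under config takes precedence.
        config.setdefault("freshness", migrated["freshness"])
        del migrated["freshness"]
    return migrated


# Hypothetical source definition in the older, top-level style.
old_style = {
    "name": "raw_orders",
    "freshness": {"warn_after": {"count": 12, "period": "hour"}},
    "tables": [{"name": "orders"}],
}

new_style = move_freshness_under_config(old_style)
print(new_style)
```

The function returns a copy, so the original definition is left untouched and the two versions can be compared side by side.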
dbt Labs Employee Discusses the Appeal of the dbt Insights Feature
An article on the dbt Labs blog features an employee sharing their experience with the dbt Insights feature and its appeal.
The article highlights the ability to navigate directly from dbt Catalog documentation to a query writing screen and how dbt Copilot helps to smoothly start an analysis.
Furthermore, it covers the feature that auto-completes the dbt Semantic Layer query syntax and visualizes the SQL running behind the metrics. The author states that this allows analysts to work with reliable data more quickly and confidently, contributing to the entire team's productivity.
https://www.getdbt.com/blog/why-this-analyst-is-obsessed-with-dbt-insights
I have also tried dbt Insights and written an article about it, which I hope you find useful.
https://dev.classmethod.jp/articles/dbt-cloud-model-insights/
Business Intelligence
General
Gartner's Latest Report on Analytics and Business Intelligence Platforms is Published
Gartner has released its latest report for the Analytics and Business Intelligence Platforms sector.
The following is a link to a Google blog post, but it includes the graph showing the evaluation of various BI tools, so please take a look.
Looker
Looker 25.10 Release Notes Published, Including CI Feature
The release notes for Looker's latest version, 25.10, have been published.
https://cloud.google.com/looker/docs/release-notes
A key feature is the release of a Continuous Integration (CI) feature in preview.
https://cloud.google.com/looker/docs/continuous-integration
Tableau
Tableau 2025.2 Released
The latest version of Tableau, 2025.2, has been released.
https://www.tableau.com/support/releases
The following page provides a good overview of the release contents.
https://www.tableau.com/products/all-features
Apache Superset
Apache Superset 5.0.0 Released
The latest version of Apache Superset, 5.0.0, has been released.
It includes a refreshed UI, improved dashboard responsiveness, and more.
https://preset.io/blog/superset-5-0-0-release-notes/
Data Catalog
Select Star
Select Star's June 2025 Release
Select Star's Change Log has information about its June 2025 release.
Updates include the ability to automatically generate Snowflake Semantic Views/Models from BI tools, a natural language search feature, and support for dbt's custom tests.
Data Quality & Data Observability
Monte Carlo
Ranked #1 in G2's Data Observability Category for the 8th Consecutive Quarter
Monte Carlo announced on their blog that they have been ranked #1 in G2's Data Observability category for the eighth consecutive quarter.