feat: match empty column when in entityCollecting context#457

Open
JackWang032 wants to merge 17 commits into feat/emptyColumnPrepare from feat/emptyColumn


@JackWang032 (Collaborator) commented Dec 17, 2025

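When the input SQL is incomplete, especially when no column has been typed yet or only a table-qualified prefix such as `table1.<empty>` has been typed, the parser's error recovery fails. The resulting parse tree cannot serve as a reference template for entity collection, so entity collection fails.

For example, entering `SELECT t1. FROM t1` in MySQL collects no entities at all, so no corresponding completions can be offered.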

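We have already added a `{this.shouldMatchEmpty()}?` semantic predicate alternative to `columnName` in the grammar, so that an empty column can be matched during entity collection. This covers only a small share of cases: it has no effect in `where`, `order by`, `join on` and similar contexts (which match `expression`), and it is not hit when a dot has been typed either.

Semantic predicate: the core ANTLR4 mechanism for combining grammar rules with custom code logic, used to control how rules match dynamically during parsing.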
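To illustrate the idea of a predicate-gated empty alternative, here is a minimal hand-written sketch in TypeScript. It is not the dt-sql-parser implementation and does not use ANTLR4: `parseSelect`, `KEYWORDS`, and the `entityCollecting` flag are hypothetical names, and only `shouldMatchEmpty` borrows its name from the predicate mentioned above. The sketch assumes a pre-tokenized `SELECT <column> FROM <table>` statement.

```typescript
// Hypothetical sketch: a toy recursive-descent parser, not ANTLR4-generated code.
const KEYWORDS = new Set(["SELECT", "FROM"]);

interface SelectResult {
    column: string; // "" or a trailing "." part means an empty column was matched
    table: string;
}

function parseSelect(tokens: string[], entityCollecting: boolean): SelectResult {
    let pos = 0;
    const peek = (): string | undefined => tokens[pos];
    const isIdentifier = (t: string | undefined): t is string =>
        t !== undefined && t !== "." && !KEYWORDS.has(t.toUpperCase());
    const expect = (kw: string): void => {
        const t = peek();
        if (t === undefined || t.toUpperCase() !== kw) {
            throw new Error(`expected ${kw} at token ${pos}`);
        }
        pos++;
    };

    // Plays the role of the semantic predicate: the empty alternative is
    // only viable while collecting entities (assumption of this sketch).
    const shouldMatchEmpty = (): boolean => entityCollecting;

    // identifier | {shouldMatchEmpty()}? <empty>
    const identifierOrEmpty = (): string => {
        const t = peek();
        if (isIdentifier(t)) {
            pos++;
            return t;
        }
        if (shouldMatchEmpty()) return ""; // consume nothing
        throw new Error(`expected column name at token ${pos}`);
    };

    // columnName : part ('.' part)*
    const columnName = (): string => {
        const parts = [identifierOrEmpty()];
        while (peek() === ".") {
            pos++;
            parts.push(identifierOrEmpty());
        }
        return parts.join(".");
    };

    expect("SELECT");
    const column = columnName();
    expect("FROM");
    const table = peek();
    if (!isIdentifier(table)) throw new Error(`expected table name at token ${pos}`);
    return { column, table };
}
```

With `entityCollecting` set to `true`, the tokens of `SELECT t1. FROM t1` parse to a column of `"t1."` and a table of `"t1"`; with `false`, the same input throws, which mirrors keeping validation strict outside the entity-collecting context.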

In general, the grammar splits columns into two rules, `columnName` and `columnNamePath`: `columnName` is matched in select items, and `columnNamePath` is matched in expressions.

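There are currently two ways to solve this problem with semantic predicates:

Option 1

Add the empty-column semantic predicate alternative to `columnNamePath` as well. This causes a fairly large number of syntax-validation unit tests to fail (a semantic predicate alternative affects prediction even when it does not match), and the scope of the impact cannot be determined. PGSQL currently tries this approach.

Option 2

Follow the minimal-change principle (the approach currently being tried): add semantic predicates after the specific rules (`where`, `order by`, `join on`, etc.). This rarely breaks the existing unit tests.

Option 1 gives very good results, but it requires a deeper analysis of how semantic predicates affect ANTLR4's prediction, error recovery, and other phases, so that features outside the entity-collecting context are not affected.
Option 2 requires handling many different `expression` scenarios, for example `join ... on t1.id = t2.id`, where the comparison operator `=` may be nested recursively.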
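The recursion concern in option 2 can be sketched with a toy expression parser. This is an illustration only, not the grammar in this PR: `collectOnColumns`, the three-level rule layout (`expression`, `comparison`, `operand`), and the `entityCollecting` flag are all hypothetical. It assumes the tokens of an `on` condition and shows that an empty column can appear at any nesting depth, so the empty alternative has to be reachable from every operand position.

```typescript
// Hypothetical sketch: collects column paths from an ON condition,
// allowing empty parts only while collecting entities.
function collectOnColumns(tokens: string[], entityCollecting: boolean): string[] {
    const columns: string[] = [];
    let pos = 0;
    const peek = (): string | undefined => tokens[pos];
    const isName = (t: string | undefined): t is string =>
        t !== undefined && /^[A-Za-z_][A-Za-z0-9_]*$/.test(t) && t.toUpperCase() !== "AND";
    const isAnd = (): boolean => {
        const t = peek();
        return t !== undefined && t.toUpperCase() === "AND";
    };

    // name | <empty> (the empty branch is gated, like a semantic predicate)
    const namePart = (): string => {
        const t = peek();
        if (isName(t)) {
            pos++;
            return t;
        }
        if (entityCollecting) return "";
        throw new Error(`expected column at token ${pos}`);
    };

    // operand : '(' expression ')' | namePart ('.' namePart)*
    const operand = (): void => {
        if (peek() === "(") {
            pos++;
            expression(); // recursion: nested comparisons can hold empty columns too
            if (peek() !== ")") throw new Error(`expected ) at token ${pos}`);
            pos++;
            return;
        }
        const parts = [namePart()];
        while (peek() === ".") {
            pos++;
            parts.push(namePart());
        }
        columns.push(parts.join("."));
    };

    // comparison : operand ('=' operand)*
    const comparison = (): void => {
        operand();
        while (peek() === "=") {
            pos++;
            operand();
        }
    };

    // expression : comparison (AND comparison)*
    function expression(): void {
        comparison();
        while (isAnd()) {
            pos++;
            comparison();
        }
    }

    expression();
    if (pos !== tokens.length) throw new Error(`unexpected token at ${pos}`);
    return columns;
}
```

For the tokens of `t1.id = t2.` this returns `["t1.id", "t2."]`, and the same gated branch is what lets a parenthesized, nested comparison with a missing operand still produce a complete list of column slots.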

With semantic predicates in place, the parse tree can largely be kept complete.

Preview: https://jackwang032.github.io/monaco-sql-languages/

mumiao and others added 15 commits November 26, 2025 15:28
…Offset (#426)

* test: #424 syntax after comments

* fix(common): #424 allTokens slice when caretTokenIndex use tokenIndexOffset
* test: #432 validate unComplete sql

* fix: #432 remove error rule
* feat: support queryResult and derived table entities collecting

* feat: support query result and derived table entity collecting

* test: enhance hive and spark entity collect test case

* fix: remove _ctx and add tokenIndex into position

* fix: rename declareType COMMON to LITERAL

* fix: optimize entity collector and update  grammar

* test: add derived table and query result entities test case

* fix: remove isCaretInDerivedTableStmt and set default isAccessible to null

* fix: update _caretStmt docs

* test: add isAccessible test case

* fix: skip _caretStmt ts check

* docs: update README to include additional entity information

* test: fix create view test case

* fix:  import from error sql module

* test: update entity collection tests

* fix: remove unused type
@JackWang032 JackWang032 marked this pull request as draft December 17, 2025 11:32
@JackWang032 (Collaborator, Author)

One more thing to note: non-reserved keywords may be recognized as aliases, so some special handling is needed.

@Cythia828 Cythia828 marked this pull request as ready for review March 24, 2026 03:12
@Cythia828 Cythia828 marked this pull request as draft March 24, 2026 03:13
@Cythia828 Cythia828 changed the base branch from next to feat/emptyColumnPrepare March 24, 2026 03:13
@Cythia828 Cythia828 marked this pull request as ready for review March 24, 2026 03:56
@Cythia828 Cythia828 marked this pull request as draft March 24, 2026 03:58
@Cythia828 Cythia828 marked this pull request as ready for review March 24, 2026 03:58
Cythia828 pushed a commit to Cythia828/dt-sql-parser that referenced this pull request Mar 24, 2026
- Add shouldMatchEmpty() method to SQLParserBase
- Add emptyColumn rule to PostgreSQL grammar
- Add exitTarget_empty method to entity collector
- Update grammar files to remove semantic predicates
- Update tests to expect empty column entities
- Regenerate all parser files

Fixes: DTStack#457
Cythia828 pushed a commit to Cythia828/dt-sql-parser that referenced this pull request Mar 24, 2026


- Update entityCollector.ts to keep empty column entities
- Add exitTarget_empty method to postgreEntityCollector.ts
- Update Hive and MySQL tests to expect empty column entities

Restores modifications lost during rebase.
@Cythia828 Cythia828 added the 5.29 label May 19, 2026
3 participants