Skip to content

fix comment handling #150

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

apstndb
Copy link
Collaborator

@apstndb apstndb commented Mar 23, 2023

This PR changes comment handling in the statement separator.
Comments are replaced by a single whitespace if the next character is not whitespaces.

It may be better to check previous character of the comment but it is not easy without changing structure.

fixes #149

@apstndb apstndb requested a review from yfuruyama March 23, 2023 12:36
Copy link
Collaborator

@yfuruyama yfuruyama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for finding this edge case! I have left a few comments.

separator.go Outdated

// no comment found
if terminate == "" {
// emit space if comments are found and the next character is not a whitespace
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have two questions.

  1. Initially, this was a little bit difficult for me to understand because the block under if found && i < len(s.str) { is executed in the 2nd iteration of the outer loop. Doesn't it work if we just put s.sb.WriteRune(' ') at here and here? Both lines are points where the comment termination found.
  2. It looks like the whitespace is not inserted if the next character is a whitespace. Do we need to care about whether the next character is a whitespace or not? Why not simply converting "a comment" to the "whitespace"?
Copy link
Collaborator Author

@apstndb apstndb Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to discuss about expected behavior as a design decision.

  • If we can think only token separation.
    • We can always put the whitespace.
  • If we need to concern about query text in query stats, audit logs, etc.
    • We may need to define some expected behaviors.

example1.
'SELECT 1# comment' → 'SELECT 1' or 'SELECT 1 '?
Need to consider about end of string?
This space will be always trimmed because of strings.TrimSpace

example2 from test case

# comment;
SELECT /* comment */ 1; --comment
SELECT 2;/* comment */

Can it be interpret as SELECT 1 and SELECT 2?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: example1 will not be SELECT 1 because strings.TrimSpace is applied
https://github.com/cloudspannerecosystem/spanner-cli/blob/v0.9.16/separator.go#L290

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current implementation is already stripping the comment from the query, so I think users won't expect the query is intact when writing a comment in the query string.

Having said that, it might be good idea to change the current implementation to keep the comment in query string. Then we can defer to the Spanner server side for the handling of comment characters in a query string. (It looks like Spanner keeps the comment characters in query stats.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.
Currently, Cloud Spanner rejects comments in DDL so comment strip is needed for DDLs,
but DML and SELECT statements can process comments and there are motivations to keep the comment.

Note: I have some experimental implementation of comment preserving separator which is derived on this statement separator.
https://github.com/apstndb/gsqlsep

Copy link
Collaborator Author

@apstndb apstndb Mar 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May be a solution.

  1. Separate queries (preserving comments)
  2. Strip comments and detect kind of statement
  3. a. execute stripped DDL
    b. execute raw DML/SELECT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! Is it possible to implement the solution in https://github.com/apstndb/gsqlsep and switch to use it from spanner-cli?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to PoC using gsqlsep.

@apstndb
Copy link
Collaborator Author

apstndb commented Mar 24, 2023

I have experimentally replaced statement separator with gsqlsep and implement comment preserving(only SelectStatement and DmlStatement)

Copy link
Collaborator

@yfuruyama yfuruyama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing with gsqlsep!

cli.go Outdated
@@ -328,13 +329,16 @@ func readInteractiveInput(rl *readline.Instance, prompt string) (*inputStatement
}
input += line + "\n"

statements := separateInput(input)
statements := gsqlsep.SeparateInputPreserveComments(input, `\G`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
statements := gsqlsep.SeparateInputPreserveComments(input, `\G`)
statements := gsqlsep.SeparateInputPreserveComments(input, delimiterVertical)

might be better? Same for others.

cli.go Outdated
@@ -516,3 +520,7 @@ func handleInterrupt(cancel context.CancelFunc) {
<-c
cancel()
}

func stripComments(input string) string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a random idea.

Random idea 1

What do you think about changing gsqlsep.InputStatement as follows,

type InputStatement struct {
	Statement                string
	StatementWithoutComment  string
	Terminator               string
}

and use StatementWithoutComment for input match and Statement for other use case? Then we don't need to call gsqlsep.Separate... functions in two difference places.

Random idea 2

What do you think about adding StripComments() method to gsqlsep.InputStatement?

return `\G`
}
return ""
delim string
}

func separateInput(input string) []inputStatement {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this function, why not calling this function in other places? (not directly call gsqlsep functions)

@apstndb
Copy link
Collaborator Author

apstndb commented Mar 24, 2023

I added InputStatement.StripComments into gsqlsep and enhanced separateInput in spanner-cli.
gsqlsep is no longer directly referenced outside of separateInput.

@apstndb apstndb requested a review from yfuruyama March 24, 2023 13:27
separator.go Outdated
s.sb.Reset()
}
var result []inputStatement
for _, stmt := range gsqlsep.SeparateInputPreserveComments(input, `\G`) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
for _, stmt := range gsqlsep.SeparateInputPreserveComments(input, `\G`) {
for _, stmt := range gsqlsep.SeparateInputPreserveComments(input, delimiterVertical) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

separator.go Outdated
}
var result []inputStatement
for _, stmt := range gsqlsep.SeparateInputPreserveComments(input, `\G`) {
stmt := stmt
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason we need to capture this in a local variable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is needed.

@apstndb apstndb requested a review from yfuruyama March 27, 2023 04:49
Copy link
Collaborator

@yfuruyama yfuruyama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@yfuruyama yfuruyama merged commit a45da2a into cloudspannerecosystem:master Mar 27, 2023
@apstndb apstndb deleted the fix-comment-as-separator branch March 27, 2023 05:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants