Skip to content

Non-ASCII character support#45736

Merged
nodejs-github-bot merged 1 commit into
nodejs:mainfrom
mertcanaltin:bug-45706
Feb 18, 2023
Merged

Non-ASCII character support#45736
nodejs-github-bot merged 1 commit into
nodejs:mainfrom
mertcanaltin:bug-45706

Conversation

@mertcanaltin
Copy link
Copy Markdown
Member

@mertcanaltin mertcanaltin commented Dec 4, 2022

Fixes #45706
Fixes #46508

  • Is the character's ASCII code greater than 127? (In this case, the function returns true)
  • Is the character a number? (In this case, the function returns false)
  • Is the character a "#" symbol? (In this case, the function returns false)
  • If no conditions are met, the function returns true.
@nodejs-github-bot nodejs-github-bot added dont-land-on-v14.x needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem. labels Dec 4, 2022
@mertcanaltin
Copy link
Copy Markdown
Member Author

mertcanaltin commented Dec 4, 2022

test

 const  isLiteralSymbol = (char) => {
    const code = char.charCodeAt(0);

    if (code > 127) {
      return true;
    }

    if (char >= '0' && char <= 9) {
      return false;
    }

    if (code === 35) {
      return false;
    }

    return true;
  }

output

isLiteralSymbol('أهلا')
true
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread test/parallel/test-runner-tap-lexer.js Outdated
@anonrig anonrig added the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Dec 4, 2022
@mertcanaltin
Copy link
Copy Markdown
Member Author

@anonrig I applied the changes, thank you very much for the review 🚀

@mertcanaltin mertcanaltin requested a review from anonrig December 4, 2022 22:36
Comment thread test/parallel/test-runner-tap-lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
@anonrig anonrig requested a review from MoLow December 4, 2022 23:03
@anonrig
Copy link
Copy Markdown
Member

anonrig commented Dec 4, 2022

@manekinekko Can you review this?

Copy link
Copy Markdown
Member

@MoLow MoLow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the change LGTM, but this seems to break quite a few tests

Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread test/parallel/test-runner-tap-lexer.js Outdated
assert.strictEqual(isLiteralSymbol('#'), false);
assert.strictEqual(isLiteralSymbol('أ'), true);
assert.strictEqual(isLiteralSymbol('ت'), true);
assert.strictEqual(isLiteralSymbol('ث'), true);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you:

  1. group these into 2 groups: literals (true) and non-literals (false)?
  2. add more samples of non-Latin characters? you can cherry-pick from this list here.
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manekinekko thank you so much i will fly here 🚀

Copy link
Copy Markdown
Member Author

@mertcanaltin mertcanaltin Dec 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@manekinekko would that be healthy

{
  const literals = ['A', 'a', '-', '+', 'أ', 'ت', 'ث', '讲', '演', '講'];
  const nonLiterals = ['0', '#', '\\', '+', '-'];
  
  literals.forEach((literal) => {
  assert.strictEqual(isLiteralSymbol(literal), true);
  });
  
  nonLiterals.forEach((nonLiteral) => {
  assert.strictEqual(isLiteralSymbol(nonLiteral), false);
  });
}
@mertcanaltin mertcanaltin requested review from MoLow, anonrig and manekinekko and removed request for MoLow, anonrig and manekinekko December 5, 2022 14:59
@mertcanaltin
Copy link
Copy Markdown
Member Author

I'm so sorry I triggered you all to review, sorry for the extra notifications :/

Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread lib/internal/test_runner/tap_lexer.js Outdated
Comment thread test/parallel/test-runner-tap-lexer.js
@mertcanaltin
Copy link
Copy Markdown
Member Author

@manekinekko I sent the edits thank you very much

Copy link
Copy Markdown
Member

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we really accept zero width characters as acceptable input? I would skip all of them. We also already have a function to check for these:

const isZeroWidthCodePoint = (code) => {
return code <= 0x1F || // C0 control codes
(code >= 0x7F && code <= 0x9F) || // C1 control codes
(code >= 0x300 && code <= 0x36F) || // Combining Diacritical Marks
(code >= 0x200B && code <= 0x200F) || // Modifying Invisible Characters
// Combining Diacritical Marks for Symbols
(code >= 0x20D0 && code <= 0x20FF) ||
(code >= 0xFE00 && code <= 0xFE0F) || // Variation Selectors
(code >= 0xFE20 && code <= 0xFE2F) || // Combining Half Marks
(code >= 0xE0100 && code <= 0xE01EF); // Variation Selectors
};

@mertcanaltin
Copy link
Copy Markdown
Member Author

mertcanaltin commented Dec 7, 2022

I didn't accept zero-width characters @BridgeAR i will update the code like this

  if (typeof char !== 'string' || util.inspect.isZeroWidthCodePoint(char)) {
      return false;
    }

can I do it like this?

@mertcanaltin mertcanaltin reopened this Dec 7, 2022
@cjihrig
Copy link
Copy Markdown
Contributor

cjihrig commented Dec 12, 2022

@mertcanaltin I think that would be ok.

@MoLow
Copy link
Copy Markdown
Member

MoLow commented Feb 8, 2023

CC @nodejs/test_runner @manekinekko additional reviews will be appreciated

@MoLow
Copy link
Copy Markdown
Member

MoLow commented Feb 8, 2023

@anonrig can you please dismiss your review/approve?

@MoLow MoLow removed the commit-queue-squash Add this label to instruct the Commit Queue to squash all the PR commits into the first one. label Feb 8, 2023
@MoLow MoLow added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. and removed author ready PRs that have at least one approval, no pending requests for changes, and a CI started. labels Feb 9, 2023
@MoLow MoLow added the request-ci Add this label to start a Jenkins CI on a PR. label Feb 18, 2023
@github-actions github-actions Bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Feb 18, 2023
@MoLow MoLow added the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 18, 2023
@nodejs-github-bot nodejs-github-bot removed the commit-queue Add this label to land a pull request using GitHub Actions. label Feb 18, 2023
@nodejs-github-bot nodejs-github-bot merged commit 3c6547f into nodejs:main Feb 18, 2023
@nodejs-github-bot
Copy link
Copy Markdown
Collaborator

Landed in 3c6547f

@mertcanaltin mertcanaltin changed the title [WIP] Non-ASCII character support Non-ASCII character support Feb 18, 2023
@mertcanaltin
Copy link
Copy Markdown
Member Author

🎉

MoLow pushed a commit to MoLow/node that referenced this pull request Feb 18, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MylesBorins pushed a commit that referenced this pull request Feb 18, 2023
PR-URL: #45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
@MylesBorins MylesBorins mentioned this pull request Feb 19, 2023
MylesBorins pushed a commit that referenced this pull request Feb 20, 2023
PR-URL: #45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
MoLow pushed a commit to MoLow/node that referenced this pull request Feb 25, 2023
PR-URL: nodejs#45736
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
juanarbol pushed a commit that referenced this pull request Mar 3, 2023
PR-URL: #45736
Backport-PR-URL: #46839
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
@juanarbol juanarbol mentioned this pull request Mar 3, 2023
juanarbol pushed a commit that referenced this pull request Mar 5, 2023
PR-URL: #45736
Backport-PR-URL: #46839
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Moshe Atlow <moshe@atlow.co.il>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-ci PRs that need a full CI run. test_runner Issues and PRs related to the test runner subsystem.

9 participants