The Wayback Machine - https://web.archive.org/web/20201106004137/https://github.com/SeanLee97/xmnlp/issues/17

About special nouns #17

Open
dwbaron opened this issue Apr 28, 2019 · 5 comments

Comments

@dwbaron dwbaron commented Apr 28, 2019

Abbreviated bond names such as “02进出04” and special nouns such as “5G” get broken apart during word segmentation.

Owner

@SeanLee97 SeanLee97 commented Apr 28, 2019

Thanks for your suggestion. We will fix the problem that proper nouns made up of digits and English letters are not detected.

Author

@dwbaron dwbaron commented Apr 28, 2019

As a temporary fix, I tried combining a trie (which performs exact matching) with the HMM segmenter.
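A minimal sketch of that idea, not the code used in this thread or in xmnlp: a trie holds user-supplied terms and is tried first with longest-match lookup, and any text between matches is handed to a fallback segmenter. The names `Trie`, `segment`, and `char_fallback` are hypothetical, and `char_fallback` is only a placeholder where a real HMM segmenter would be plugged in.

```python
from typing import Callable, Dict, List


class Trie:
    """Minimal character trie for exact, longest-match lookup of user terms."""

    def __init__(self) -> None:
        self.root: Dict = {}

    def add(self, word: str) -> None:
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$end"] = True  # multi-character key, cannot clash with a single char

    def longest_match(self, text: str, start: int) -> int:
        """Return the end index of the longest term starting at `start`, or -1."""
        node, end = self.root, -1
        for i in range(start, len(text)):
            node = node.get(text[i])
            if node is None:
                break
            if "$end" in node:
                end = i + 1
        return end


def segment(text: str, trie: Trie, fallback: Callable[[str], List[str]]) -> List[str]:
    """Emit trie matches as single tokens; pass everything else to `fallback`."""
    tokens: List[str] = []
    buf: List[str] = []
    i = 0
    while i < len(text):
        end = trie.longest_match(text, i)
        if end == -1:
            buf.append(text[i])  # no dictionary term starts here
            i += 1
            continue
        if buf:
            tokens.extend(fallback("".join(buf)))
            buf = []
        tokens.append(text[i:end])
        i = end
    if buf:
        tokens.extend(fallback("".join(buf)))
    return tokens


def char_fallback(chunk: str) -> List[str]:
    """Hypothetical stand-in for an HMM segmenter: one token per character."""
    return list(chunk)


trie = Trie()
trie.add("5G")
trie.add("02进出04")
print(segment("买入02进出04和5G", trie, char_fallback))
# ['买', '入', '02进出04', '和', '5G']
```

The dictionary terms survive intact regardless of what the fallback does with the surrounding text, which is why this works as a stopgap; the cost is that every special term has to be listed in advance.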

Author

@dwbaron dwbaron commented Apr 28, 2019

It seems that you first split the sentence on Chinese characters. Would splitting on English characters work better?
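To illustrate the question, here is a rough sketch of a pre-splitting step that keeps runs of ASCII letters and digits together as their own blocks, alongside runs of Chinese characters. This is an assumption about what the comment proposes, not xmnlp's actual code; `BLOCK_RE` and `split_blocks` are hypothetical names, and the character ranges are a simplification.

```python
import re
from typing import List

# Assumed pattern: a run of ASCII letters/digits, or a run of CJK ideographs.
# Anything else (punctuation, whitespace) is dropped in this simplified sketch.
BLOCK_RE = re.compile(r"[A-Za-z0-9]+|[\u4e00-\u9fff]+")


def split_blocks(text: str) -> List[str]:
    """Split text into alphanumeric blocks and Chinese blocks."""
    return BLOCK_RE.findall(text)


print(split_blocks("华为发布5G手机"))  # ['华为发布', '5G', '手机']
print(split_blocks("02进出04"))        # ['02', '进出', '04']
```

With this kind of splitting, “5G” stays in one piece, but a mixed term such as “02进出04” is still cut at the script boundaries, so a dictionary lookup would still be needed for that case.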

Author

@dwbaron dwbaron commented May 5, 2019

image
I tried to work out this English-character problem by following my solution above.

Owner

@SeanLee97 SeanLee97 commented May 5, 2019

Thanks for your suggestions! I will fix the problem when I am free. If you would like to contribute to this repo, feel free to open a PR. I look forward to your contributions!
