Redesign SymbolStore and SymbolProvider #672

as-cii · 2016-02-15T16:23:32Z

This PR changes SymbolStore and SymbolProvider so that they're faster when retrieving suggestions and when computing symbols.

Before

Previously, these two objects stored each symbol together with the buffer rows it appeared in. This was a problem in a number of ways, especially after buffer changes, when we had to scan through all those symbols to adjust their positions according to the change. In addition, we did this synchronously, blocking the UI thread until every symbol had been scanned.

(~ 350ms, benchmarks performed by writing 3 letters with 500 cursors)

Also, searching SymbolStore involved a series of calls to selectorMatchesScopeChain, which caused major slowdowns every time SymbolProvider.prototype.getSuggestions() was called.

(~ 150ms, benchmarks performed by searching for cor in benchmark/large.js)

After

With this PR we're changing the way we store symbols (no longer indexing them by buffer row) and we now compute and score suggestions (e.g. locality score, fuzzy score) as we traverse the lines. This lets us perform very minimal bookkeeping on each buffer change while still being fast when retrieving possible autocompletions. Moreover, we now make use of TextBuffer.prototype.onDidStopChanging, so that writing one or ten letters in a row doesn't cause the main thread to stall.
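To illustrate the "minimal bookkeeping now, recompute later" idea, here is a rough sketch in TypeScript. It is not the code from this PR: `ChangeBatcher`, `markChanged` and `flush` are hypothetical names, and the assumption is that `flush` would be driven by a TextBuffer `onDidStopChanging` callback.

```typescript
// Hypothetical helper, not part of autocomplete-plus: it only records which
// rows changed and defers the expensive work until flush() is called.
class ChangeBatcher {
  private dirtyRows: { [row: string]: boolean } = Object.create(null);

  // Cheap bookkeeping, safe to call on every keystroke.
  markChanged(row: number): void {
    this.dirtyRows[String(row)] = true;
  }

  // Expensive work, meant to run once editing pauses (assumption: called from
  // a TextBuffer `onDidStopChanging` callback). Returns the rows recomputed.
  flush(recomputeRow: (row: number) => void): number[] {
    const rows = Object.keys(this.dirtyRows)
      .map((key) => parseInt(key, 10))
      .sort((a, b) => a - b);
    this.dirtyRows = Object.create(null);
    rows.forEach(recomputeRow);
    return rows;
  }
}
```

Typing ten letters on the same row marks that row ten times but recomputes it only once, which is the property that keeps the main thread responsive while the user is still typing.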

This is what symbol re-computation looks like after these changes:

(~ 46ms, 8x faster 🐎)

When storing tokens, we keep an index of symbols by letter for each buffer row. When retrieving them, we exploit the fact that a symbol needs to start with the prefix's first letter and, therefore, we score only tokens that match that first character.
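A minimal sketch of that per-row, first-letter index, written in TypeScript for illustration only. The names `RowIndex`, `indexRow` and `candidatesFor` are hypothetical, the word-splitting regex is an assumption, and the real scoring (fuzzy score, locality score) is left out; the sketch only shows how the first letter narrows the set of tokens that would need scoring.

```typescript
// Hypothetical sketch: one small index per buffer row, keyed by the
// lowercased first letter of each symbol found on that row.
type RowIndex = { [firstLetter: string]: string[] };

// Build the index for a single line (the word regex is an assumption).
function indexRow(line: string): RowIndex {
  const index: RowIndex = Object.create(null);
  const words = line.match(/[A-Za-z_$][\w$]*/g) || [];
  for (let i = 0; i < words.length; i++) {
    const key = words[i].charAt(0).toLowerCase();
    if (!index[key]) index[key] = [];
    index[key].push(words[i]);
  }
  return index;
}

// Collect the unique symbols worth scoring for `prefix`: only those whose
// first letter matches the prefix's first letter are ever looked at.
function candidatesFor(rows: RowIndex[], prefix: string): string[] {
  const key = prefix.charAt(0).toLowerCase();
  const seen: { [word: string]: boolean } = Object.create(null);
  const result: string[] = [];
  for (let r = 0; r < rows.length; r++) {
    const bucket = rows[r][key] || [];
    for (let i = 0; i < bucket.length; i++) {
      if (!seen[bucket[i]]) {
        seen[bucket[i]] = true;
        result.push(bucket[i]);
      }
    }
  }
  return result;
}
```

Every row is still visited, but within each row only one bucket is read, so symbols starting with any other letter are skipped without being scored.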

(~ 53ms, 3x faster 🐎)

Storing that index on a per-row basis has the advantage of not requiring us to deal with the "moving nature" of buffer rows, but forces us to iterate over every single line. Ideally, we may want to build a smarter data structure (e.g. a tree) that indexes letters across rows; on the other hand, the array-based implementation has pretty decent performance, and I believe we can avoid the added complexity of introducing an extra data structure for the time being.

Conclusion

Overall, this should allow us to take autocomplete-plus out of the multi-cursor slowness equation. I still feel like there's room for improvement (e.g. see the cross-row index mentioned in the previous section), but I think this should allow us to move forward and deal with other bottlenecks in packages and tokenization.

/cc: @nathansobo @maxbrunsfeld @benogle for 👀


Restricting the query to symbols with a matching first letter allows us to skip a lot of symbols, which makes the iteration much faster. Moreover, splicing into the SymbolStore stays fast because, as soon as lines get removed, the index is automatically kept in sync (it is stored per line). On a file with ~ 26k lines, autocompleting a word takes < 80ms.
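The "kept in sync automatically" point can be sketched as follows, in TypeScript and under stated assumptions: `LineSymbolStore`, `tokenizeLine`, `symbolsAt` and `rowCount` are hypothetical names, not the package's actual API. Because each row's symbols live in an array slot, removing or inserting lines is a single array splice and no stored row numbers need adjusting.

```typescript
// Assumed word-splitting rule, for illustration only.
function tokenizeLine(line: string): string[] {
  const match = line.match(/[A-Za-z_$][\w$]*/g);
  return match ? match.slice() : [];
}

// Hypothetical sketch, not the package's SymbolStore: one entry per buffer row.
class LineSymbolStore {
  private lines: string[][] = [];

  // Replace `oldRowCount` rows starting at `startRow` with the symbols of
  // `newLines`. Later rows shift implicitly, so nothing is renumbered.
  splice(startRow: number, oldRowCount: number, newLines: string[]): void {
    const entries = newLines.map((line) => tokenizeLine(line));
    this.lines.splice(startRow, oldRowCount, ...entries);
  }

  symbolsAt(row: number): string[] {
    return this.lines[row] || [];
  }

  rowCount(): number {
    return this.lines.length;
  }
}
```

Deleting a row drops its symbols together with it, which is the contrast with the earlier design where every stored symbol position had to be adjusted after a change.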

as-cii · 2016-02-15T17:36:46Z

lib/symbol-provider.coffee

 {Selector} = require 'selector-kit'
 SymbolStore = require './symbol-store'

+# TODO: remove this when `onDidStopChanging(changes)` ships on stable.
+bufferSupportsStopChanging = -> typeof TextBuffer::onDidChangeText is "function"


I am testing for the presence of onDidChangeText here because I couldn't find any other efficient heuristic to determine whether or not atom/text-buffer#126 shipped. This is temporary, though, so it might not be that important.


as-cii added 16 commits Feb 12, 2016

WIP: Start refactoring SymbolProvider

c164fc1

Add an alternative SymbolStore

3a1d07e

Revert "WIP: Start refactoring SymbolProvider"

8ba44a4

This reverts commit c164fc1.

Implement counts by word

9e5db56

Implement completion from all buffers

77b3802

Restrict query to symbols with matching first letter

3be5e2d

Implement locality

985712b

🐎 Use new SymbolStore in SymbolProvider

bc4caa5

Test SymbolProvider through getSuggestions

43286d2

🎨

c487915

Fix SymbolStore spec and deprecate old store

ac144de

🐎 Recompute tokens when buffer stops changing

3bfcb8f

🎨

f9370cd

Use extents instead of ranges

3889311

💚 Make the change backwards-compatible

dc03ad5

as-cii reviewed Feb 15, 2016

🎨 Fix typo

d3c0813

as-cii added a commit that referenced this pull request Feb 21, 2016

Merge pull request #672 from atom/as-redesign-symbol-store-provider

f0602b1

Redesign SymbolStore and SymbolProvider

as-cii merged commit f0602b1 into master Feb 21, 2016
0 of 3 checks passed

as-cii deleted the as-redesign-symbol-store-provider branch Feb 21, 2016

as-cii mentioned this pull request Feb 24, 2016

SymbolStore.adjustBufferRows takes too much time in large files #452

Closed

benogle mentioned this pull request Feb 26, 2016

Atom hangs for ~30s when pasting a large text block into a new tab. atom/atom#10855

Closed
