The Wayback Machine - https://web.archive.org/web/20210923091433/https://github.com/MonoGame/MonoGame/issues/7351

LocalizedFontProcessor extremely poor performance #7351

Open
jamesford42 opened this issue Sep 8, 2020 · 15 comments · May be fixed by #7464

Comments

jamesford42 (Contributor) commented Sep 8, 2020

LocalizedFontProcessor tends to be given a very large number of glyphs, and its performance degrades badly as that number grows. I believe what actually needs optimizing is the glyph arrangement algorithm in the base type FontDescriptionProcessor, which currently scales like N^N or worse.

What version of MonoGame does the bug occur on:

  • Console branch

What operating system are you using:

  • Windows

What MonoGame platform are you using:

enkeyz commented Feb 6, 2021

There are three nested foreach loops in one of the methods in LocalizedFontProcessor.cs :) That's why. It should be an easy fix. I'd like to help, but my C# is rusty.

TechnologicalPizza (Contributor) commented Feb 6, 2021

CharacterCollection (which pretty much just wraps a list, nothing else) uses a List instead of a HashSet.
There are plenty more nitpicks all over font processing though.
Someone should profile the content builder and show some hot paths for a good start.
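To illustrate the List-vs-HashSet point, here is a minimal sketch (illustrative code, not the actual MonoGame types):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// CharacterCollection wraps a List<char>, so every membership test is a
// linear scan; a HashSet<char> answers Contains in O(1) on average.
var codepoints = Enumerable.Range(32, 4000).Select(i => (char)i).ToList();

var asList = codepoints;                    // List<char>.Contains: O(n)
var asSet  = new HashSet<char>(codepoints); // HashSet<char>.Contains: O(1)

char probe = (char)(32 + 3999);
Console.WriteLine(asList.Contains(probe));  // same answer...
Console.WriteLine(asSet.Contains(probe));   // ...far cheaper per query
```

With thousands of glyphs queried inside nested loops, the linear scans compound quickly.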

enkeyz commented Feb 7, 2021

I took a deeper look at LocalizedFontProcessor.cs. It would make more sense to cache the contents of the .resx files as parsed XmlDocuments, instead of constructing new XmlDocuments in a foreach loop every time the Process method is called.

So instead of O(n^3), we get O(n^2) time complexity with O(n) space.

What do you think @tomspilman ?
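A minimal sketch of that caching idea, assuming a hypothetical LoadCached helper keyed by absolute path (illustrative names, not the actual MonoGame API):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical cache: each .resx is parsed once and the XmlDocument is
// reused, instead of being re-loaded inside the processor's loop on
// every Process() call.
var cache = new Dictionary<string, XmlDocument>(StringComparer.OrdinalIgnoreCase);

XmlDocument LoadCached(string absolutePath)
{
    if (!cache.TryGetValue(absolutePath, out var doc))
    {
        doc = new XmlDocument();
        doc.Load(absolutePath);        // parse once
        cache[absolutePath] = doc;
    }
    return doc;                        // reuse thereafter
}

// Demo with a stand-in .resx file:
string path = Path.Combine(Path.GetTempPath(), "demo.resx");
File.WriteAllText(path, "<root><data><value>hello</value></data></root>");

var first = LoadCached(path);
var second = LoadCached(path);
Console.WriteLine(ReferenceEquals(first, second)); // parsed only once
```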

jamesford42 (Contributor, Author) commented Feb 9, 2021

I think the hot path for me was placing a sprite for each glyph by linearly looping over every other sprite in the sheet to test for overlap/collision, then offsetting by a single pixel and repeating -- this gets very, very slow when you have thousands of glyphs.

The optimizations I had in mind:

  1. Don't use the current collision-testing placement system at all; just sort all the sprites by height, then place them in rows.
  2. If we are going to keep the existing system, optimize the collision testing with a spatial partitioning structure.
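Option 1 above is essentially classic "shelf" packing. A minimal sketch, with made-up glyph sizes:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Shelf packing: sort glyph rectangles by height, then fill rows
// left-to-right. Deterministic and near-linear, at the cost of some
// wasted space compared to tighter packers.
List<(int W, int H)> glyphs = Enumerable.Range(0, 500)
    .Select(i => (W: 5 + i % 13, H: 8 + i % 7))
    .ToList();

int sheetWidth = 256;
var placed = new List<(int X, int Y, int W, int H)>();

int x = 0, y = 0, rowHeight = 0;
foreach (var g in glyphs.OrderByDescending(g => g.H))
{
    if (x + g.W > sheetWidth)   // row full: start a new shelf below
    {
        x = 0;
        y += rowHeight;
        rowHeight = 0;
    }
    placed.Add((x, y, g.W, g.H));
    x += g.W;
    rowHeight = Math.Max(rowHeight, g.H);
}

Console.WriteLine($"Packed {placed.Count} glyphs into height {y + rowHeight}");
```

Sorting by descending height keeps each shelf close to fully used, since every glyph on a row is at most as tall as the first one.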
mrhelmut (Contributor) commented Feb 9, 2021

Any optimization would be welcome here.

Have you run a profiler over this? I would definitely assume CharacterCollection to be part of the problem. I may be wrong, but if I had to guess, I would say that a lot of time is also lost parsing the XML in xmlDocument.Load(absolutePath); and foreach (XmlNode xmlNode in xmlDocument.SelectNodes("root/data/value")).

TechnologicalPizza (Contributor) commented Feb 9, 2021

FindIntersectingGlyph is absolutely the biggest time hog here.
[profiler screenshot]
(Arial 12px, ranges [32..126] and [19968..20992] = 1120 glyphs)

I plan to make a PR for this now. A simple quad tree could speed this up immensely.
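For reference, a uniform grid is an even simpler spatial partition than a quad tree and shows the same idea: overlap queries only visit rectangles in nearby cells instead of every placed glyph. A hedged sketch, not the PR's code:

```csharp
using System;
using System.Collections.Generic;

// Uniform grid over the sprite sheet: each placed rectangle is
// registered in every cell it touches, so an intersection query only
// scans the buckets under the candidate rectangle.
const int Cell = 32;
var grid = new Dictionary<(int, int), List<(int X, int Y, int W, int H)>>();

void Insert((int X, int Y, int W, int H) r)
{
    for (int cx = r.X / Cell; cx <= (r.X + r.W - 1) / Cell; cx++)
    for (int cy = r.Y / Cell; cy <= (r.Y + r.H - 1) / Cell; cy++)
    {
        if (!grid.TryGetValue((cx, cy), out var bucket))
            grid[(cx, cy)] = bucket = new List<(int, int, int, int)>();
        bucket.Add(r);
    }
}

bool Intersects((int X, int Y, int W, int H) r)
{
    for (int cx = r.X / Cell; cx <= (r.X + r.W - 1) / Cell; cx++)
    for (int cy = r.Y / Cell; cy <= (r.Y + r.H - 1) / Cell; cy++)
    {
        if (!grid.TryGetValue((cx, cy), out var bucket)) continue;
        foreach (var o in bucket)
            if (r.X < o.X + o.W && o.X < r.X + r.W &&
                r.Y < o.Y + o.H && o.Y < r.Y + r.H)
                return true;
    }
    return false;
}

Insert((0, 0, 10, 10));
Console.WriteLine(Intersects((5, 5, 10, 10)));   // → True (overlaps)
Console.WriteLine(Intersects((50, 50, 10, 10))); // → False (free space)
```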

Jjagg (Contributor) commented Feb 9, 2021

This is a rectangle packing problem.

Here's a great paper on it: http://pds25.egloos.com/pds/201504/21/98/RectangleBinPack.pdf.

A good algorithm for this problem is called MaxRects. I have a C# port of it here. Feel free to copy and adapt it as required.
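For flavor, here is a sketch of a guillotine-split packer, a simpler cousin of MaxRects covered in that same paper; it is illustrative only, not Jjagg's port or the PR's implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Guillotine packing: keep a list of free rectangles; place each glyph
// in the best-fitting one (least wasted area), then split the leftover
// space into two new free rectangles with a straight cut.
var free = new List<(int X, int Y, int W, int H)> { (0, 0, 256, 256) };
var placed = new List<(int X, int Y, int W, int H)>();

bool Place(int w, int h)
{
    int best = -1;
    long bestWaste = long.MaxValue;
    for (int i = 0; i < free.Count; i++)
    {
        var f = free[i];
        if (w > f.W || h > f.H) continue;
        long waste = (long)f.W * f.H - (long)w * h;
        if (waste < bestWaste) { bestWaste = waste; best = i; }
    }
    if (best < 0) return false;     // no free rect fits

    var r = free[best];
    free.RemoveAt(best);
    placed.Add((r.X, r.Y, w, h));

    // Split the remainder: a strip to the right, a strip below.
    if (r.W - w > 0) free.Add((r.X + w, r.Y, r.W - w, h));
    if (r.H - h > 0) free.Add((r.X, r.Y + h, r.W, r.H - h));
    return true;
}

int packed = 0;
foreach (var (w, h) in Enumerable.Range(0, 200).Select(i => (6 + i % 9, 8 + i % 5)))
    if (Place(w, h)) packed++;

Console.WriteLine($"Packed {packed} rects");
```

MaxRects improves on this by letting free rectangles overlap, which packs tighter but needs extra pruning logic.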

@TechnologicalPizza TechnologicalPizza linked a pull request that will close this issue Feb 9, 2021
TechnologicalPizza (Contributor) commented Feb 9, 2021

[profiler screenshot]
I don't know what kind of magic that rect packer is using, but it sure is working 🌪️

Old
[packed sheet image]

New
[packed sheet image]

TechnologicalPizza (Contributor) commented Feb 9, 2021

The next improvement would be recognizing missing glyphs and skipping them, or reusing a single "missing glyph" sprite for all of them.

jamesford42 (Contributor, Author) commented Feb 9, 2021

What's the performance difference? If you want a really huge .resx example i could email you one privately.

Also, you might want to check if the new algorithm is deterministic. That is, if you run it with the same inputs a second time, it outputs the same (read: identical) sprite sheet. That's important for games that version assets, ship patches, etc.
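One cheap way to guarantee determinism is to feed the packer a totally ordered input, e.g. sorting by height with the codepoint as a tie-breaker, so equal-height glyphs can never swap between builds. A small sketch:

```csharp
using System;
using System.Linq;

// Total order: primary key height (descending), tie-break on the
// codepoint. Identical inputs then always yield an identical sequence,
// and a deterministic packer yields an identical sheet.
var glyphs = new[] { ('b', 12), ('a', 12), ('c', 9) };

var ordered = glyphs
    .OrderByDescending(g => g.Item2)  // height
    .ThenBy(g => g.Item1)             // codepoint tie-breaker
    .ToArray();

Console.WriteLine(string.Join(" ", ordered.Select(g => g.Item1))); // → a b c
```

LINQ's OrderBy is a stable sort, but the explicit tie-breaker removes any dependence on the input order entirely.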

TechnologicalPizza (Contributor) commented Feb 9, 2021

The performance on my 1120-glyph sample was 33x faster, though it may be slightly less efficient at packing.
It should absolutely be deterministic, though.
@jamesford42 And you sure can send that .resx example.

jamesford42 (Contributor, Author) commented Feb 9, 2021

@TechnologicalPizza My email is visible on my account, but yours is not, can you contact me?

TechnologicalPizza (Contributor) commented Feb 9, 2021

Surprisingly enough, tweaking how the rect packer grows made it pack exactly like the original implementation.
[packed sheet image]

TechnologicalPizza (Contributor) commented Feb 9, 2021

And the PR #7464 is practically done; now it just needs review.
It would also be nice if others (@jamesford42) tested that PR in particular, to avoid merging something broken.

stromkos (Contributor) commented Mar 11, 2021

I took a shot at this problem and came up with a near-linear time and space glyph packer. It follows the KISS principle.

You can find the branch here: develop...stromkos:GlyphPackerImprovements

I am sure it is not as tightly packed as other implementations, and it may be slower on smaller glyph counts: the above range took 539 ms compared to 287 ms (stock 3.8).

It does two passes: the first calculates the output size (512 x ???), and the second places glyphs using the square root of the calculated size as the width.
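A rough sketch of that two-pass sizing idea (illustrative only, not stromkos's actual code):

```csharp
using System;
using System.Linq;

// Pass 1: sum the glyph areas and take the square root to get a
// near-square target width. Pass 2: simple row placement against that
// width. Glyph sizes here are made up for the demo.
var glyphs = Enumerable.Range(0, 1000)
    .Select(i => (W: 6 + i % 9, H: 10 + i % 5))
    .ToArray();

// Pass 1: total area -> target width.
long area = glyphs.Sum(g => (long)g.W * g.H);
int width = (int)Math.Ceiling(Math.Sqrt(area));

// Pass 2: place in rows against the computed width.
int x = 0, y = 0, rowH = 0;
foreach (var g in glyphs)
{
    if (x + g.W > width) { x = 0; y += rowH; rowH = 0; }
    x += g.W;
    rowH = Math.Max(rowH, g.H);
}

Console.WriteLine($"~{width} x {y + rowH} sheet for {glyphs.Length} glyphs");
```

Targeting the square root of the total area keeps the sheet roughly square, which matters when texture dimensions are capped.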

All of the drawbacks above notwithstanding, this code produced a spritefont XNB file containing all 41354 unique glyphs in the BMP plane of "Noto Sans CJK JP", at size 24, with a texture size of 6280x6156, in 22 seconds. (From .mgstats: Size 41616806, Seconds 21.603508.)
With a size of 65, the largest below 16384 x 16384, it ran in just over 2 minutes.
Note: anything above 16384 * 8191 elements will fail to build, due to an int overflowing to negative and exceeding byte[] array capacity.

The change to Texture2D.cs allows loading of compressed files up to 168384 x 16383 (assuming a 16:1 compression ratio like DXT3).

The test range from previous posts: [32..126] and [19968..20992]

Arial 12: 64584, 0.5542461 s, 128 x 88
[image]
The image is an Alpha8 to allow texture.SaveAsPng().

Stock: Arial 12: 200785, 5.2288785 s, 512 x 288
Stock with 1 Unknown: 18513, 0.2879005 s, 128 x 104
Same range: 16456, 0.5392444 s, 128 x 88
