The Wayback Machine - https://web.archive.org/web/20200906232228/https://github.com/mltframework/mlt/issues/499
sliced_h_pix_fmt_conv_proc using needs performance improvement? #499

Closed
faridosc opened this issue Nov 10, 2019 · 1 comment

@faridosc faridosc commented Nov 10, 2019

Hotspot is reporting high CPU usage at line 1322 of producer_avformat.c:

sws_scale( sws, in, in_stride, 0, h, out, out_stride );

The caller is mlt_slices_worker and the callee is sws_scale. This seems to happen in both Kdenlive and Shotcut.

[image attached to the original issue]

@ddennedy ddennedy (Member) commented Nov 10, 2019

Most source video is 8-bit YUV 4:2:0, but that is not a good format for image processing because its chroma is sub-sampled. Almost no routines can process it directly (not frei0r or Qt, and only sometimes libavfilter), which is why the pixel format conversion is needed. One way to avoid the conversion is to do more processing in 8-bit YUV 4:2:0, but the new standard in video is 10-bit HDR. People can strive to port routines to be more pixel-format independent, which is very difficult to do while also optimizing performance.

If you want to improve the performance of libswscale, then ask FFmpeg, and good luck, as it is already quite optimized. If you want to use the GPU and OpenColorIO for pixel format conversion, well, that is an enhancement request. I hope to spend a fair amount of time in 2020 improving image processing performance. OpenColorIO might be a part of that, depending on whether the rest of the pipeline runs on the GPU, so as to reduce the costly transfer of heavy image data between CPU and GPU memory.

When profiling, it is often more useful to look for surprises than at the areas you already know are costly. For example, should sws_init_context() be as high as it is?
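To make the sub-sampled chroma point concrete, here is a minimal sketch in plain C. It is not MLT or libswscale code; both function names are hypothetical and the chroma is simply repeated (nearest neighbour), whereas a real converter such as libswscale filters and handles many more formats. It assumes tightly packed planes with no row padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes in one 8-bit planar YUV 4:2:0 frame: a full-resolution luma plane
 * plus two chroma planes, each halved (rounding up) in both dimensions.
 * Assumes no stride padding. */
size_t yuv420p_frame_bytes(int width, int height)
{
    assert(width > 0 && height > 0);
    size_t cw = (size_t)(width + 1) / 2;
    size_t ch = (size_t)(height + 1) / 2;
    return (size_t)width * (size_t)height + 2 * cw * ch;
}

/* Hypothetical helper: expand planar 4:2:0 into packed 4:4:4 (Y, U, V for
 * every pixel) by repeating each chroma sample over the 2x2 luma block it
 * belongs to. `out` must hold width * height * 3 bytes. */
void yuv420p_to_packed444(const uint8_t *y, const uint8_t *u, const uint8_t *v,
                          int width, int height, uint8_t *out)
{
    assert(width > 0 && height > 0);
    int cw = (width + 1) / 2; /* chroma plane width */
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            /* One chroma sample is shared by a 2x2 group of luma samples. */
            size_t c = (size_t)(row / 2) * (size_t)cw + (size_t)(col / 2);
            size_t p = (size_t)row * (size_t)width + (size_t)col;
            out[p * 3 + 0] = y[p];
            out[p * 3 + 1] = u[c];
            out[p * 3 + 2] = v[c];
        }
    }
}
```

Under these assumptions a 1920x1080 frame occupies 3,110,400 bytes in 8-bit 4:2:0 and 6,220,800 bytes once expanded to packed 4:4:4, so a per-pixel routine that needs full-resolution chroma pays for a pass over every pixel before it does any work of its own.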

@ddennedy ddennedy closed this Jan 30, 2020