
I know the most basic way to horizontally stack two videos side by side is with the following command:

ffmpeg -i left.mp4 -i right.mp4 -filter_complex hstack sideBySide.mp4

How, if possible, can I use existing hardware to accelerate this process?

In other words, how can I achieve this horizontal stack using the h264_vaapi encoder, like this:

ffmpeg -i left.mp4 -i right.mp4 -filter_complex hstack -c:v h264_vaapi sideBySide.mp4

Thanks in advance.

1 Answer


When using the VA-API hstack filter, you have to put your video frames into VRAM. With the default software decoder, decoded frames live in the computer's RAM rather than the GPU's memory/VRAM, so the VA-API filter can't access them from the GPU, and it won't work.

There are two ways of doing this. The first, and possibly the "easiest", is to use the hwupload filter to upload the video streams into GPU memory, after which you can run the hstack_vaapi filter on them.

So your filter_complex might look a little something like this: -filter_complex "[0:v]format=nv12,hwupload[0v];[1:v]format=nv12,hwupload[1v];[0v][1v]hstack_vaapi", where [0:v] is the video stream of the first input (left.mp4) and [1:v] the video stream of the second input (right.mp4). Note that separate filter chains are joined with semicolons, while filters within a chain are joined with commas.

format=nv12 converts the video streams to the nv12 pixel format, one of the formats that GPUs tend to support in hardware, and [0v] and [1v] are labels for the outputs of those chains. You need hwupload because the decoded frames sit in system RAM, and they have to be put into the GPU's VRAM for the GPU to work on them (hence, upload).
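Putting it together, a full command might look like this (a sketch; the /dev/dri/renderD128 device path is an assumption, so adjust it for your system):

ffmpeg -vaapi_device /dev/dri/renderD128 -i left.mp4 -i right.mp4 -filter_complex "[0:v]format=nv12,hwupload[0v];[1:v]format=nv12,hwupload[1v];[0v][1v]hstack_vaapi" -c:v h264_vaapi sideBySide.mp4

The -vaapi_device option creates the VA-API device the filters run on; without some device initialization like this, hwupload has nowhere to upload to.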

The other way is to decode using GPU/hardware acceleration via VA-API (-hwaccel vaapi), which does more or less the same thing in the background. You can then tell ffmpeg to keep those decoded frames in GPU memory, in VA-API's internal format, with -hwaccel_output_format vaapi, and they can be fed straight into hstack_vaapi.
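For example (a sketch; note that -hwaccel and -hwaccel_output_format are per-input options, so they have to appear before each -i they apply to):

ffmpeg -hwaccel vaapi -hwaccel_output_format vaapi -i left.mp4 -hwaccel vaapi -hwaccel_output_format vaapi -i right.mp4 -filter_complex "[0:v][1:v]hstack_vaapi" -c:v h264_vaapi sideBySide.mp4

This assumes both inputs are in a codec your GPU can actually decode; if they aren't, fall back to the hwupload approach above.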

Something to note, though, is that the hardware-accelerated hstack filters are incredibly picky. You have to make sure that the video streams being fed into hstack_vaapi have matching properties (pixel format, colourspace, time base, and so on), or else it won't work, failing with an oft-inexplicable error like Impossible to convert between the formats supplied by the filter 'graph 0 input from stream 1:0' and the filter 'auto_scale_0'.

This happens because ffmpeg auto-inserts a scale filter to convert the video streams so that they match (verbose output will mention auto-inserting filter 'auto_scale_0' between the filter 'graph 0 input in stream 1:0' and the filter 'Parsed_hstack_vaapi_0'). This is all fine and dandy when you're using the software filter, since it can take the converted software frames (the ones in RAM) and use them, but the hardware/GPU-accelerated filter can't use those frames, so ffmpeg's auto-conversion fails.
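One way around this is to normalize the streams yourself in software before uploading, so the auto-inserted conversion never comes into play. A sketch, assuming you want both inputs at 1280x720 and 30 fps (both values are hypothetical, so pick ones that suit your sources):

ffmpeg -vaapi_device /dev/dri/renderD128 -i left.mp4 -i right.mp4 -filter_complex "[0:v]fps=30,scale=1280:720,format=nv12,hwupload[0v];[1:v]fps=30,scale=1280:720,format=nv12,hwupload[1v];[0v][1v]hstack_vaapi" -c:v h264_vaapi sideBySide.mp4

Because both chains end with the same frame rate, resolution, and pixel format, hstack_vaapi receives streams with identical properties and ffmpeg has nothing to auto-convert.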
