RetroArch – Hardware video decoding – coming soon!

As you may well know, RetroArch has embedded video player support on platforms such as Windows and Linux. Just like VLC, Kodi, mpv and other video players out there, it accomplishes this by leveraging the ffmpeg project.

Up until now, all video decoding was performed entirely in software. This meant that the CPU had to do all the decoding instead of being able to delegate it to the GPU. On some systems, video playback could therefore be too slow if the CPU was underpowered. This happens to be the case on many ARM SoC devices out there, such as the Raspberry Pi and Odroids.

Now, we finally support hardware video decoding through ffmpeg’s own APIs! This should really help on systems where there is a CPU bottleneck and the GPU happens to support hardware decoding. Whether you can decode 1080p, 1440p or 4K in hardware depends entirely on your GPU’s capabilities, however.

In addition to hardware decoding, frame-based multithreading is now enabled for software-based video decoders, but its actual effectiveness hasn’t been proven yet.

The core falls back to software decoding if hardware decoding can’t be initialized.

The following backends have been tested:

  • DXVA2 [Windows]
  • D3D11VA [Windows] (used when running with the D3D11 driver)
  • VDPAU [Linux] (tested on an AMD system with a VDPAU-to-VAAPI layer)
  • VAAPI

We have performed the following tests so far:

  • Nvidia Titan XP/RTX 2080 Ti
      – Can hardware decode 1080p/1440p/4K content.

  • Intel UHD 630
      – Can hardware decode 1080p/1440p/4K content.

  • AMD Radeon R9 290x
      – This is a slightly older card from 2014. It only supports 1080p hardware video decoding at best. 1440p and 4K content therefore fall back to software video decoding. This means that if your CPU is not up to the task, you won’t be able to run this content at full speed.

As a stress test video, we picked a 4K video (3840×2160) with a total bitrate of 29561 kb/s (h264/AVC1, YUV420P), running at 30 frames per second. The CPU we’re using for this test is an Intel Core i7 7700k. With such a CPU, we don’t really have a CPU bottleneck and we are merely GPU bound when it comes to rendering the content.

With software decoding (the current default in RetroArch) – we averaged around 55fps with the 2080 Ti. CPU load averaged around 15%, with GPU load averaging around 11%.

With hardware decoding (the 2080 Ti defaults to DXVA2 for this test) – we averaged 77fps with the 2080 Ti. CPU load averaged around 11%, with GPU load averaging around 20%.

NOTE: The above is long since out of date – the same video now runs at 256fps with hardware decoding and 224fps with threaded video decoding using an automatically determined number of threads. Quite the improvement from 55fps, I’m sure you’ll agree.

What remains to be done

We will still need to gather tests for the following backends:

  • Cuda
  • Videotoolbox
  • DRM
  • OpenCL
  • Mediacodec

Future plans

In short, we hope this will really help out RetroArch’s video playback capabilities not only on desktop platforms such as Windows and Linux, but also on ARM SoCs, and specifically on our own Linux distribution, Lakka.

But hardware video decoding is not the be-all and end-all. There is certainly a lot of room for future speedups, and these are being investigated. That’s the subject of another blog post somewhere down the line.

For now, rest assured that big things are coming up for the next version of RetroArch!

RetroArch – Manual content scanning coming very soon!

One of the most requested features for RetroArch, manual content scanning, will finally be added!
How it will work compared to the current database-based scanning:

  • You select a content directory
  • You specify a ‘system name’ to be used as the playlist & database names, with the option of automatically using the content directory name, or a custom string (or any standard database name)
  • You optionally specify a default core for the resultant playlist (if selected, content will be filtered by supported extensions)
  • You can optionally specify a manual list of file extensions to include (so if you have a folder with bin/cue files, you can just enter ‘cue’, and skip the bins)
  • You can either scan content to a new playlist, or just add missing content to an existing playlist (i.e. so you can use it to pick up ‘leftovers’ if you did a normal scan and have items that didn’t match the database)

In essence, this is kinda like the Qt playlist builder, but it works everywhere, on all platforms.

It’s purely file-based – it doesn’t scan the databases – so users will get playlists containing all their ROMs. This has always been the biggest complaint about RetroArch – i.e. “I scanned my games and stuff is missing – what can I do?”
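
To make the file-based approach concrete, here is a rough Python sketch of the steps described above: walk a content directory, optionally filter by extension, and skip anything already in an existing playlist. It is only an illustration under our own assumptions. `scan_content` and `default_system_name` are hypothetical names, the "playlist" is just a list of paths, and none of this reflects RetroArch's actual implementation or playlist format.

```python
import os
import tempfile
from typing import Iterable, List, Optional


def default_system_name(content_dir: str) -> str:
    """Use the content directory's own name as the system name."""
    return os.path.basename(os.path.normpath(content_dir))


def scan_content(
    content_dir: str,
    extensions: Optional[Iterable[str]] = None,
    existing: Iterable[str] = (),
) -> List[str]:
    """Return files under content_dir that match extensions and are not in existing."""
    # None means "accept every file"; otherwise compare lowercase, dot-less extensions.
    wanted = {e.lower().lstrip(".") for e in extensions} if extensions else None
    known = set(existing)
    found = []
    for root, _dirs, files in os.walk(content_dir):
        for name in files:
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            if wanted is not None and ext not in wanted:
                continue
            path = os.path.join(root, name)
            if path not in known:
                found.append(path)
    return sorted(found)


# Small demo: a folder with a bin/cue pair, scanned for 'cue' only.
with tempfile.TemporaryDirectory() as d:
    for name in ("game.cue", "game.bin"):
        open(os.path.join(d, name), "w").close()
    demo_new = scan_content(d, extensions=["cue"])
    # A second pass against the resulting playlist finds nothing left to add.
    demo_missing = scan_content(d, extensions=["cue"], existing=demo_new)
```

Because no database lookup is involved, every file that passes the extension filter ends up in the result, which is exactly why nothing goes "missing" with this kind of scan.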

This and more coming up in RetroArch 1.8.2!