R55 – Audio support and improved performance

After years of development the next major update to VapourSynth is ready. Finally Avisynth(+) is truly obsolete and the remaining 5 audio users can safely migrate.

Audio

Despite being a fairly big feature there isn’t that much to say about audio. In addition to video nodes there are now audio nodes. They mostly work like video nodes. Simple, right? So instead I’ll list the biggest points.

  • Audio output with video in VFW and AVFS can be done with audionode.set_output(1). Note that these always use index 0 for video and index 1 for audio.
  • Frame properties are not really a thing due to audio frames in VapourSynth being an arbitrary lump of samples. Generally you shouldn’t use them.
  • Audio can’t have variable sample rate or format. Almost nobody uses variable format in video and it keeps the complexity down.

Performance

A lot of work has gone into improving performance and scripts like QTGMC can run up to 10% faster on modern CPUs. The memory usage has also been greatly reduced and scripts like QTGMC (once again) will usually no longer reach the upper limit and warn about memory usage. Note that as a general rule, the more of the plugins used by a script have been updated to the new API version, the lower its memory usage will be.

Compatibility and breaking changes

As usual with all big updates there are breaking changes to clean up past mistakes and remove unused features. While the core itself is more or less fully compatible with the old API a few features were removed which can cause issues with less than 1% of all plugins.

  • YCoCg – never became popular and everything else treats it as degenerate YUV so now we do too
  • COMPAT formats – existed to make loading Avisynth plugins possible but in the 9 years or so since the first release Avisynth+ has appeared and just about all Avisynth filters support planar formats so it’s pointless and adds complexity to core functions
  • Many script functions were marked as obsolete or removed. The most common construct is probably vs.get_core() that no longer works. Use vs.core instead you lazy script writers! Many other functions like plugin/function enumeration were changed but the old functions simply have deprecation warnings. For now.
  • Many plugins are no longer bundled/in the source tree. If your script needs subtext or histogram you have to install them yourself separately. The easiest way to do this is with vsrepo on Windows.
  • The alpha handling has changed and attaching the alpha as an additional frame is now the preferred method.

What this means in practice is that 99% of all plugins work properly. The only known ones that don’t work well are FFMS2 and IMWRI when outputting alpha. Update your plugins and there shouldn’t be any surprises.

For scripts it’s a bit more complicated. For example all scripts by pedantic internet users (most public scripts that is) had YCoCg support which would error out in the new version. Fortunately just about all scripts have been updated to work on both old and new VapourSynth versions. Update your scripts and there shouldn’t be any surprises here either.
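In most scripts simply replacing vs.get_core() with vs.core is all that's needed. For a script that has to run on both sides of the change, a minimal sketch of a compatibility helper could look like the following. The name get_core_compat is made up for illustration, and the stand-in objects only mimic the two module shapes so the snippet runs without VapourSynth installed.

```python
from types import SimpleNamespace


def get_core_compat(vs_module):
    """Return the core object from either a new or an old style module.

    Hypothetical helper: prefers the vs.core attribute and only falls back
    to calling vs.get_core() when that attribute is missing.
    """
    core = getattr(vs_module, "core", None)
    if core is not None:
        return core
    return vs_module.get_core()


# Stand-ins for illustration only. A real script would do
# "import vapoursynth as vs" and call get_core_compat(vs).
new_style = SimpleNamespace(core="core-from-attribute")
old_style = SimpleNamespace(get_core=lambda: "core-from-function")

print(get_core_compat(new_style))
print(get_core_compat(old_style))
```

The attribute is checked first so that the removed function is never touched on a new version.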

Where things seem to have broken a lot, however, is in applications that use VSScript, like VSEdit. Apparently all of them used the COMPATBGR32 format because it’s convenient for output. UPDATE NOW!

R55 – Start of the API3 builds

Starting from today there will be separate builds of the new audio branch (master) and the old (api3) branch provided for those who need full compatibility with old scripts.

The biggest fix in this release is that the handling of pow in expr has been fixed and no longer produces unexpected output. There’s also a new vsrepo included that once again works. That’s about it. It’s the life of a maintenance branch where only super serious things will be fixed.

R54 – Mask clips are special

Turns out floating point masks are hard. Or at least not completely intuitive so you have to make up clear rules for them. The problem solved in this version is that some filters output masks either in the 0 to 1 or -0.5 to 0.5 range for UV planes. And in turn filters that consume masks expected one or the other. Mismatched assumptions would obviously yield garbage output or a lot of pointless offsetting using Expr.

To solve this the concept of “mask clips” has been introduced. No matter the color space they are always supposed to have a 0-1 range in floating point formats and all filters that output and consume this style of masks have been documented. MaskedMerge is probably the best example of this. Since the Invert and Binarize filters conceptually work on images and not masks (despite often being used for this) new versions called InvertMask and BinarizeMask have been introduced that will have the expected behavior in all cases.
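To make the two conventions concrete, here is a small sketch of the arithmetic involved, written as plain Python on single sample values. The function names are invented for this example and this is not how any filter is actually implemented. The expression strings show what the same offsetting would look like in the RPN syntax that Expr takes.

```python
def chroma_to_mask_range(value):
    """Shift a float UV-style value from -0.5..0.5 into the 0..1 mask range."""
    return value + 0.5


def mask_to_chroma_range(value):
    """Shift a 0..1 mask value back into the -0.5..0.5 UV-style range."""
    return value - 0.5


def invert_mask_value(value):
    """Invert a value under the 0..1 mask convention."""
    return 1.0 - value


# The same offsets written as RPN expression strings, the kind of
# "pointless offsetting" a mismatched filter pair used to require.
TO_MASK_EXPR = "x 0.5 +"
TO_CHROMA_EXPR = "x 0.5 -"

for v in (-0.5, 0.0, 0.5):
    print(v, "->", chroma_to_mask_range(v))
```

With mask clips every plane follows the 0-1 rule, so none of this conversion should be needed between filters that follow the documented convention.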

R53 – Once more there are bug fixes!

If you’re wondering why it’s taken so long since the previous release it’s partly due to R52 having surprisingly few bugs. You could say that I’m too good a coder (on occasion). The new R53 release mainly exists to add Python 3.9 support on Windows and apply a few contributed bugfixes.

R50 – Low risk release

R50 is a pure regression fix release and has nothing new of interest. Install it now!

R51 will probably have lots of changes merged and maybe there’ll be a new audio test build soon too.

R49 – Just another release

There’s a new release that fixes all known R48 regressions and bugs. Not much else to say about it really. One notable thing is that I got myself a Raspberry Pi recently so now I can actually test compilation on ARM easily and it won’t be broken all the time. I guess that’s interesting for some people.

Audio Support and how it works

Finally the first test version of audio support is ready and this post will describe how it works. Spoiler: It doesn’t work like in Avisynth.

In Python scripts audio nodes are just another unique type. They aren’t stuck together with a video track like in Avisynth. This can be both an advantage and a disadvantage, for example cutting audio and video at the same time by frame numbers will currently require a bit of user scripting. On the other hand with the correct helper functions it’ll be possible to manipulate multiple audio and video tracks as a group easily. Audio also always has a constant format unlike video.
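As a rough idea of the "bit of user scripting" involved, the sketch below maps a range of video frames to the matching range of audio samples. The helper name and its parameters are made up for this example, and the rounding choice (floor at both ends) is an assumption rather than anything VapourSynth prescribes. The resulting sample numbers are what you would then feed to AudioTrim.

```python
from fractions import Fraction
from math import floor


def frame_range_to_samples(first_frame, last_frame, fps_num, fps_den, sample_rate):
    """Map an inclusive video frame range to an inclusive audio sample range.

    Hypothetical helper: uses exact rational arithmetic so that rates such
    as 24000/1001 do not accumulate floating point error.
    """
    samples_per_frame = Fraction(sample_rate * fps_den, fps_num)
    first_sample = floor(first_frame * samples_per_frame)
    # The last sample is the one just before the next frame begins.
    last_sample = floor((last_frame + 1) * samples_per_frame) - 1
    return first_sample, last_sample


# Frames 100..199 of 23.976 fps video against 48 kHz audio.
print(frame_range_to_samples(100, 199, 24000, 1001, 48000))
```

Using Fraction keeps the cut points consistent no matter how far into the clip they are, which matters when several cuts are spliced back together.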

Feedback and downloads are on the Doom9 forum as usual for highly experimental things.

Audio filters

  • BestAudioSource – a new sample accurate but somewhat slow FFmpeg based source filter (usage: core.bas.Source("rule6.mp4"))
  • BlankAudio – a classic
  • AudioSplice and AudioTrim – with the expected Python overloads of course

Output support

  • VSPipe – outputs raw PCM audio; the -y switch adds wave64 headers
  • AVFS – uses the audio node assigned to output slot 1
  • VFW – uses the audio node assigned to output slot 1 and video must be assigned to slot 0

An output example

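import vapoursynth as vs
# Load the audio with BestAudioSource
audio = vs.core.bas.Source("somefile.mp3", track=-1)
# A blank video clip to go along with it
video = vs.core.std.BlankClip()
# VFW and AVFS expect video in output slot 0 and audio in slot 1
video.set_output(0)
audio.set_output(1)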

API Notes and Changes

Currently the API isn’t completely stable but only minor changes are expected at this point, so it’s safe to start porting plugins. The only mildly breaking change is that the clip type in function argument strings has been renamed to vnode, and obviously the new anode type has been introduced, which may confuse existing software that tries to parse the argument strings.
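For software that parses argument strings, a tolerant parser might look something like the sketch below. It assumes the usual "name:type:flags;" layout of argument strings, and the function name, the returned dictionary layout and the example strings are all invented for illustration. The point is only that both the old clip and the new vnode spelling get treated as a video node, with anode handled as its own kind.

```python
VIDEO_NODE_TYPES = {"clip", "vnode"}  # old and new spelling of a video node
AUDIO_NODE_TYPES = {"anode"}          # new audio node type


def parse_argument_string(arg_string):
    """Parse an argument string into a list of dictionaries.

    Assumes entries are separated by ';' and fields within an entry by ':',
    with a trailing '[]' on the type marking an array argument.
    """
    args = []
    for entry in arg_string.split(";"):
        if not entry:
            continue  # skip the empty piece after the final ';'
        name, type_name, *flags = entry.split(":")
        is_array = type_name.endswith("[]")
        base = type_name[:-2] if is_array else type_name
        if base in VIDEO_NODE_TYPES:
            kind = "video"
        elif base in AUDIO_NODE_TYPES:
            kind = "audio"
        else:
            kind = base
        args.append({
            "name": name,
            "kind": kind,
            "array": is_array,
            "optional": "opt" in flags,
        })
    return args


# Made-up signatures, old and new style, that should parse identically.
print(parse_argument_string("clip:clip;first:int:opt;"))
print(parse_argument_string("clip:vnode;first:int:opt;"))
```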

The best way to think of audio nodes is as a type completely separate from video nodes that only happen to share some functions to manipulate them. For example you can’t mix audio and video nodes in the same VSMap key or in any other context.

R48 – AVX2 Intrinsics for Everyone!

After several weeks of testing R48 is finally done. As you may suspect from the title the biggest change this time is optimizations. Most internal functions now have proper AVX2 optimizations, and the Expr filter was greatly improved and can now rewrite and optimize expressions much better. On top of that a lot of bugs were fixed and the installer got a few more options. Users of R47 should definitely upgrade.