Audio Support and how it works

The first test version of audio support is finally ready, and this post describes how it works. Spoiler: it doesn’t work like it does in Avisynth.

In Python scripts, audio nodes are simply a separate type of their own. They aren’t stuck together with a video track like in Avisynth. This can be both an advantage and a disadvantage. For example, cutting audio and video at the same time by frame numbers currently requires a bit of user scripting. On the other hand, with the right helper functions it’ll be possible to manipulate multiple audio and video tracks as a group easily. Unlike video, audio also always has a constant format.
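As a rough idea of the user scripting involved, here is a minimal sketch of the arithmetic for turning a frame range into a sample range. The helper name frames_to_samples is made up for this post and is not part of VapourSynth. It assumes a constant frame rate and that audio and video both start at time zero.

```python
import math
from fractions import Fraction

def frames_to_samples(first_frame, last_frame, fps, sample_rate):
    """Map an inclusive video frame range to an inclusive audio sample range.

    Hypothetical helper, not a VapourSynth function. Assumes constant
    frame rate and that both tracks start at time zero.
    """
    fps = Fraction(fps)
    # Time (in seconds) at which the first frame starts.
    start = Fraction(first_frame) / fps
    # Time at which the frame following last_frame starts.
    end = Fraction(last_frame + 1) / fps
    # Exact rational arithmetic avoids drift with rates like 24000/1001.
    first_sample = math.floor(start * sample_rate)
    last_sample = math.floor(end * sample_rate) - 1
    return first_sample, last_sample

# One second of 24 fps video at 48 kHz.
print(frames_to_samples(0, 23, 24, 48000))
# A single NTSC-film frame at 48 kHz.
print(frames_to_samples(1, 1, Fraction(24000, 1001), 48000))
```

The resulting sample numbers would then go to the audio trim while the original frame numbers go to the video trim. Flooring both ends the same way keeps adjacent cuts contiguous, so no sample is dropped or duplicated between them.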

Feedback and downloads are on the Doom9 forum as usual for highly experimental things.

Audio filters

  • BestAudioSource – a new sample-accurate but somewhat slow FFmpeg-based source filter (usage: core.bas.Source("rule6.mp4"))
  • BlankAudio – a classic
  • AudioSplice and AudioTrim – with the expected Python overloads of course

Output support

  • VSPipe – outputs raw PCM audio; the -y switch adds Wave64 headers
  • AVFS – uses the audio node assigned to output slot 1
  • VFW – uses the audio node assigned to output slot 1; video must be assigned to slot 0

An output example

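import vapoursynth as vs
# Audio and video are created as two independent nodes
audio = vs.core.bas.Source("somefile.mp3", track=-1)
video = vs.core.std.BlankClip()
# Video goes to output slot 0, audio to output slot 1
video.set_output(0)
audio.set_output(1)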

API Notes and Changes

Currently the API isn’t completely stable, but only minor changes are expected at this point, so it’s reasonable to start porting plugins. The only mildly breaking change is that the clip type in function argument strings has been renamed to vnode, and the new anode type has been introduced. This may confuse existing software that tries to parse the argument strings.
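For software that parses argument strings, a tolerant parser could look something like the sketch below. It assumes the usual semicolon-separated name:type:flag layout with a [] suffix for arrays. The function name parse_args and the example strings are made up for illustration and are not copied from any real filter.

```python
# Node type names: the legacy "clip" and the renamed "vnode" both
# mean video, while "anode" is the new audio node type.
NODE_KINDS = {"clip": "video", "vnode": "video", "anode": "audio"}

def parse_args(signature):
    """Split a 'name:type[:flag...];' argument string into dicts.

    Hypothetical helper for illustration only.
    """
    params = []
    for entry in signature.split(";"):
        if not entry:
            continue  # skip the empty piece after the trailing ';'
        name, type_, *flags = entry.split(":")
        is_array = type_.endswith("[]")
        base = type_[:-2] if is_array else type_
        params.append({
            "name": name,
            "type": base,
            "array": is_array,
            "optional": "opt" in flags,
            # None for plain data types such as int or float
            "node_kind": NODE_KINDS.get(base),
        })
    return params

# Illustrative strings only.
for p in parse_args("clip:vnode;first:int:opt;last:int:opt;"):
    print(p)
print(parse_args("clips:anode[];"))
```

Mapping both clip and vnode to the same kind lets one code path handle old and new argument strings, while anode stays clearly separate.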

The best way to think of audio nodes is as a type completely separate from video nodes that only happen to share some functions to manipulate them. For example you can’t mix audio and video nodes in the same VSMap key or in any other context.