R14 – Improved Packaging

It’s time for another release. The main highlights are bug fixes and the possibility to set the max cache size. The default max is 1GB everywhere to prevent running out of address space in 32bit applications. I’ve also decided to start bundling some potentially useful filters:

  • TemporalSoften – C code restored from inline asm
  • Histogram – with support for higher bitdepths in the default mode
  • VIVTC – think TIVTC but with fewer features and portable
  • EEDI3 – improved to work on all 8bit formats

Since this project’s success depends a lot on good plugins being available I’ve also included all files needed to start developing in the installer. Just check out the SDK directory and the example invert filter to get started.

It’s now time for me to start spending more time porting/rewriting some useful Avisynth things…

Open Binary – Introducing a Practical Alternative to Open Source

I’ve been thinking about not only announcing releases and features, but also discussing (read: pointing and laughing at) some very common annoyances. I hope to one day be seen as a meaner alternative to The Daily WTF but with fewer free mugs and more open source. Anyway, on to this post’s subject…

Open Binary – source code so obfuscated, “optimized” and arcane that despite an open source license nobody can edit or benefit from reading it. Your only hope is to compile it into a binary and hope it works. There are plenty of examples of this in the video world, such as mplayer, most popular Avisynth filters and to be honest almost every single piece of code written in the field of video processing.

So how do you produce an Open Binary? Well, in my opinion you have to put effort into multiple levels to succeed. For example, one important step is to OPTIMIZE! And by that I mean bitwise shifts! No compiler can ever figure out that a/2 can be compiled to a right shift, so you have to help it. It also makes the code faster. Another important detail to know about CPUs, even the most modern ones, is that they are slow readers. Armed with this knowledge, make all variable names short so there’s less to read for the poor CPU. For text parsing we can actually do one better since modern CPUs are good at numbers: simply use the ASCII code instead of the letter in any text operation.
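Sarcasm aside, here is a minimal sketch of why the manual shift buys nothing. The function names are made up for illustration. For unsigned values the two forms give the same result, and mainstream optimizing compilers are free to emit a shift for the plain division on their own. For signed values the forms are not even interchangeable, because division truncates toward zero while a right shift of a negative number is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Plain division: says what it means; the compiler may emit a shift. */
uint32_t half_div(uint32_t a)
{
    return a / 2;
}

/* Hand-"optimized" form: same result as half_div for every unsigned input. */
uint32_t half_shift(uint32_t a)
{
    return a >> 1;
}

/* Signed division truncates toward zero (C99 and later), so -3 / 2 is -1.
   Right-shifting a negative value is implementation-defined, which is why
   a blind a >> 1 is not a safe replacement here. */
int32_t half_signed(int32_t a)
{
    return a / 2;
}
```

The readable version is therefore the safer default, and any shift is best left to the compiler.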

An example of proper text parsing taken from TIVTC:

if (*linep != 0)
{
	qt = -1;
	d2vmarked = false;
	*linep++;
	q = *linep;
	if (q == 112) q = 0;
	else if (q == 99) q = 1;
	else if (q == 110) q = 2;
	else if (q == 98) q = 3;
	else if (q == 117) q = 4;
	else if (q == 108) q = 5;
	else if (q == 104) q = 6;
	else
	{
		fclose(f);
		f = NULL;
		env->ThrowError("TFM:  input file error (invalid match specifier)!");
	}
	*linep++;
	*linep++;
...continued for several hundred lines
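For contrast, here is a rough sketch of how the same mapping could be written with character literals. The numbers in the excerpt are the ASCII codes for 'p', 'c', 'n', 'b', 'u', 'l' and 'h', mapped to 0 through 6 in that order. The helper name match_index and the -1 error convention are assumptions for illustration, not anything taken from TIVTC.

```c
#include <stddef.h>

/* Hypothetical helper, not part of TIVTC: returns the index (0..6) of a
   match specifier character, or -1 if the character is not a valid one. */
int match_index(char c)
{
    /* Position in this string is the resulting index:
       'p' -> 0, 'c' -> 1, 'n' -> 2, 'b' -> 3, 'u' -> 4, 'l' -> 5, 'h' -> 6 */
    static const char specifiers[] = "pcnbulh";

    for (size_t i = 0; specifiers[i] != '\0'; i++) {
        if (specifiers[i] == c)
            return (int)i;
    }
    return -1; /* includes the case where c is the terminating NUL */
}
```

With a helper like this, the if/else chain shrinks to a single call plus one check for a negative return value before the file is closed and the error is thrown.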

There are several other techniques you can use too, for example writing pure assembler or, even better, inline assembler, which ties all your code to one platform and one compiler while also being near impossible for anyone to modify or understand! You can also play the shell game with pointers and global variables: have one function add an offset to a pointer and pass it to the next, which subtracts it again. The secret is to put spaghetti in your sauce, so to speak.

So who should use this approach to leverage the open source benefits? Big evil companies of course! Sure, you’ll have to reveal the source code but no one can ever use it for anything anyway. This is the end of part one of my “Business Strategies for the Modern Monopolist” series. I’ll be posting part two shortly.

Here’s a final example of successful use of assembler only to make a true Open Binary, again from TIVTC as I’ve spent far too much time staring at it recently. The actual post ends here so you don’t have to scroll down to look for more.

__asm
{
	mov y, 2
yloop:
	mov ecx, y0a
	mov edx, y1a
	cmp ecx, edx
	je xloop_pre
	mov eax, y
	cmp eax, ecx
	jl xloop_pre
	cmp eax, edx
	jle end_yloop
xloop_pre:
	mov esi, incl
	mov ebx, startx
	mov edi, mapp
	mov edx, mapn
	mov ecx, stopx
xloop:
	movzx eax, BYTE PTR [edi+ebx]
	shl eax, 3
	add al, BYTE PTR [edx+ebx]
	jnz b1
	add ebx, esi
	cmp ebx, ecx
	jl xloop
	jmp end_yloop
b1:
	mov edx, curf
	mov edi, curpf
	movzx ecx, BYTE PTR[edx+ebx]
	movzx esi, BYTE PTR[edi+ebx]
	shl ecx, 2
	mov edx, curnf
	add ecx, esi
	mov edi, prvpf
	movzx esi, BYTE PTR[edx+ebx]
	movzx edx, BYTE PTR[edi+ebx]
	add ecx, esi
	mov edi, prvnf
	movzx esi, BYTE PTR[edi+ebx]
	add edx, esi
	mov edi, edx
	add edx, edx
	sub edi, ecx
	add edx, edi
	jge b3
	neg edx
b3:
	cmp edx, 23
	jle p3
	test eax, 9
	jz p1
	add accumPc, edx
p1:
	cmp edx, 42
	jle p3
	test eax, 18
	jz p2
	add accumPm, edx
p2:
	test eax, 36
	jz p3
	add accumPml, edx
p3:
	mov edi, nxtpf
	mov esi, nxtnf
	movzx edx, BYTE PTR[edi+ebx]
	movzx edi, BYTE PTR[esi+ebx]
	add edx, edi
	mov esi, edx
	add edx, edx
	sub esi, ecx
	add edx, esi
	jge b2
	neg edx
b2:
	cmp edx, 23
	jle p6
	test eax, 9
	jz p4
	add accumNc, edx
p4:
	cmp edx, 42
	jle p6
	test eax, 18
	jz p5
	add accumNm, edx
p5:
	test eax, 36
	jz p6
	add accumNml, edx
p6:
	mov esi, incl
	mov ecx, stopx
	mov edi, mapp
	add ebx, esi
	mov edx, mapn
	cmp ebx, ecx
	jl xloop
end_yloop:
	mov esi, Height
	mov eax, prvf_pitch
	mov ebx, curf_pitch
	mov ecx, nxtf_pitch
	mov edi, map_pitch
	sub esi, 2
	add y, 2
	add mapp, edi
	add prvpf, eax
	add curpf, ebx
	add prvnf, eax
	add curf, ebx
	add nxtpf, ecx
	add curnf, ebx
	add nxtnf, ecx
	add mapn, edi
	cmp y, esi
	jl yloop
}

R13 – Conditional Filtering and Memory Optimizations

It’s time for another release since it’s been over a week. The new things are a redone system for accessing frame properties (this time less awkward and arcane), the possibility to write a full filter in Python only (if you’re clever enough to figure out how to abuse ModifyFrame), and memory management, which is now enabled. This means that VapourSynth will aggressively try to keep the amount of used framebuffer memory below 1GB to avoid running out of address space.

I also added all useful internal Avisynth filters turned into a standalone plugin to the downloads. It should make the transition to VapourSynth easier while waiting for your favorite internal filter to be ported. If your favorite filter happens to be a simple one I suggest you give porting it a try yourself.

As I feel the core is almost complete now, I will focus on creating more automated regression tests, documenting everything and tweaking the automatic cache size adjustment for the next release. Phase one of the project is nearing its end. After that I will focus on porting popular filters properly, as in making them work on Windows, Linux and OSX in both 32 and 64bit mode. As some of you may have noticed I’ve already ported EEDI3 and I’m currently working on TIVTC, a difficult project (but not for the right reasons) which I’ll write another post about.

R12 – VapourSynth Takes a Step in The Enterprise Direction

This new version has something for everyone. It has a round of bug fixes, one of them to the threading, which means that it should be able to completely max out a 4 core CPU when running mdegrain2. The other feature (requested through donations) is support for v210 output, the most used 10 bit format in professional video editing. To enable this output in VSFS and VFW add this to your script:

last = yourvideo
enable_v210=True

If you do not add it, the output will default to P210. The documentation will be updated later today with more detailed installation instructions for VSFS. For those of you who can’t wait, the install method is very similar to AVFS (just look for vsfs.dll).

R11 – VFW returns and Python 3.3

VFW has been debugged. Greatly. I’ve also added high bitdepth output support. However v210 shall be left out (I hate packed formats) unless someone contributes a patch or requests it in a donation message. VFW also has some behavioral changes such as returning a clip with colorful bars on error. This version also requires Python 3.3 as this cuts down on the number of copies of Visual Studio I have to keep around.

Note that it is possible and quite easy to recompile the Python and VFW modules for other versions. Maybe someone will contribute a Python 2.7 build one day.