Tuesday 16 November 2010

How to set up git hosting on COMPSOC

If you already have a local git repository that you want to host on compsoc, then you can skip this first step.

Initialise a new git repository in the directory of the project you want to host (this will be the copy you have at home, for example)

[simon@simon ~]$ cd ~/project/foo/
[simon@simon foo (master #)]$ git init
Initialized empty Git repository in /home/simon/project/foo/.git/

Log in to compsoc, and create a new directory where you want your repository stored and initialise a bare git repository there.

[simon@forgetful ~]$ mkdir foo.git
[simon@forgetful ~]$ cd foo.git/
[simon@forgetful foo.git]$ git --bare init
Initialized empty Git repository in /data/home/simon/foo.git/
[simon@forgetful foo.git (BARE:master)]$

On your home machine you will need to add compsoc as a remote, here I've referred to compsoc as "origin".

[simon@simon foo (master #)]$ git remote add origin simon@forgetful.compsoc.man.ac.uk:foo.git

Do some editing, commit some changes locally (i.e. at home).

[simon@simon foo (master #)]$ vim hello_world.c
[simon@simon foo (master #)]$ git add hello_world.c
[simon@simon foo (master #)]$ git commit -m "Added test file"
[master (root-commit) 76e25d7] Added test file
 1 files changed, 7 insertions(+), 0 deletions(-)
 create mode 100644 hello_world.c

Use git push to push your local changes to compsoc.

[simon@simon foo (master)]$ git push origin master
Counting objects: 3, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 302 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To simon@forgetful.compsoc.man.ac.uk:foo.git
 * [new branch]      master -> master

If you want to put your git repository on the internet, either copy or symlink the foo.git directory to be in your public_html directory and run git update-server-info.

[simon@forgetful public_html]$ cd ~/public_html/
[simon@forgetful public_html]$ ln -s ../foo.git/
[simon@forgetful ~]$ cd foo.git/
[simon@forgetful foo.git (BARE:master)]$ git --bare update-server-info
[simon@forgetful foo.git (BARE:master)]$ chmod a+x hooks/post-update

Now anyone can come along and clone your git repository.

[simon@simon ~]$ cd /tmp
[simon@simon tmp]$ git clone http://compsoc.man.ac.uk/~simon/foo.git
Cloning into foo...
[simon@simon tmp]$ cd foo/
[simon@simon foo (master)]$ cat hello_world.c
#include 

int main() {
    printf("hello, world!\n");
    return 0;
}

Thanks for reading!

Saturday 25 September 2010

CABAC

Context-adaptive binary arithmetic coding is (very roughly) a lossless compression method that relies on knowlege of the current state of decoding to make probabalistic decisions about what value for a certain element is about to come next. For example, if the previous ten macroblocks were all interlaced then there's a good chance the next one will be too, and it will cost more bits to signal that the next macroblock isn't interlaced.

This mostly already worked for mbaff due to the way the code was designed. The most major part was understanding that the decoder might not have recieved field decoding flag by the time the code to determine whether a macroblock is skipped runs. The skip code relied on neighbour calculation using the macroblock's field/frame flag, but the decoder didn't yet know this. Thus x264 needed to store what the decoder thinks the field decoding flag is and update it only when it's actually written to the bitstream.

Motion vectors

This I don't understand. Michael Niedermayer (the writer of ffmpeg's h264 decoder) sums up the situation quite succinctly, quoting from libavcodec/h264.h:

    /* Wow, what a mess, why didn't they simplify the interlacing & intra
     * stuff, I can't imagine that these complex rules are worth it. */

First of all, motion vectors from fields to other fields and frames to other frames aren't changed. However, when predicting from a frame to a field, the fact that a field has half the vertical resolution of the frame means that the vertical component of the motion vector will need to be halved to reference the same spatial areas. Similarly the motion vector from a field to a frame must be doubled. This is easily achieved with a few bit shifts (good old integer arithmetic!) as so:

        if( h->sh.b_mbaff )
        {
#define MAP_MVS\
                MAP_F2F(mv, ref, x264_scan8[0] - 1 - 1*8, h->mb.i_mb_topleft_xy)\
                MAP_F2F(mv, ref, x264_scan8[0] + 0 - 1*8, top)\
                MAP_F2F(mv, ref, x264_scan8[0] + 1 - 1*8, top)\
                MAP_F2F(mv, ref, x264_scan8[0] + 2 - 1*8, top)\
                MAP_F2F(mv, ref, x264_scan8[0] + 3 - 1*8, top)\
                MAP_F2F(mv, ref, x264_scan8[0] + 4 - 1*8, h->mb.i_mb_topright_xy)\
                MAP_F2F(mv, ref, x264_scan8[0] - 1 + 0*8, left[0])\
                MAP_F2F(mv, ref, x264_scan8[0] - 1 + 1*8, left[0])\
                MAP_F2F(mv, ref, x264_scan8[0] - 1 + 2*8, left[1])\
                MAP_F2F(mv, ref, x264_scan8[0] - 1 + 3*8, left[1])\
                MAP_F2F(topright_mv, topright_ref, 0, left[0])\
                MAP_F2F(topright_mv, topright_ref, 1, left[0])\
                MAP_F2F(topright_mv, topright_ref, 2, left[1])

            if( h->mb.b_interlaced )
            {
#define MAP_F2F(varmv, varref, index, macroblock)\
                if( h->mb.cache.varref[l][index] >= 0 && !h->mb.field[macroblock] )\
                {\
                    h->mb.cache.varref[l][index] <<= 1;\
                    h->mb.cache.varmv[l][index][1] /= 2;\
                    h->mb.cache.mvd[l][index][1] >>= 1;\
                }
                MAP_MVS
#undef MAP_F2F
            }
            else
            {
#define MAP_F2F(varmv, varref, index, macroblock)\
                if( h->mb.cache.varref[l][index] >= 0 && h->mb.field[macroblock] )\
                {\
                    h->mb.cache.varref[l][index] >>= 1;\
                    h->mb.cache.varmv[l][index][1] <<= 1;\
                    h->mb.cache.mvd[l][index][1] <<= 1;\
                }
                MAP_MVS
#undef MAP_F2F
            }

Here you can see I use a macro to simplify the repeated code, this is actually inspired by ffmpeg so thanks for the good ideas guys!

Diagonal motion vectors are bizarre to say the least. Top left motion vectors, in some situations, are required to come from the middle of the macroblock as opposed to the usual bottom right location. This was the source of much frustration for a while. Top right motion vectors, for certain partitions, and when the top right is unavailable, are taken from strange places in the top left macroblock.

Intra coding

When decoding video it is common that the next macroblock to be decoded will look somewhat like the surrounding macroblocks. H.264 decodes macroblocks in so called ``scan'' order, i.e. do the first row, left to right, then move on to the next row and repeat. Imagine a scene with a large section of sky covering the top half of the frame. A good guess to the content of the next macroblock might be the shade of blue that is directly surrounding it (from the already decoded macroblocks above and to the left). So we guess that the macroblock we're encoding is blue, then code the difference between that and the actual colour values. If our macroblock is close to the prediction, which it usually is, then we don't need to code much at all. This is for the most part how intra coding works. The name comes from the fact that no data outside of the current picture is accessed, as opposed to inter coding that looks at the previous and future frames to help compression.

MBAFF changes which neighbours need to be referenced when predicting the next macroblock. This means that it's not usually just the top and left macroblocks that the pixels for intra prediction come from. The rules for which macroblock to reference are fairly convoluted, but do make sense from a compression point of view. I have several pages of diagrams trying to make sense of it!

Monday 26 April 2010

Diary of an x264 (mbaff) developer

I've just found out officially that I've been accepted into the Summer of Code program! This means that following university exams I'll be working developing a new interlaced encoding method for my favourite video encoder: x264.

Interlacing is the process of taking two consecutive pictures (called fields) of video that are often captured slightly after one another and merging them together into a single frame. Many professional video cameras do this in their usual operation. The up-side of this is you can have the high frame rate that comes with presenting two pictures in the space of one, but the downside is that each field has half the vertical resolution of the equivalent frame.

Macroblock adaptive frame-field interlacing, or mbaff for short, is a method of encoding interlaced video that makes a decision on whether or not each individual macroblock should be interlaced. The traditional method, and that currently implemented in x264, is to keep each frame fully interlaced. However, it is preferable to encode a macroblock with low movement in it as progressive, as this is more efficient. MBAFF interlacing is one of the few major features that most H.264 video encoders currently support that's missing in x264.

I hope to see significant improvements in the efficiency of interlaced encoding when this project is completed. Until then, I'll post regular updates of my progress.

Monday 1 March 2010

Google Code Jam 2009 1C: Bribe the Prisoners

Problem
The problem statement describes a prison with adjacent cells and a set of prisoners to be released. The problem is essentially to find the ordering of prisoners to be released such that the total number of occupied adjacent cells is minimal (and thus less coins need to be handed out).

Solution
Let tex:\{c_1,c_2,\dots,c_Q\} be the set of cells in which a prisoner to be released tex:p_i resides. Assume tex:p_i is the optimal prisoner to be released first. The first thing we need to do is to give every prisoner left a gold coin as they all find out about tex:p_i leaving. We are now left with two sub-arrays tex:[1,\dots,c_i-1] and tex:[c_i+1,\dots,P] where we need to calculate the next best prisoner to be released in each array. For the sub-array tex:[1,\dots,c_i-1] we can iterate over the prisoners to be released residing in this range and find for each one the minimum number of coins if they were to leave and split the sub-array (making two more similar problems).

Note: The minimum number of coins for a sub-array [a,b] can be memoised as the solutions to the sub-problems overlap.

Here is the solution code provided by Google with a couple of additional comments:

int p[200]; // prisoners to be released.
map<pair<int, int>, int> dp;

// Finds the minimum amount of gold needed, 
// if we only consider the cells from a to b, inclusive.
int Solve(int a, int b) {
  // First, look up the cache to see if the 
  // result is computed before.
  pair<int, int> pr(a, b);
  if(mp.find(pr) != mp.end()) return mp[pr];

  // Start the computation.
  int r = 0;
  // For each prisoner to be released in the range a<=p_i<=b
  for(int i=0; i<Q; i++) {
    if(p[i] >= a && p[i] <= b) {
      // b-a prisoners find out about p[i] leaving and we are left with two
      // sub-arrays that need to be solved in a similar fashion.
      int tmp = (b-a) + Solve(a, p[i]-1) + Solve(p[i]+1, b);
      // Is the number of coins for the release of p[i] minimal?
      if (!r || tmp<r) r=tmp;
    }
  }
  mp[pr]=r;
  return r;
}

Wednesday 24 February 2010

Dijkstra

#include <algorithm>
#include <iostream>
#include <limits>
#include <list>
#include <queue>
#include <vector>
using namespace std;

//  Implementation of Dijkstra's algorithm, using the example graph from p596 of
//  Introduction to Algorithms. As STL's priority_queue does not contain a
//  decrease-key function, the vertex is simply added multiple times and only
//  the lowest distance (first encountered on queue) is considered. Note that 
//  the number of nodes in the graph is fixed to keep code simple.

typedef pair<int,int> pii;

const int max_nodes = 5;
int distances[max_nodes];
bool visited[max_nodes];

void Dijkstra(list<pii> Adj[max_nodes], int s)
{
    for(int i=0;i<max_nodes;++i)
    {
        distances[i] = numeric_limits<int>::max();
        visited[i] = false;
    }
    distances[s] = 0;

    priority_queue<pii, vector<pii>, greater<pii> > Q;// pair(distance, vertex)
    Q.push(make_pair(distances[s],s));

    while(!Q.empty())
    {
        int u = Q.top().second;
        Q.pop();

        // If the vertex has been visited (by a shorter path) while it was in
        // the queue ignore it.
        if(visited[u]) continue;
        visited[u]=true;

        // Relax each adjacent vertex that hasn't already been visisted. If it
        // has been visited it will have been by a shorter path.
        for(list<pii>::iterator v=Adj[u].begin(); v != Adj[u].end(); ++v)
        {
            // If d(s,u) + d(u,v) < d(s,v) update d(s,v).
            if(!visited[v->first] && distances[v->first] > distances[u] + v->second)
            {
                distances[v->first] = distances[u] + v->second;
                Q.push(make_pair(distances[v->first],v->first));
            }
        }
    }
}

int main()
{
    list<pii> Adj[max_nodes];
    Adj[0].push_back(make_pair(1,10));    // (s,t)
    Adj[0].push_back(make_pair(3,5));    // (s,y)
    Adj[1].push_back(make_pair(2,1));    // (t,x)
    Adj[1].push_back(make_pair(3,2));    // (t,y)
    Adj[2].push_back(make_pair(4,4));    // (x,z)
    Adj[3].push_back(make_pair(1,3));    // (y,t)
    Adj[3].push_back(make_pair(2,9));    // (y,x)
    Adj[3].push_back(make_pair(4,2));    // (y,z)
    Adj[4].push_back(make_pair(0,7));    // (z,s)
    Adj[4].push_back(make_pair(2,6));    // (z,x)

    Dijkstra(Adj, 0);

    for(int i=0;i<max_nodes;++i)
    {
        cout << "distances["<<i<<"] = " << distances[i] << ".\n";
    }
}