GSoC First Month
The Background
I became interested in video compression in my first year of university. I had an open assessment to count the number of bugs in a live stream, and I found it very enjoyable.
Once my interest swung to manipulating video at the bit level, I found myself in the library reading about MPEG and the history of video compression standards (weird leap, I know).
My university required me to find a one-year placement, working somewhere to gain experience as part of my course. After a while and a few interviews, I found myself working for TNO in the Netherlands on the Versatile Video Coding (VVC) standard. I was looking at implementing a specific way of encoding motion vectors within 360 video sequences. This got me deep into the code base, and I learnt more than I imagined I would about codec tools and their uses.
Due to Coronavirus, I was let go from the company, so I applied to GSoC to contribute to the VideoLAN community through an AV1 decoder called dav1d. My goal is to create a metadata extraction tool for dav1d.
My motivations are diverse. Firstly, I would like to help the VideoLAN community develop dav1d more efficiently. I would also like to think the decoder will be used by quite a few people and will lower bandwidth use somewhere.
But why is my work necessary in the first place, I hear you ask! The current codec specifications for VVC and AV1 have too many coding tools. These tools are what make it possible to get to such low file sizes. The way I look at it, if each letter in the alphabet were a tool to communicate, you could still convey your thoughts without certain letters; it might just take more sentences. Choosing which tools from a specification to implement in a codec can be hard. What gives the best bang for the buck? Hopefully my metadata extractions will help with this.
The GSoC Journey so Far
The challenge of working from my parents' house is bigger than you might think. I work, sleep and socialise in the same room, and have been doing so for what feels like forever! I am normally a person who likes face-to-face communication, and it has been hard for me to rely on written chats and video calls. I had never used IRC before and I had limited experience with Git. In that respect, I feel I have come on in leaps and bounds from where I was earlier this month.
I have learnt the basics of Git: clone, push, add, commit, forks and branches. I simply would not have been able to learn about this stuff if it wasn't for the great support from the VideoLAN community. I have received valuable feedback on my merge requests and it has helped me to improve my coding skills. I have contributed by fixing a small issue, #286, which was to let dav1d read its input from standard input. This helped me learn more about the dav1d code base and, more importantly, to ask for help when I am stuck.
I would like to think my contributions in the bi-weekly meetings are good and well received. I attend every one without fail because I like to hear about the amazing progress that has been made and to share my own contributions too.
To establish a ground truth that I know to be correct, I encoded some sequences using a collection of AV1 encoders. This is super helpful because I can decode these as tests to see whether I am extracting the correct data or not. Ask for the link in the comments and I will give you access to the collection.
I have also created some of my own sequences using Python and the PIL imaging library. They consist of squares translating across a black background. These are useful for debugging, because a noisy natural sequence is too unpredictable to reason about.
To prove I can extract a piece of metadata without trouble, I first specified a '--debug metadata' argument which, although not working 100% yet, starts outputting QP (quantisation parameter) values in JSON format. I also do some preliminary calculation on the QP values to determine the key values rather than output all of them :) The big idea is to have a GUI that visualises the data once the metadata is extracted.
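The idea of boiling the QP values down to a few key numbers can be sketched like this. This is only an illustration in Python, not the code in my dav1d branch, and the field names (`qp_min`, `qp_max`, `qp_mean`, `blocks`) are hypothetical rather than the tool's real output format.

```python
import json
from statistics import mean


def summarise_qp(frame_index, qp_values):
    """Reduce one frame's per-block QP values to a few key numbers."""
    return {
        "frame": frame_index,
        "qp_min": min(qp_values),
        "qp_max": max(qp_values),
        "qp_mean": round(mean(qp_values), 2),
        "blocks": len(qp_values),  # how many values the summary covers
    }


def qp_report(frames):
    """Build a JSON document with one summary per frame.

    `frames` is a list of lists: the QP values collected for each frame.
    """
    summaries = [summarise_qp(i, qps) for i, qps in enumerate(frames)]
    return json.dumps(summaries, indent=2)
```

For example, `qp_report([[20, 24, 28], [30, 30]])` gives two small JSON objects instead of five raw values, which is the kind of compact output a GUI could read later.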
In summary, the first month of GSoC has been amazing. I have been introduced to a group of people who are so inspiring and helpful. I hope this is only the beginning of my contributions to video codecs.
- Tom.
mentions: I believe the background picture to be a rainbow-vomit-tiger (super rare animal and pretty sick! pardon the pun). I have no idea about the rights to the image, but it's here and here to stay :)