The chip of a video camera records the average of the light falling onto each of its pixel sensors. Details finer than the pixel grid are lost in this process because of the limited resolution. However, a video usually contains a significant amount of redundant information: a single object, say a coffee mug, might move through the video over several seconds, generating hundreds of views of the mug from slightly different angles. These views usually do not align perfectly with the pixel grid of the camera. The question is whether we can combine these averaged measurements with knowledge of how the object is moving to recover finer details than any single frame provides.

The basic idea of video super resolution is just that: the different viewpoints of a single scene in a video are combined to enhance the overall resolution and quality.
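
To make the idea concrete, here is a minimal one-dimensional sketch in Python with NumPy. It is an illustration under simplifying assumptions, not the method of any particular super resolution algorithm: the scene is a row of 8 fine samples, each sensor pixel averages 2 neighboring samples, the motion between the two frames is known to be exactly one fine sample, and the scene wraps around at its border. The helper names `capture_frame` and `measurement_matrix` are made up for this example.

```python
import numpy as np


def capture_frame(scene, shift, pixel_width):
    """Simulate one low-resolution frame of a 1D scene.

    The scene is moved by `shift` fine samples (wrapping around at the
    border, a simplifying assumption). Each sensor pixel then stores the
    average of `pixel_width` neighboring fine samples.
    """
    moved = np.roll(scene, -shift)
    return moved.reshape(-1, pixel_width).mean(axis=1)


def measurement_matrix(n, shift, pixel_width):
    """Return A such that A @ scene equals capture_frame(scene, shift, pixel_width)."""
    # Capturing the i-th unit scene yields the i-th column of A.
    unit_scenes = np.eye(n)
    columns = [capture_frame(unit, shift, pixel_width) for unit in unit_scenes]
    return np.stack(columns, axis=1)


# Fine-scale scene: a bright region whose edges fall in the middle of a pixel.
scene = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0])
pixel_width = 2
shifts = [0, 1]  # assumed known motion, measured in fine samples

# What the camera records: one averaged frame per shift.
frames = [capture_frame(scene, s, pixel_width) for s in shifts]

# Baseline: blow up a single frame by repeating every pixel value.
single_frame_guess = np.repeat(frames[0], pixel_width)

# Multi-frame estimate: stack the averaging equations of all frames and
# solve them jointly in the least-squares sense (minimum-norm solution).
A = np.vstack([measurement_matrix(scene.size, s, pixel_width) for s in shifts])
y = np.concatenate(frames)
multi_frame_guess, *_ = np.linalg.lstsq(A, y, rcond=1e-10)

print("scene        :", scene)
print("single frame :", single_frame_guess)
print("two frames   :", np.round(multi_frame_guess, 6))
```

In this toy setup, the guess built from one frame smears both edges of the bright region across a whole pixel, while the least-squares solution over both frames places them at the correct fine sample. The recovery has limits even here: a pattern that alternates at every fine sample averages to a constant in every frame and cannot be recovered from box averages alone. The example scene was chosen to contain no such pattern.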