In this paper, we present a method for temporal propagation of depth data, available for so-called key-frames, through a video sequence. Our method requires that full-frame depth information is assigned to each key-frame, and it uses the nearest preceding and nearest following key-frames with known depth. Propagating depth information from both sides is essential, as it allows most occlusion problems to be solved correctly. Image matching is based on the coherency sensitive hashing (CSH) method and is performed on image pyramids. The results are compared with temporal interpolation based on motion vectors from an optical flow algorithm. The proposed algorithm keeps the depth edges of objects sharp even in situations with fast motion or occlusions. It also handles well many situations in which the depth edges do not perfectly correspond to the true edges of objects.

Keywords: 2D to 3D video conversion, video tracking, depth interpolation, patch hashing

1. INTRODUCTION

The recent surge of interest in 3D stereoscopic video has pushed the multimedia industry to produce more and more stereo content. This need has drawn considerable attention to conversion techniques, i.e. producing 3D stereo content from ordinary 2D monocular video. Algorithms for creating stereo video from a mono stream can be roughly divided into two main categories: fully automatic conversion, most often implemented in a specialized chip inside a TV set, and semi-automatic (operator-assisted) conversion, which relies on dedicated applications, tools for annotating video, and rigorous quality control.

In this paper, we focus on the most challenging problem that arises in semi-automatic conversion: temporal propagation of depth data. An operator-assisted pipeline for stereo content production usually involves manually drawing depth for selected reference frames (key-frames), followed by depth propagation for stereo rendering.
In some pipelines, the initial depth assignment is done by drawing only disparity scribbles, from which depth is then restored and propagated using 3D cues. In others, full key-frame depth is needed. The bottleneck of this process is the quality of the propagated depth: if it is not high enough, many visually disturbing artifacts appear in the final stereo frames. Propagation quality depends strongly on the frequency of manually assigned key-frames, but drawing many frames requires more manual work and makes production slower and more expensive. This is why error-free temporal propagation of depth data through as many frames as possible is a crucial problem. The optimal key-frame distance for a desired output quality depends heavily on the properties of the video sequence and can change significantly within a single sequence.

1.1 Prior art

Temporal propagation of key-frame depth data is a difficult task. Video data are temporally under-sampled and contain noise, motion blur, and optical effects such as reflections and flares. Moreover, objects in the scene can disappear, become occluded, or change shape significantly.

The traditional approach to depth interpolation uses the results of motion estimation performed on either the depth or the video images. Varenkampf [1] proposes creating a first depth estimate by bilateral filtering of the previous depth image and then correcting it by estimating motion between depth frames. A similar approach is described in [2].

Harman et al. [3] use a machine learning approach to assign depth to key-frames, which they suggest selecting either manually or with techniques similar to those used for shot-boundary detection. After a few points are assigned, a classifier (separate for each key-frame) is trained on this small number of samples and then used to restore the depth of the whole key-frame. After that, a procedure called depth tweening restores the intermediate depth frames. For this purpose, the classifiers of both neighboring key-frames are fed the image values to produce depth values.
For the final depth, the two intermediate depth estimates are weighted according to the temporal distance to their key-frames. The weights could depend linearly on the time distance, but the authors propose a non-linear time-weight dependence. A problem with this approach can arise when an intermediate video frame contains areas completely different from anything found in the key-frames (for example, in occlusion areas). However, this situation is difficult for the majority of depth propagation algorithms.

Article [14] describes a propagation method based on generating superpixels, matching them, and generating depth from the matching results and the key-frame depth.
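
The time-weighted blending used in depth tweening [3] can be illustrated with a short sketch. The function names `tween_weights` and `blend_depth` and the exponent `gamma` are hypothetical and introduced only for illustration. The exact non-linear dependence used in [3] is not given here, so the power law below is merely one possible choice; `gamma = 1` reduces it to the linear case.

```python
def tween_weights(t, t_prev, t_next, gamma=1.0):
    """Return normalized weights (w_prev, w_next) for a frame at time t.

    t_prev and t_next are the times of the preceding and following
    key-frames. gamma is a hypothetical shaping exponent (an assumption,
    not taken from [3]): gamma = 1 gives linear weights, gamma > 1
    favors the nearer key-frame more strongly.
    """
    if t_prev >= t_next or not (t_prev <= t <= t_next):
        raise ValueError("expected t_prev <= t <= t_next with t_prev < t_next")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    a = (t - t_prev) / (t_next - t_prev)  # 0 at previous key-frame, 1 at next
    w_prev = (1.0 - a) ** gamma
    w_next = a ** gamma
    total = w_prev + w_next
    return w_prev / total, w_next / total


def blend_depth(depth_prev, depth_next, t, t_prev, t_next, gamma=1.0):
    """Blend two per-pixel depth estimates (lists of rows) for frame t.

    depth_prev and depth_next are the depth maps predicted for the same
    intermediate frame from the previous and next key-frame, respectively.
    """
    w_prev, w_next = tween_weights(t, t_prev, t_next, gamma)
    return [
        [w_prev * p + w_next * n for p, n in zip(row_p, row_n)]
        for row_p, row_n in zip(depth_prev, depth_next)
    ]
```

For a frame located one fifth of the way from the previous key-frame to the next, linear weighting gives the previous key-frame's estimate a weight of 0.8 and the next key-frame's estimate a weight of 0.2. Such a blend cannot recover depth in areas that neither estimate predicts correctly, which is the occlusion limitation noted above.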