Human-machine cooperative or co-robotics has been recognized as the next generation of robotics. In contrast to current systems that use limited-reasoning strategies or address problems in narrow contexts, new co-robot systems will be characterized by their flexibility, resourcefulness, varied modeling or reasoning approaches, and use of real-world data in real time, demonstrating a level of intelligence and adaptability seen in humans and animals. The research I focused is in the two sub-field of co-robotics: teleoperation and telepresence. We firstly explore the ways of teleoperation using mixed reality techniques. I proposed a new type of display: hybrid-reality display (HRD) system, which utilizes commodity projection device to project captured video frame onto 3D replica of the actual target surface. It provides a direct alignment between the frame of reference for the human subject and that of the displayed image. The advantage of this approach lies in the fact that no wearing device needed for the users, providing minimal intrusiveness and accommodating users eyes during focusing. The field-of-view is also significantly increased. From a user-centered design standpoint, the HRD is motivated by teleoperation accidents, incidents, and user research in military reconnaissance etc. Teleoperation in these environments is compromised by the Keyhole Effect, which results from the limited field of view of reference. The technique contribution of the proposed HRD system is the multi-system calibration which mainly involves motion sensor, projector, cameras and robotic arm. Due to the purpose of the system, the accuracy of calibration should also be restricted within millimeter level. The followed up research of HRD is focused on high accuracy 3D reconstruction of the replica via commodity devices for better alignment of video frame. Conventional 3D scanner lacks either depth resolution or be very expensive. We proposed a structured light scanning based 3D sensing system with accuracy within 1 millimeter while robust to global illumination and surface reflection. Extensive user study prove the performance of our proposed algorithm. In order to compensate the unsynchronization between the local station and remote station due to latency introduced during data sensing and communication, 1-step-ahead predictive control algorithm is presented. The latency between human control and robot movement can be formulated as a linear equation group with a smooth coefficient ranging from 0 to 1. This predictive control algorithm can be further formulated by optimizing a cost function. We then explore the aspect of telepresence. Many hardware designs have been developed to allow a camera to be placed optically directly behind the screen. The purpose of such setups is to enable two-way video teleconferencing that maintains eye-contact. However, the image from the see-through camera usually exhibits a number of imaging artifacts such as low signal to noise ratio, incorrect color balance, and lost of details. Thus we develop a novel image enhancement framework that utilizes an auxiliary color+depth camera that is mounted on the side of the screen. By fusing the information from both cameras, we are able to significantly improve the quality of the see-through image. Experimental results have demonstrated that our fusion method compares favorably against traditional image enhancement/warping methods that uses only a single image.