This thesis contributes to the research area of computer-vision-based human motion analysis, investigates techniques associated with this area, and proposes a human motion analysis system that parses images or videos (image sequences) to estimate human poses. The system combines a novel colour-to-greyscale converter, an optimised Histogram of Orientated Gradients (HOG) human body detector, and an improved Generalised Distance Transform and Orientation Maps (GDT&OM) pose estimator, and is built to perform key-frame extraction. The novel colour-to-greyscale conversion method converts RGB images to chroma-edge-enhanced greyscale images by employing density-based colour clustering and spring-system-based multidimensional scaling, and is shown to outperform other methods such as Color2Grey and Ren’s method. Its weakness is that it remains parameter-dependent and does not perform well on some images. We improve the HOG detector by employing a modified training scheme and pre-processed data; compared with the original HOG scheme, it achieves a similar true detection rate but a much lower false detection rate. We discuss the GDT&OM method and extend the original GDT&OM human detector into a human pose estimator that uses the results of human detection. We also investigate a pose estimation method based on the locations and orientations of human body parts, under the assumption that body parts can be accurately located. We then integrate all methods to build a key-frame extraction system that is more intelligent than conventional approaches, as it is designed to select frames representing the content of videos. Finally, we apply our methods to build a video logging system that automatically records the actions displayed in gymnastics videos. Both systems perform well for a small set of motion categories.
However, they are object-dependent systems that require users to select target objects manually, and their performance is limited by the human body detector and pose estimator.