To enable machine vision, object detection algorithms aim to recognise every target object in an image and to determine its category and location. Many solutions to this problem have been proposed, most of them based on computer vision and deep learning methods. However, existing techniques consistently fail to recognise small, densely packed objects, and they often miss objects that have undergone arbitrary geometric transformations. Object skeletons can help in object representation and detection. A skeleton is an inherent visual description of a natural object that carries rich shape semantics. It supplements the object outline with additional information, such as how the object's scale (thickness) varies between its components. Extracting object skeletons from natural images is difficult, though, because the extractor must capture both local and non-local visual context in order to estimate the scale of each skeleton pixel. To address this issue, this paper proposes a Cusp Pixel Labelled Model with Precise Tuned Outline using Machine Learning (CPLM-PTOML), which extracts the skeleton of an object and accurately detects its cusp points in order to recognise the exact object present in the image. Multiple scale-associated side outputs are attached to each stage of the network, based on the relationship between the receptive field sizes of its layers and the skeleton scales they can capture. The model is trained via multi-task learning: one task is skeleton localization, which determines whether or not a pixel is a skeleton pixel, and the other is skeleton scale prediction, which predicts the scale of each skeleton pixel. Supervision is imposed at the different stages by guiding the scale-associated side outputs toward the ground-truth skeletons at the appropriate scales.
The proposed model is compared with the traditional model in terms of Data Training Samples, Object Feature Extraction Time Levels, Feature Extraction Accuracy, Cusp Pixel Labelling Time Levels, Cusp Pixel Identification Accuracy, Cusp Point Linking Accuracy, and Images Considered and Cusp Point Recognition Levels, and the results show that the proposed model performs better on these measures.
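The scale-associated supervision and the two-task objective described above can be illustrated with a minimal NumPy sketch. It is not the CPLM-PTOML implementation: the function names `quantize_scales` and `multi_task_loss`, the receptive field sizes, the margin `factor`, the weight `lam`, and the choice of sigmoid cross-entropy plus squared error are all assumptions made for illustration.

```python
import numpy as np

def quantize_scales(scale_map, receptive_fields, factor=1.2):
    """Assign each skeleton pixel to the first stage able to capture its scale.

    scale_map: (H, W) ground-truth skeleton scale, 0 for background pixels.
    receptive_fields: increasing receptive field sizes, one per stage (assumed values).
    Returns (H, W) integer labels: 0 = background, k = supervised from stage k.
    """
    scale_map = np.asarray(scale_map, dtype=float)
    rf = np.asarray(receptive_fields, dtype=float)
    labels = np.zeros(scale_map.shape, dtype=int)
    skeleton = scale_map > 0
    # First stage index whose receptive field is >= factor * scale (assumed rule).
    idx = np.searchsorted(rf, factor * scale_map[skeleton], side="left")
    # Scales larger than every receptive field fall back to the last stage.
    labels[skeleton] = np.minimum(idx, len(rf) - 1) + 1
    return labels

def multi_task_loss(loc_logits, scale_pred, skeleton_mask, scale_true, lam=1.0):
    """Skeleton localization loss plus weighted skeleton scale prediction loss."""
    x = np.asarray(loc_logits, dtype=float)
    y = np.asarray(skeleton_mask, dtype=float)
    # Task 1: per-pixel sigmoid cross-entropy (skeleton vs. non-skeleton), stable form.
    loc_loss = np.mean(np.maximum(x, 0) - x * y + np.log1p(np.exp(-np.abs(x))))
    # Task 2: squared error on the scale, evaluated on skeleton pixels only.
    mask = y > 0
    if mask.any():
        diff = np.asarray(scale_pred, dtype=float)[mask] - np.asarray(scale_true, dtype=float)[mask]
        scale_loss = np.mean(diff ** 2)
    else:
        scale_loss = 0.0
    return loc_loss + lam * scale_loss

# Toy example: three skeleton pixels of different thickness, three stages.
scale_true = np.array([[0.0, 2.0, 10.0], [0.0, 0.0, 30.0]])
stage_labels = quantize_scales(scale_true, receptive_fields=[5, 14, 40])
loss = multi_task_loss(np.zeros((2, 3)), scale_true + 1.0, scale_true > 0, scale_true, lam=2.0)
```

In this sketch, thin skeleton pixels are supervised from early stages with small receptive fields and thick ones from later stages, while `lam` balances the localization and scale terms.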