Multi-epoch interferometric synthetic aperture radar (InSAR) is a highly effective technique for monitoring deformation in urban areas. However, interpreting InSAR deformation can be challenging because of inherent geometric imaging distortion, the intricate structure and deformation properties of urban targets, and the multiple scattering of microwave signals between objects in urban scenes. This paper discusses three challenges in interpreting time-series InSAR deformation: (1) precisely locating deformation signals and linking them to their corresponding objects, i.e., determining where the deformation signal occurs; (2) understanding the mechanisms and factors that cause the detected deformation signals, i.e., determining what the deformation signal represents; and (3) establishing the connection among the detected deformation signals, the deformation events, and the scattering mechanisms. We propose a parametric framework to improve the interpretation of InSAR deformation. The framework comprises four groups of parameters: kinematic characteristics (deformation rate, cumulative deformation, deformation gradient, and deformation model), geometric parameters (position, size, structure, orientation, and roughness), semantic information (land cover type, terrain morphology, texture, and auxiliary information on natural and anthropogenic disturbance), and physical properties (scattering mechanism, penetrability, extensibility, conductivity, and thermal conductivity). Our approach aims to enrich the representation of coherent points for a better understanding of InSAR deformation. This paper also reviews the advances achieved in extracting parameters of InSAR coherent points and in interpreting deformation based on geometric parameters, semantic information, and physical properties. High-precision 3D positioning is crucial for fine InSAR monitoring in urban areas.
It helps determine the source of deformation signals and facilitates the analysis of deformation mechanisms. Semantic information from sources such as 3D models, high-resolution optical images, laser point clouds, and land use data can aid in interpreting InSAR deformation. Combining InSAR deformation data with deep learning offers an opportunity to interpret deformation effectively. In urban environments, the scattering mechanisms of ground objects are complex. Multiple-scattering signals can provide effective observations of deformation and information about the target's size. However, using the scattering mechanisms of synthetic aperture radar signals for parameter inversion and for interpreting the deformation mechanisms of urban targets remains a challenge. The framework, which considers the geometric parameters, semantic information, and physical attributes of InSAR coherent points, will be crucial for deformation interpretation and mechanism cognition. It will enable fine deformation monitoring, intelligent recognition, and application in future urban scenes.
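
To make the four parameter groups of the framework concrete, the following is a minimal sketch of how a single coherent point might be represented in code. The class and field names (`Kinematics`, `Geometry`, `Semantics`, `PhysicalProperties`, `CoherentPoint`), the units, and the choice of a linear deformation model fitted by ordinary least squares are all illustrative assumptions, not the authors' implementation; only the grouping of attributes follows the abstract.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Kinematics:
    """Kinematic characteristics (hypothetical layout, units assumed)."""
    epochs_yr: List[float]        # acquisition times in decimal years
    displacement_mm: List[float]  # line-of-sight displacement at each epoch

    def cumulative_mm(self) -> float:
        # Cumulative deformation between the first and last epoch.
        return self.displacement_mm[-1] - self.displacement_mm[0]

    def rate_mm_per_yr(self) -> float:
        # Deformation rate as the ordinary least-squares slope of a
        # linear deformation model (an assumed, simplest-case model).
        n = len(self.epochs_yr)
        t_mean = sum(self.epochs_yr) / n
        d_mean = sum(self.displacement_mm) / n
        num = sum((t - t_mean) * (d - d_mean)
                  for t, d in zip(self.epochs_yr, self.displacement_mm))
        den = sum((t - t_mean) ** 2 for t in self.epochs_yr)
        return num / den


@dataclass
class Geometry:
    """Geometric parameters of the scatterer (field names are assumptions)."""
    position_xyz_m: Tuple[float, float, float]  # 3D position of the point
    size_m: Optional[float] = None
    orientation_deg: Optional[float] = None
    roughness: Optional[float] = None


@dataclass
class Semantics:
    """Semantic information attached to the point."""
    land_cover: Optional[str] = None
    terrain_morphology: Optional[str] = None
    disturbance_note: Optional[str] = None  # natural or anthropogenic


@dataclass
class PhysicalProperties:
    """Physical properties relevant to scattering."""
    scattering_mechanism: Optional[str] = None  # e.g. "single", "double"
    thermal_conductivity: Optional[float] = None


@dataclass
class CoherentPoint:
    """One InSAR coherent point carrying all four parameter groups."""
    kinematics: Kinematics
    geometry: Geometry
    semantics: Semantics
    physical: PhysicalProperties


# Example with made-up values, for illustration only.
point = CoherentPoint(
    kinematics=Kinematics([2020.0, 2020.5, 2021.0], [0.0, -2.0, -4.0]),
    geometry=Geometry(position_xyz_m=(100.0, 200.0, 35.0)),
    semantics=Semantics(land_cover="building"),
    physical=PhysicalProperties(scattering_mechanism="double"),
)
```

Grouping the attributes this way keeps the kinematic observation separate from the geometric, semantic, and physical context used to interpret it, which mirrors the "where" and "what" questions raised above.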