Urban functional zones (UFZs) are essential for characterizing urban spatial configurations and socio-economic properties and for monitoring the urbanization process, which makes them fundamental to urban planning, management, and renewal. Although many efforts have been made in the remote sensing field to map UFZs, the large-scale, fine-grained UFZ maps required by a broad range of urban applications are still unavailable. Existing methods generally rely on pre-determined mapping units, such as image tiles and road blocks, which significantly limit both the quality and the degree of automation of UFZ mapping. Given that UFZs are composed of diverse geographic objects, this study proposes a novel object-based UFZ mapping method using very-high-resolution (VHR) remote sensing images. First, a multi-scale semantic segmentation network that produces pixel-wise predictions is proposed to predict urban functions for geographic objects by capturing multi-scale contextual information. Next, a conditional random field (CRF) framework is designed to regroup objects into UFZs and produce the final UFZ map, with road vectors incorporated to constrain the regrouping. Using objects as the analysis unit overcomes the drawbacks of pre-determined mapping units, and the semantic segmentation model provides accurate function information for objects; together, they make the method applicable to producing large-scale, fine-grained UFZ maps. In the experiments, the proposed method is evaluated by producing UFZ maps for Beijing and Shanghai, China, achieving competitive overall accuracies of 91.6% and 89.1%, respectively. Finally, the generated UFZ maps are used to analyze the urban-function structures of the two cities. The results suggest that the proposed method is a promising and practical approach to producing UFZ maps for real-world urban applications.
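
To make the link between pixel-wise predictions and object-level functions concrete, the following is a minimal illustrative sketch and not the method proposed in this study. It assumes that a per-pixel function label map and an object (segment) id map are already available, and it replaces the CRF-based regrouping with a plain majority vote inside each object. The function names `object_majority_labels` and `overall_accuracy` are hypothetical, and overall accuracy is taken here in its usual sense as the fraction of samples labeled correctly.

```python
import numpy as np


def object_majority_labels(pixel_labels, object_ids, num_classes):
    """Give each object the most frequent pixel-wise function label inside it.

    Simplifying assumption: a majority vote stands in for the CRF-based
    regrouping described in the abstract; road vectors are not used.
    """
    pixel_labels = np.asarray(pixel_labels).ravel()
    object_ids = np.asarray(object_ids).ravel()

    # Map arbitrary object ids onto 0..n_objects-1.
    unique_ids, inverse = np.unique(object_ids, return_inverse=True)

    # counts[i, c] = number of pixels of class c inside object i.
    counts = np.zeros((unique_ids.size, num_classes), dtype=np.int64)
    np.add.at(counts, (inverse, pixel_labels), 1)

    # Ties are resolved in favor of the lowest class index (argmax behavior).
    return dict(zip(unique_ids.tolist(), counts.argmax(axis=1).tolist()))


def overall_accuracy(predicted, reference):
    """Fraction of samples whose predicted label matches the reference label."""
    predicted = np.asarray(predicted)
    reference = np.asarray(reference)
    return float((predicted == reference).mean())


if __name__ == "__main__":
    # Toy 2x3 image: class labels per pixel and the object each pixel belongs to.
    labels = [[0, 0, 1],
              [2, 2, 1]]
    objects = [[10, 10, 20],
               [10, 20, 20]]
    print(object_majority_labels(labels, objects, num_classes=3))
    print(overall_accuracy([0, 1, 1, 2], [0, 1, 2, 2]))
```

A majority vote ignores spatial context between neighboring objects, which is exactly what a CRF formulation with road constraints is designed to capture, so this sketch should be read only as a baseline for intuition.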