Person re-identification (re-ID) is an instance-level recognition task that depends on discriminative features. However, the features learned by a network are easily weakened by semantic misalignment between the target and background clutter. In this work, we propose a novel deep re-ID CNN, the semantically consistent attention network (SCANet), to learn discriminative features. We achieve this by adding a foreground mask module to a backbone network composed of residual blocks. Importantly, a novel consistent attention loss is introduced to dynamically keep the foreground masks inferred from the shallow-, mid-, and deep-level feature maps similar to one another. As a consequence, the network concentrates on foreground areas already at the shallow layers, which benefits the learning of discriminative features from foreground areas at the deeper layers. Extensive experiments demonstrate that SCANet copes with the above challenges. The Rank-1 matching rate of the proposed method on the Market1501, DukeMTMC-reID, and CUHK03 datasets is 93.9%, 87.3%, and 73.4%, and the mean Average Precision (mAP) is 90.8%, 82.2%, and 75.2%, respectively.
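The consistent attention loss described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following is an assumption-laden toy version: masks predicted at three depths are resized to a common resolution (here by nearest-neighbor upsampling) and penalized by their mean pairwise squared difference; the function names, resolutions, and the choice of MSE are all hypothetical, not the paper's actual definition.

```python
import numpy as np

def upsample_nearest(mask, factor):
    # Nearest-neighbor upsampling: repeat each row and column `factor` times.
    return np.repeat(np.repeat(mask, factor, axis=0), factor, axis=1)

def consistent_attention_loss(masks):
    """Toy consistency loss: mean pairwise MSE between foreground masks
    predicted at different network depths, after resizing every mask to
    the largest (shallowest) resolution. A sketch, not the paper's loss."""
    target = max(m.shape[0] for m in masks)
    resized = [upsample_nearest(m, target // m.shape[0]) for m in masks]
    loss, n_pairs = 0.0, 0
    for i in range(len(resized)):
        for j in range(i + 1, len(resized)):
            loss += np.mean((resized[i] - resized[j]) ** 2)
            n_pairs += 1
    return loss / n_pairs

# Toy masks at shallow (32x32), mid (16x16), and deep (8x8) levels.
rng = np.random.default_rng(0)
masks = [rng.random((s, s)) for s in (32, 16, 8)]
print(consistent_attention_loss(masks))
```

Minimizing such a term pushes the attention maps at all depths toward the same foreground region, which is the stated goal of making the shallow layers focus where the deep layers do.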