With the dramatic advances in deep learning, machine learning research in both basic and applied settings now focuses on improving the interpretability of model predictions as well as prediction performance. While deep learning models achieve much higher prediction performance than conventional machine learning models, their prediction process remains difficult to interpret and explain. This is known as the black-box problem of machine learning models, and it is recognized as particularly important in a wide range of fields, including manufacturing, commerce, robotics, and other industries where the use of such technology has become commonplace, as well as the medical field, where mistakes are not tolerated.
Focusing on natural language processing tasks, we consider interpretability as the presentation of the contribution of each input word to a prediction in a recurrent neural network. In interpreting predictions from deep learning models, much work has focused on visualizing importance, mainly based on attention weights and gradients for the inference results. However, it has become clear in recent years that attention mechanisms and gradient-based techniques have non-negligible problems. The first is that although attention weights learn which parts of the input to focus on, their relationship with gradient-based importance may be strong or weak depending on the task or problem setting, so the two are not always strongly related. Furthermore, it is often unclear how to integrate the two interpretations. From another perspective, several aspects remain unclear regarding how to appropriately apply attention mechanisms to real-world problems with large datasets, as well as the properties and characteristics of their effects in such applications.
This dissertation discusses both basic and applied research on how attention mechanisms improve the performance and interpretability of machine learning models.
From the basic research perspective, we proposed a new learning method that addresses the vulnerability of the attention mechanism to perturbations and contributes significantly to both prediction performance and interpretability. Deep learning models are known to respond to small perturbations that humans cannot perceive, and may consequently exhibit unintended behaviors and predictions. Attention mechanisms used to interpret predictions are no exception. This is a serious problem because current deep learning models rely heavily on this mechanism. We focused on training techniques that use adversarial perturbations, i.e., perturbations deliberately designed to deceive the attention mechanism. We demonstrated that such adversarial training makes the perturbation-sensitive attention mechanism robust and enables it to present highly interpretable evidence for its predictions. By further extending the proposed technique to semi-supervised learning, we obtained a general-purpose learning model with a more robust and interpretable attention mechanism.
From the applied research perspective, we investigated how effective the deep learning models with attention mechanisms validated in the basic research are in real-world applications. Since deep learning models with attention mechanisms have mainly been evaluated on basic tasks in natural language processing and computer vision, their performance when used as core components of applications and services has often been unclear. We confirmed the effectiveness of the proposed frameworks with attention mechanisms in real-world applications, focusing on the field of computational advertising, where the amount of data is large and the interpretation of predictions is necessary.
The proposed frameworks are new attempts to support operations by predicting the nature of digital advertisements with high serving effectiveness, and their effectiveness has been confirmed using large-scale ad-serving data.
In light of the above, the research summarized in this dissertation focuses on the attention mechanism, which has attracted considerable interest in recent years, and discusses its potential from two perspectives: basic research, in terms of improving prediction performance and interpretability, and applied research, in terms of evaluating it in real-world applications using large datasets beyond the laboratory environment. The dissertation concludes with a summary of the implications of these findings for subsequent research and the future prospects of the field.