Georgios Tziritas, Yeonggul Jang, Jin Ma, Fumin Guo, Quanzheng Li, Tiancong Hua, Xiang Li, Lihong Liu, Angélica Atehortúa, James R. Clough, Zhiqiang Hu, Eric Kerfoot, Vicente Grau, Enzo Ferrante, Matthew Ng, Guanyu Yang, Mireille Garreau, Alejandro Debus, Elias Grinias, Jiahui Li, Wufeng Xue, Shuo Li, Wenjun Yan, Ilkay Oksuz, Hao Xu, Shenzhen University, Beijing University of Posts and Telecommunications (BUPT), Peking University [Beijing], King's College London, Istanbul Technical University (ITÜ), University of Oxford [Oxford], University of Toronto, Massachusetts General Hospital [Boston], University of Crete [Heraklion] (UOC), Fudan University [Shanghai], Universidad Nacional de Colombia [Bogotà] (UNAL), Laboratoire Traitement du Signal et de l'Image (LTSI), Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National de la Santé et de la Recherche Médicale (INSERM), Centre de Recherche en Information Biomédicale sino-français (CRIBS), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Southeast University [Jiangsu]-Institut National de la Santé et de la Recherche Médicale (INSERM), Yonsei University, Universidad Nacional del Litoral [Santa Fe] (UNL), Laboratory of Image Science and Technology [Nanjing] (LIST), Southeast University [Jiangsu]-School of Computer Science and Engineering, University of Western Ontario (UWO), The paper is partially supported by the Natural Science Foundation of China under Grant 61801296. The work of Eric Kerfoot was supported by an EPSRC programme Grant (EP/P001009/1) and the Wellcome EPSRC Centre for Medical Engineering at the School of Biomedical Engineering and Imaging Sciences, King's College London (WT203148/Z/16/Z). The work of Angélica Atehortúa was supported by Colciencias-Colombia, Grant No. 647 (2015 call for National PhD studies) and Université de Rennes 1. 
The work of Alejandro Debus was supported by the Santa Fe Science, Technology and Innovation Agency (AS ACTEI), Government of the Province of Santa Fe, through Project AC-00010-18, Resolution N 117/14., University of Oxford, Université de Rennes (UR)-Institut National de la Santé et de la Recherche Médicale (INSERM), Université de Rennes (UR)-Southeast University [Jiangsu]-Institut National de la Santé et de la Recherche Médicale (INSERM), and Jonchère, Laurent
Automatic quantification of the left ventricle (LV) from cardiac magnetic resonance (CMR) images plays an important role in making the diagnostic procedure efficient and reliable, and in alleviating the laborious reading work for physicians. Considerable effort has been devoted to LV quantification using different strategies, including segmentation-based (SG) methods and the more recent direct regression (DR) methods. Although both SG and DR methods have obtained great success for the task, a systematic platform to benchmark them remains absent because of differences in label information during model learning. In this paper, we conducted an unbiased evaluation and comparison of cardiac LV quantification methods that were submitted to the Left Ventricle Quantification (LVQuan) challenge, which was held in conjunction with the Statistical Atlases and Computational Modeling of the Heart (STACOM) workshop at MICCAI 2018. The challenge targeted the quantification of 1) areas of the LV cavity and myocardium, 2) dimensions of the LV cavity, 3) regional wall thicknesses (RWT), and 4) the cardiac phase, from mid-ventricle short-axis CMR images. First, we constructed a public quantification dataset, Cardiac-DIG, with ground truth labels for both the myocardium mask and these quantification targets across the entire cardiac cycle. Then, the key techniques employed by each submission were described. Next, quantitative validation of these submissions was conducted with the constructed dataset. The evaluation results revealed that both SG and DR methods can offer good LV quantification performance, even though DR methods do not require densely labeled masks for supervision. Among the 12 submissions, the DR method LDAMT offered the best performance, with a mean estimation error of 301 mm$^2$ for the two areas, 2.15 mm for the cavity dimensions, 2.03 mm for RWTs, and a 9.5% error rate for the cardiac phase classification. 
Three of the SG methods also delivered comparable performance. Finally, we discussed the advantages and disadvantages of SG and DR methods, as well as the unsolved problems in automatic cardiac quantification for clinical practice.
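
As a rough illustration of the two kinds of figures reported above, the following minimal Python sketch computes a mean estimation error and a phase classification error rate. It assumes the estimation error is a mean absolute error over paired predictions and reference values, which the abstract does not state explicitly. The function names and the toy values are hypothetical and are not taken from the challenge or the Cardiac-DIG dataset.

```python
from statistics import mean


def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over paired values (assumed metric)."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return mean(abs(p - r) for p, r in zip(predicted, reference))


def phase_error_rate(predicted_phases, reference_phases):
    """Fraction of frames whose predicted phase label differs from the reference."""
    if len(predicted_phases) != len(reference_phases):
        raise ValueError("phase sequences must have the same length")
    if not reference_phases:
        raise ValueError("phase sequences must not be empty")
    wrong = sum(p != r for p, r in zip(predicted_phases, reference_phases))
    return wrong / len(reference_phases)


# Hypothetical toy values for four frames (areas in mm^2, phases as 0/1 labels).
cavity_area_pred = [1510.0, 1480.0, 1320.0, 1100.0]
cavity_area_ref = [1500.0, 1500.0, 1300.0, 1150.0]
phase_pred = [0, 0, 1, 1]
phase_ref = [0, 1, 1, 1]

area_mae = mean_absolute_error(cavity_area_pred, cavity_area_ref)
phase_err = phase_error_rate(phase_pred, phase_ref)
print(f"mean area error: {area_mae} mm^2")
print(f"phase error rate: {phase_err:.1%}")
```

The same `mean_absolute_error` helper would apply unchanged to cavity dimensions and regional wall thicknesses, since those targets are also continuous values measured in mm.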