Telling species apart plays a key role in understanding global biodiversity, monitoring how it changes, and managing it. However, species discrimination is often difficult due to the sheer number of species on Earth and the complex nature of species. This led to the development of DNA barcoding, which uses standardised regions of DNA for species identification. The standard plant DNA barcodes are based on small regions from the plastid genome and ribosomal DNA. The approach works well in some plant groups, but does not provide unique species-level resolution in many others. In this thesis, I explore the potential for using data from the nuclear genome to improve and enhance species discrimination in plants. Access to the nuclear genome via high-throughput sequencing now enables the generation of large amounts of sequence data from multiple unlinked loci. Such data offer the opportunity to design the next generation of plant barcoding approaches based on a detailed understanding of genomic differences between species. Various studies have shown that multiple unlinked nuclear markers can provide high discriminatory power in many plant groups, separating species and infra-specific taxa. However, there has not yet been an overview and synthesis of exactly how powerful these approaches can be, and how best to guide future efforts in building plant identification tools. In this thesis, I provide a first overview of how nuclear genomes perform in telling species apart. Overall, I tackled the following questions: 1) what proportion of species are distinguishable with nuclear markers? 2) what is the nature of the inter-specific differences, and what are the attributes of the loci that are most informative in telling species apart? and 3) how many markers, and which markers, are needed to maximise species identification success?
To answer those questions, I first outlined the conceptual issues to address in assembling and analysing appropriate multi-locus nuclear sequence datasets. I then developed a new pipeline called NucBarcoder, which supports workflows and analysis using multi-locus nuclear sequence data for species discrimination. I tested this workflow on a case study consisting of sequence data from 810 nuclear genes from 453 individuals from 133 Inga species, including 69 species represented by multiple sampled individuals. Of the 69 species with multiple individuals sampled, 45 resolved as monophyletic (65%). The density of species-specific SNPs for each Inga species ranged from 0 to 1503 per megabase. Compared to the full dataset of 810 genes and 205,871 SNPs, subsampling analysis revealed that a random selection of 70 genes or 2500 SNPs, or a combination of 9 'best-performing' genes, could achieve levels of species discrimination success similar to the full dataset. I found a positive correlation (r = 0.42) between the number of species distinguished and the nucleotide diversity of the genes used for species discrimination. To search for broader generalisations, I then compiled data from 149 different genera to assess the proportion of plant species that resolve as monophyletic. I selected 29 genera with suitable available data for further study and calculated the abundance and density of species-specific SNPs (SSSNPs), and the proportion of species that can be distinguished by different subsets of the data and also by targeting the best-performing gene regions. Finally, I evaluated the characteristics of the best-performing gene regions in terms of levels of nucleotide diversity and density of SSSNPs. In the analysis of 149 genera, overall, of the 1,701 multiple-sampled species evaluated, 1,206 resolved as monophyletic (71%).
At the level of individual genera, 37 of the 149 genera (25.8%) had 100% of species resolved as monophyletic, and 75 (50.3%) genera had at least 75% of the species resolving as monophyletic. Among the 29 genera examined in more detail, the density of SSSNPs across all species ranged from 0 to 27,262 per Mb, with a median density of 323 SSSNPs per Mb (a median density of one SSSNP every 3,098 bp). Of the total of 460 species from 29 genera assessed, 411 species (89.3%) had at least one SSSNP. Resampling of these datasets showed that with around 3,000 SNPs, almost all genera had reached an asymptote in their levels of species discrimination, with 21/23 genera (91%) having >85% of their distinguishable species (i.e. those told apart in the full dataset) distinguished with 3,000 randomly selected SNPs. Furthermore, a detailed investigation of six genera showed that some loci are clearly much better than others at telling species apart. Between one and nine pre-selected genes were able to recover levels of species discrimination equivalent to those obtained from several hundred genes in the full datasets. A closer investigation of the attributes of the best-performing genes showed some positive correlations between the number of species resolved as monophyletic and the nucleotide diversity of a given gene, although this relationship was not clear cut, and the genes that give the highest species resolution are not always the most variable genes. These findings provide some key general information on the proportions of plant species that are resolvable using multi-locus nuclear sequence data and on the nature of the sequence variation between plant species. In the final chapter of the thesis, I summarise these findings and identify a set of priority research and infrastructure needs to take forward the development and use of multi-locus nuclear DNA barcoding of plants.
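As an illustration of the SSSNP measure reported above, the following is a minimal sketch and not the NucBarcoder implementation. It assumes a working definition that the text above does not spell out: a site counts as an SSSNP for a species if every sampled individual of that species carries the same nucleotide and no individual of any other species carries it. The function names, the input layout (a dictionary of aligned sequences and a sample-to-species mapping), and the toy data are all hypothetical.

```python
from collections import defaultdict

VALID_BASES = set("ACGT")  # assumption: gaps and ambiguity codes never define an SSSNP


def species_specific_snps(alignment, species_of):
    """Count, per species, sites whose allele is fixed within the species
    and absent from every other species (assumed SSSNP definition)."""
    by_species = defaultdict(list)
    for sample, seq in alignment.items():
        by_species[species_of[sample]].append(seq)

    n_sites = len(next(iter(alignment.values())))
    counts = {sp: 0 for sp in by_species}

    for i in range(n_sites):
        # Set of alleles observed at site i in each species.
        alleles = {sp: {s[i] for s in seqs} for sp, seqs in by_species.items()}
        for sp, own in alleles.items():
            # The focal species must be fixed for one unambiguous base.
            if len(own) != 1 or not own <= VALID_BASES:
                continue
            others = set().union(*(a for o, a in alleles.items() if o != sp))
            if own.isdisjoint(others):
                counts[sp] += 1
    return counts


def density_per_mb(count, aligned_bp):
    """Convert an SSSNP count into a density per megabase of aligned sequence."""
    return count * 1_000_000 / aligned_bp


# Hypothetical toy alignment: three species, two individuals each, five sites.
toy_alignment = {
    "a1": "ACGTA", "a2": "ACGTA",
    "b1": "ATGTA", "b2": "ATGTC",
    "c1": "ACGGA", "c2": "ACGGA",
}
toy_species = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}

toy_counts = species_specific_snps(toy_alignment, toy_species)
print(toy_counts)  # {'A': 0, 'B': 1, 'C': 1}
```

In this toy case species B is diagnosed by the second site and species C by the fourth, while species A has no SSSNP and would fall among the species that lack a diagnostic site. Dividing each count by the aligned length with `density_per_mb` gives a per-megabase figure comparable in kind to the densities quoted above.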