13 results on '"Schmidt, Heiko"'
Search Results
2. Long noncoding RNAs contribute to DNA damage resistance in Arabidopsis thaliana
- Author
-
Durut, Nathalie, Kornienko, Aleksandra E, Schmidt, Heiko A, Lettner, Nicole, Donà, Mattia, Nordborg, Magnus, and Mittelsten Scheid, Ortrun
- Abstract
Efficient repair of DNA lesions is essential for the faithful transmission of genetic information between somatic cells and for genome integrity across generations. Plants have multiple, partially redundant, and overlapping DNA repair pathways, probably due to the less constricted germline and the inevitable exposure to light, including higher-energy wavelengths. Many proteins involved in DNA repair and their modes of action are well described. In contrast, a role for DNA damage-associated RNA components, evident from many other organisms, is less well understood. Here, we have challenged young Arabidopsis thaliana plants with two different types of genotoxic stress and performed de novo assembly and transcriptome analysis. We identified three long noncoding RNAs (lncRNAs) that are lowly or not expressed under regular conditions but up-regulated or induced by DNA damage. We generated CRISPR/Cas deletion mutants and found that the absence of the lncRNAs impairs the recovery capacity of the plants from genotoxic stress. The genetic loci are highly conserved among world-wide distributed Arabidopsis accessions and within related species in the Brassicaceae group. Together, these results suggest that the lncRNAs have a conserved function in connection with DNA damage and provide a basis for mechanistic analysis of their role.
- Published
- 2023
- Full Text
- View/download PDF
3. POSITIONIERUNG DES NEUEN SLK.
- Author
-
STEGMANN, BERND and SCHMIDT, HEIKO
- Published
- 2011
4. Refinement Sensitive Formal Semantics of State Machines With Persistent Choice.
- Author
-
Fecher, Harald, Huth, Michael, Schmidt, Heiko, and Schönborn, Jens
- Subjects
MACHINE theory, SEMANTICS, UNIFIED modeling language, HUMAN-machine systems, ARTIFICIAL intelligence
- Abstract
Modeling languages usually support two kinds of nondeterminism: an external one for interactions of a system with its environment, and one that stems from under-specification as familiar in models of behavioral requirements. Both forms of nondeterminism are resolvable, by composing a system with an environment model and by refining under-specified behavior, respectively. Modeling languages usually don't support nondeterminism that is persistent in that neither the composition with an environment nor refinements of under-specification will resolve it. Persistent nondeterminism is used, e.g., for modeling faulty systems. We present a formal semantics for UML state machines enriched with an operator “persistent choice” that models persistent nondeterminism. This semantics is based on abstract models (μ-automata with a novel refinement relation) and a sound three-valued satisfaction relation for properties expressed in the μ-calculus. [Copyright Elsevier]
- Published
- 2009
- Full Text
- View/download PDF
5. Process Algebra Having Inherent Choice: Revised Semantics for Concurrent Systems.
- Author
-
Fecher, Harald and Schmidt, Heiko
- Subjects
ALGEBRA, FORMALISM (Literary analysis), INFORMATION theory, COMPUTER operating systems, SEMANTICS
- Abstract
Process algebras are standard formalisms for compositionally describing systems by the dependencies of their observable synchronous communication. In concurrent systems, parallel composition introduces resolvable nondeterminism, i.e., nondeterminism that will be resolved in later design phases or by the operating system. Sometimes it is also important to express inherent nondeterminism for equal (communication) labels. Here, we give operational and axiomatic semantics to a process algebra having a parallel operator interpreted as concurrent and having a choice operator interpreted as inherent, not only w.r.t. different, but also w.r.t. equal next-step actions. In order to handle the different kinds of nondeterminism, the operational semantics uses μ-automata as the underlying semantic model. Soundness and completeness of our axiom system w.r.t. the operational semantics are shown. [Copyright Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
6. In Situ Detection of Tissue Factor within the Coronary Intima in Rat Cardiac Allograft Vasculopathy
- Author
-
Hölschermann, Hans, Bohle, Rainer M., Zeller, Hagen, Schmidt, Heiko, Stahl, Ulrich, Fink, Ludger, Grimm, Helmut, Tillmanns, Harald, and Haberbosch, Werner
- Abstract
Cardiac allograft vasculopathy is a major cause of morbidity and mortality in cardiac transplant recipients. The underlying cause of this disease remains unclear. Histological studies have implicated accelerated hemostasis and intravascular fibrin deposition in its pathogenesis. In the present study, a defined model of this disease in the rat was used to elucidate the role of tissue factor in the production of the hypercoagulable state observed in cardiac allograft vessels. Tissue factor protein and mRNA expression were studied in rat heart allografts developing allograft vasculopathy resembling human disease. Immunohistochemistry demonstrated tissue-factor-positive cells present in the allograft coronary intima and adventitia. Significant staining for tissue factor was detected in the endothelium lining coronary lesions in cardiac allografts and in interstitial mononuclear cells, respectively. Both transplant coronary endothelial cells and mononuclear cells contained tissue factor mRNA as indicated by oligo-cell reverse transcription polymerase chain reaction after laser-assisted cell picking. In contrast, tissue factor mRNA and protein were absent or barely detectable within the coronary intima of nontransplanted control hearts. Thus, the present study clearly demonstrates that aberrant tissue factor expression occurs within the coronary intima after cardiac transplantation. Tissue factor, activating downstream coagulation mechanisms, may account for the intravascular clotting abnormalities observed in cardiac allografts and may represent a key factor in transplant atherogenesis.
- Published
- 1999
- Full Text
- View/download PDF
7. Neuorientierungen im Iran.
- Author
-
Schmidt, Renate and Schmidt, Heiko
- Published
- 2007
8. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing.
- Author
-
Schmidt, Heiko A, Strimmer, Korbinian, Vingron, Martin, and von Haeseler, Arndt
- Abstract
TREE-PUZZLE is a program package for quartet-based maximum-likelihood phylogenetic analysis (formerly PUZZLE, Strimmer and von Haeseler, Mol. Biol. Evol., 13, 964-969, 1996) that provides methods for reconstruction, comparison, and testing of trees and models on DNA as well as protein sequences. To reduce waiting time for larger datasets the tree reconstruction part of the software has been parallelized using message passing that runs on clusters of workstations as well as parallel computers.
- Published
- 2002
- Full Text
- View/download PDF
9. Viele Widersprüche -- wenig Hoffnung.
- Author
-
Schmidt, Heiko
- Published
- 2007
10. Axeldb: a Xenopus laevis database focusing on gene expression
- Author
-
Pollet, Nicolas, Schmidt, Heiko A., Gawantka, Volker, Vingron, Martin, and Niehrs, Christof
- Abstract
Axeldb is a database storing and integrating gene expression patterns and DNA sequences identified in a large-scale in situ hybridization study in Xenopus laevis embryos. The data are organised in a format appropriate for comprehensive analysis, and enable comparison of images of expression patterns for any given set of genes. Information on literature, cDNA clones and their availability, nucleotide sequences, expression patterns, and accompanying pictures is available. Current developments are aimed toward the interconnection with other databases and the integration of data from the literature. Axeldb is implemented using an ACEDB database system, and is available through the web at http://www.dkfz-heidelberg.de/abt0135/axeldb.htm
- Published
- 2000
11. Maximum‐Likelihood Analysis Using TREE‐PUZZLE
- Author
-
Schmidt, Heiko A. and Haeseler, Arndt
- Abstract
TREE‐PUZZLE provides a means to analyze and reconstruct evolutionary relationships and trees based on quartets, i.e., groups of four sequences. Basic Protocol 1 explains how to reconstruct trees based on the maximum‐likelihood principle and quartet puzzling. Basic Protocol 2 discusses likelihood mapping, a method to visualize phylogenetic content in a multiple sequence alignment. Basic Protocol 3 explains how to compare tree topologies using different tests.
- Published
- 2007
- Full Text
- View/download PDF
12. pIQPNNI: parallel reconstruction of large maximum likelihood phylogenies
- Author
-
Minh, Bui Quang, Vinh, Le Sy, von Haeseler, Arndt, and Schmidt, Heiko A.
- Abstract
Summary: IQPNNI is a program to infer maximum-likelihood phylogenetic trees from DNA or protein data with a large number of sequences. We present an improved and MPI-parallel implementation showing very good scaling and speedup behavior. Availability: IQPNNI (http://www.bi.uni-duesseldorf.de/software/iqpnni) is written in C++ and runs on UNIX/Linux, Windows, and MacOS systems. (Free) MPI libraries can be found at http://www.lam-mpi.org/mpi/implementations/. Contact: haeseler@cs.uni-duesseldorf.de
- Published
- 2005
- Full Text
- View/download PDF
13. Maximum-Likelihood Analysis Using TREE-PUZZLE
- Author
-
Schmidt, Heiko A. and Haeseler, Arndt
- Abstract
TREE-PUZZLE provides means to analyze and reconstruct evolutionary relationships and trees based on quartets, i.e., groups of four sequences.
Reconstruct a Phylogenetic Tree
The main use of TREE-PUZZLE is to reconstruct phylogenetic trees from sequences. The example shows how to use TREE-PUZZLE to construct a tree from amino acid sequences assuming gamma-distributed rates across sites (UNIT).
Necessary Resources
Hardware: TREE-PUZZLE runs on Windows and Macintosh computers as well as Unix/Linux systems, including workstation clusters and parallel computers.
Software: TREE-PUZZLE package (see Support Protocols 1 to 3 for information on how to obtain TREE-PUZZLE).
Files: Multiple sequence alignment file in standard PHYLIP format. The sample data set used here (EF.phy) is included with the TREE-PUZZLE software and on the Current Protocols Web site (http://www3.interscience.wiley.com/c_p/cpbi_sampledatafiles.htm).
Obtain and install TREE-PUZZLE (see Support Protocols 1 to 3).
Change to the data directory in the TREE-PUZZLE directory and start the program with the command puzzle EF.phy.
Start puzzle in a terminal, e.g., MS DOS prompt (Windows) or xterm (Unix/Linux; APPENDIX & APPENDIX), using the command puzzle alignmentfile, where alignmentfile is the name of the file containing the alignment to be analyzed; in the example here it is EF.phy. If puzzle is invoked from a file manager or without a filename, it will search for a file called infile in the current directory. If infile does not exist, TREE-PUZZLE will ask for a filename. The alignmentfile has to be in the current directory, or the full path to its location must be given.
Change the type of analysis to tree reconstruction (using the “b” key) and the tree search procedure to quartet puzzling (using the “k” key), if necessary (Fig. ).
Figure: Flowchart of analysis type options in the TREE-PUZZLE menu. Options in TREE-PUZZLE are controlled by single letters. The flow chart shows the options that correspond to each letter. For example, entering the letter “b” toggles the analysis between tree reconstruction and likelihood mapping. Similarly, to choose among quartet puzzling, user defined trees, or pairwise distance matrices, enter the letter “k” until the desired option is shown on the screen.
Adjust the outgroup to the sequence 22 EFG_MYCGE (using “o” and the number of the sequence).
By default, the first sequence is used to root the resulting tree for output. However, the root has no impact on the log-likelihood. Note that the natural root lies between EF-1α/Tu and EF-2/G (Iwabe et al., ). Hence the output tree has to be rerooted using a phylogeny viewer. For further discussion of selecting a tree root, see UNIT.
Choose parameter estimation to be performed approximately (with “e”) using neighbor-joining trees (with “x”).
Parameters are estimated using tree topologies. These are either inferred by neighbor-joining or given as usertree (usertree evaluation; see ). With the quartet samples + NJ option, the evolutionary parameters are estimated on random quartet samples; neighbor-joining trees are only used for rate parameters. Approximate estimation uses pairwise distances to fit the branch lengths of the tree topologies, while ML branch lengths are inferred in the exact estimation.
Choose a model of evolution
Change the type of sequence data to amino acids (using “d”) if the automatically assigned type is not correct (Fig. ).
Figure: Flowchart of substitution model options in the TREE-PUZZLE menu.
Using the character composition of the alignment, TREE-PUZZLE tries to figure out whether the type of data is nucleotide, protein, or binary data.
Choose an appropriate model of sequence evolution to analyze the dataset. For the example alignment, choose the VT model by entering “m” five times (Fig. ).
Several models for protein evolution are implemented in TREE-PUZZLE. While the models by Dayhoff et al. () and Jones et al.
() are universal models created from different protein families, more specific models are available, e.g., the mtREV24 model by Adachi and Hasegawa () for mitochondrial protein sequences, whereas the VT (Müller and Vingron, ) and the WAG models (Whelan and Goldman, ) are suited to analyze distantly related sequences. The BLOSUM62 matrix (Henikoff and Henikoff, ; UNIT) was designed for database searches and thus should be used with caution for the analysis of evolutionary relationships.
For DNA (Fig. ), the HKY (Hasegawa et al., ) and TN (Tamura and Nei, ) models are available. Those models can be restricted to simpler models like JC (Jukes and Cantor, ), K2P (Kimura, ), or F84 (Felsenstein, ) by setting substitution parameters accordingly (refer to the manual and UNITS & for further details). Additionally, the SH nucleotide doublet model (Schöniger and von Haeseler, ) and a binary model based on the model of Felsenstein () are implemented in TREE-PUZZLE.
Figure: Flowchart of further substitution model parameters in the TREE-PUZZLE menu.
Choose gamma-distributed rate heterogeneity model by typing “w” (Fig. ).
Figure: Flowchart of rate heterogeneity options in the TREE-PUZZLE menu.
It is known that positions in an alignment do not evolve with the same evolutionary rates, typically attributed to selective pressure or other functional constraints acting on positions of the sequence. In such cases, the assumption of rate heterogeneity can improve the estimation of the branch lengths.
Three different models of rate heterogeneity are implemented in TREE-PUZZLE. Besides gamma-distributed rates, there is the two-rates model that assumes a fraction of the positions to be invariable and a mixed model that considers the variable sites to evolve according to a gamma distribution. The amount of rate heterogeneity of the gamma-distributed rates is described by the shape parameter α, where α < 1 describes strong heterogeneity, while large values describe homogeneity (for more details, refer to Gu et al., ; Page and Holmes, ; UNITS & ).
If tree reconstructions with and without the assumption of rate heterogeneity construct different trees, those trees can be compared as described in to find out whether the resulting tree topologies are significantly different.
Set the list puzzling step trees option to unique topologies with the “j” key, to make TREE-PUZZLE write all (unique) intermediate tree topologies to file (EF.phy.ptorder or outptorder). When doing one's own analysis, it might be necessary to change other parameters.
Many other parameters and options can be set manually. For instance, it is possible to specify the amino acid or nucleotide composition. Figures to summarize all options currently available in TREE-PUZZLE. More details are given in the manual.
Figure: Flowchart of parameter estimation options in the TREE-PUZZLE menu.
Start analysis by typing “y”.
TREE-PUZZLE will now perform a tree reconstruction. During its run, it will indicate which steps are performed: first the missing parameters are estimated, then all possible quartet maximum-likelihood trees are computed, which are subsequently used to compute intermediate quartet puzzling trees. Finally, the likelihood and the branch lengths of the consensus tree are computed (Fig. ).
Figure: TREE-PUZZLE menu setting and screen output from tree reconstruction.
Examine the results
Examine the puzzle report file. The report file is called EF.phy.puzzle if the name of the alignment file was entered on the command line when the program was executed (e.g., puzzle EF.phy). Otherwise, the report is called outfile.
The puzzle report file presents the quality of the data as well as the reconstructed tree.
Hence, it should be thoroughly examined (see below).
Examine the reconstructed tree by viewing the tree file EF.phy.tree (or outtree, Fig. ) using a tree drawing program like TreeView or TreeTool (see UNIT and Internet Resources below).
Figure: Phylogenetic tree reconstructed from the EF.phy dataset as described in . The tree is rooted by the duplication event between EF-2/G and EF-1α/Tu.
If a program cannot read such trees, it may be necessary to remove the leading comment (bordered by square brackets).
Basic Protocol 1 explains how to reconstruct trees based on the maximum-likelihood principle and quartet puzzling.
Analyze the Content of Phylogenetic Information and the Quartet Support for the Relationship of Groups of Sequences
Likelihood mapping provides the opportunity to either check the content of phylogenetic information in an alignment or estimate the quartet support of relationships among groups of sequences. The former visualizes whether the data are suitable for phylogenetic analysis by measuring the resolution of the quartet topologies, i.e., trees of four sequences. This check should be run especially for large datasets to avoid spending days or maybe even weeks on phylogenetic analysis with data that have little phylogenetic information. For the latter method, one partitions a dataset into sets of two to four clusters. Likelihood mapping visualizes which of the possible relationships between these clusters is most supported by the reconstructed quartet tree topologies (Fig. ). This method is also useful for reducing the runtime if the goal is to examine one special bipartition of a tree in a large dataset. The EF data (Table ) will serve as an example. First, the suitability of the alignment for phylogenetic analysis is measured (step ). Second, the relationship of four subsets of the dataset (step ) is studied in more detail.
Necessary Resources
Hardware: TREE-PUZZLE runs on Windows and Macintosh computers as well as Unix/Linux systems, including workstation clusters and parallel computers.
Software: TREE-PUZZLE package (see Support Protocols 1 to 3 for information on how to obtain TREE-PUZZLE).
Files: Multiple sequence alignment file in standard PHYLIP format. The sample data set used here (EF.phy) is included with the TREE-PUZZLE software and on the Current Protocols Web site (http://www3.interscience.wiley.com/c_p/cpbi_sampledatafiles.htm).
Obtain and install TREE-PUZZLE (see Support Protocols 1 to 3).
Change to the data directory in the TREE-PUZZLE directory and start puzzle with the command puzzle EF.phy.
Start puzzle in a terminal, e.g., MS DOS prompt (Windows) or xterm (Unix/Linux; APPENDIX & APPENDIX), using the command puzzle alignmentfile, where alignmentfile is the name of the file containing the alignment to be analyzed. If puzzle is invoked from a file manager or without a filename, it will search for a file called infile in the current directory. If infile does not exist, TREE-PUZZLE will ask for a filename. The alignmentfile has to be in the current directory, or the full path to its location must be given.
Change the type of analysis to Likelihood mapping (using the “b” key).
4a. Leave the sequences ungrouped for a general likelihood mapping analysis to test the dataset.
4b. Group the sequences into four clusters (using “g”). Assign crenarchaeotic EF-2 to cluster a, bacterial EF-G to b, eucaryotic EF-2 to c, and all EF-1α/Tu sequences to cluster d (Table ).
To analyze the phylogenetic content among clusters, define two to four disjoint sets of sequences from the alignment by assigning each sequence the name of the cluster a, b, c, or d (in the case of fewer than four clusters, c and/or d are not valid). Assigning x will exclude a sequence from the analysis.
Each sequence must be labeled a, b, (c, d), or x.
A two-cluster analysis will check for the quartet support for bipartition into the two clusters, whereas a four-cluster analysis will infer the quartet support for any of the three possible relationships of the four clusters, namely (ab|cd), (ac|bd), or (ad|bc), where “|” denotes the inner branch that separates the groups (Fig. ).
Choose a model of evolution (for more information, see , steps to )
Change the type of sequence data (using “d”) if the automatically assigned type is wrong.
TREE-PUZZLE should have set the data type correctly to amino acids for the example.
Choose an appropriate model of evolution to analyze a dataset. For the example alignment, choose the VT model by entering “m” five times.
Choose rate heterogeneity model by typing “w”.
Change other parameters, if necessary. For the example, leave the parameters unchanged.
The number of quartets used in the analysis can be set by the n option. If the number of existing quartets is larger than the specified number, a random subset of all possible quartets is chosen by default, but the size of the sample is also adjustable.
Start analysis by typing “y”.
TREE-PUZZLE will now perform a likelihood-mapping analysis. During the run, it will indicate which steps are performed: first the missing parameters are estimated, then the likelihood-mapping analysis is performed evaluating quartet maximum-likelihood trees. For large datasets, a random subset of quartets is analyzed (Fig. ).
Figure: TREE-PUZZLE menu setting and screen output from likelihood-mapping analysis.
Examine the results
Examine the puzzle report file. The report file is called EF.phy.puzzle if starting with the alignment file from the command line (e.g., puzzle EF.phy), or outfile if entering the alignment file manually.
The puzzle report file presents the quality of the data as well as the results of the likelihood mapping. Hence, it should be thoroughly examined.
Examine the likelihood-mapping diagram (Figs. , , and ) EF.phy.eps (or outlm.eps) using a PostScript browser like ghostscript/ghostview (see Internet Resources).
Figure: How likelihood weights are plotted in a likelihood-mapping diagram. Left side: likelihood weight plotted in a three-dimensional coordinate system. Right side: the simplex and its areas and the corresponding quartet topologies. The gray triangles are identical, only viewed from different angles.
Figure: Likelihood-mapping diagram visualizing the phylogenetic content of the EF.phy dataset performed as described in .
Figure: Likelihood-mapping diagram visualizing the support for a Crenarchaeota-Eucaryota sister group in the EF-2/G genes of the EF.phy dataset as described in .
Basic Protocol 2 discusses likelihood mapping, a method to visualize phylogenetic content in a multiple sequence alignment.
Compare Tree Topologies
A third type of analysis implemented in TREE-PUZZLE is the likelihood-based comparison of two or more tree topologies using the tests suggested by Kishino and Hasegawa (), Shimodaira and Hasegawa (), and the so-called expected likelihood weights (Strimmer and Rambaut, ). These tests compare different trees to obtain, in effect, a confidence set of trees. The example used here is a dataset together with a set of trees with different branching patterns, comprising the tree reconstructed in and two trees with the different possible relationships of Crenarchaeota, Bacteria, and Eucaryota (Fig. ).
Figure: The three tree topologies used in the usertree comparison. (A) Tree 1: Eucaryota-Crenarchaeota sister groups, (B) Tree 2: Bacteria-Crenarchaeota sister groups, (C) Tree 3: Eucaryota-Bacteria sister groups.
The tree topologies are used without branch lengths.
Necessary Resources
Hardware: TREE-PUZZLE runs on Windows and Macintosh computers as well as Unix/Linux systems, including workstation clusters and parallel computers.
Software: TREE-PUZZLE package (see Support Protocols 1 to 3 for information on how to obtain TREE-PUZZLE).
Files: Multiple sequence alignment file in standard PHYLIP format. A tree file containing the usertrees in PHYLIP tree format as produced by many programs like PHYLIP, TREE-PUZZLE, etc. (trees can span several lines and contain comments; for more information see UNIT); see file EF.3trees on the Current Protocols Web site at the URL below. The sample data set used below is included with the TREE-PUZZLE software and on the Current Protocols Web site (http://www3.interscience.wiley.com/c_p/cpbi_sampledatafiles.htm).
Obtain and install TREE-PUZZLE (see Support Protocols 1 to 3).
Change to the data directory in the TREE-PUZZLE directory and start puzzle with the command puzzle EF.phy EF.3trees.
Start puzzle in a terminal, e.g., MS DOS prompt (Windows) or xterm (Unix/Linux; APPENDIX & APPENDIX), using the command puzzle alignmentfile usertreefile, where alignmentfile is the name of the file containing the alignment to be analyzed and usertreefile is the name of the file that contains the tree topologies for comparison. If puzzle is invoked from a file manager or without filenames, it will search for the files infile and intree in the current directory. If infile and/or intree does not exist, TREE-PUZZLE will ask for a filename. The alignmentfile and usertreefile have to be in the current directory, or the full paths to their respective locations must be given.
Change the type of analysis to tree reconstruction (using the “b” key) and the tree search procedure to user defined trees (using the “k” key), if necessary.
Adjust the outgroup if necessary (using “o”). By default, the first sequence is used to root the resulting tree for output.
Choose a model of evolution (for more information, see , steps to )
Change the type of sequence data (using “d”) if the automatically assigned type is wrong. TREE-PUZZLE should have set the data type correctly to amino acids for the example.
Choose an appropriate model of evolution to analyze the dataset. For this example alignment, choose the VT model by entering “m” five times.
Choose rate heterogeneity model by typing “w”.
Choose neighbor-joining (NJ) tree as the means for the parameter estimation with the “x” key. Change other parameters, if necessary.
For tree evaluation, TREE-PUZZLE uses the first usertree for the parameter estimation by default. This makes sense for the evaluation of single trees, but to test a set of trees like in this example, an NJ tree should be used to estimate the parameters.
Start analysis by typing “y”.
TREE-PUZZLE will now evaluate and compare the tree topologies in the usertreefile (EF.3trees). During its run, it will indicate which steps are performed: first, the missing parameters are estimated, then all trees in the usertreefile (EF.3trees) are evaluated and the results are written to the puzzle report file (Fig. ).
Figure: TREE-PUZZLE menu setting and screen output from usertree evaluation.
Examine the results
Examine the puzzle report file. The report file is called EF.3trees.puzzle, if starting with the alignment file from the command line, or outfile, if entering the alignment file manually.
The puzzle report file presents the quality of the data as well as the results of the usertree evaluation (Fig. ). Hence, it should be thoroughly examined. The file EF.3trees.tree (or outtree) contains each tree from the usertreefile in NEWICK tree format with estimated branch lengths. The trees can be viewed with tree drawing programs like TreeView or TreeTool (see UNIT and Internet Resources).
If a program cannot read such trees, it might be necessary to remove the leading comment (bordered by square brackets).Results of the comparison of four trees from the EF.phydataset as described in .explains how to compare tree topologies using different tests. The main use of TREE-PUZZLE is to reconstruct phylogenetic trees from sequences. The example shows how to use TREE-PUZZLE to construct a tree from amino acid sequences assuming G-distributed rates across sites (UNIT). Necessary Resources Hardware TREE-PUZZLE runs on Windows, Macintosh computers, and Unix/Linux systems including workstation clusters and parallel computers using parallel computing TREE-PUZZLE runs on Windows, Macintosh computers, and Unix/Linux systems including workstation clusters and parallel computers using parallel computing Software TREE-PUZZLE package (see Support Protocols 1to 3for information on how to obtain TREE-PUZZLE) TREE-PUZZLE package (see Support Protocols 1to 3for information on how to obtain TREE-PUZZLE) Files Multiple Sequence Alignment file in standard PHYLIP format. The sample data set used (EF.phy) here is included with the TREE-PUZZLE software and on the Current Protocols Web site (http://www3.interscience.wiley.com/c_p/cpbi_sampledatafiles.htm). Multiple Sequence Alignment file in standard PHYLIP format. The sample data set used (EF.phy) here is included with the TREE-PUZZLE software and on the Current Protocols Web site (http://www3.interscience.wiley.com/c_p/cpbi_sampledatafiles.htm). Obtain and install TREE-PUZZLE (see Support Protocols 1to 3). Change to the datadirectory in the TREE-PUZZLE directory and start the program with the command puzzle EF.phy. Start puzzlein a terminal, e.g., MS DOS prompt (Windows) or xterm (Unix/Linux; APPENDIX; & APPENDIX), using the command puzzle alignmentfile, where alignmentfileis the name of the file containing the alignment to be analyzed, the example here is EF.phy. 
If puzzleis invoked from a filemanager or without a filename, it will search for a file called infilein the current directory. If infiledoes not exist, TREE-PUZZLE will ask for a filename. The alignmentfilehas to be in the current directory or the full path to its location must be given. Change the type of analysis to tree reconstruction(using the “b” key) and the tree search procedure to quartet puzzling(using the “k” key), if necessary (Fig. ). Flowchart of analysis type options in the TREE-PUZZLE menu. Options in TREE-PUZZLE are controlled by single letters. The flow chart shows the options that correspond to each letter. For example, entering the letter “b” toggles the analysis between tree reconstruction and likelihood mapping. Similarly, to choose among quartet puzzling, user defined trees, or pairwise distance matrices, enter the letter “k” until the desired option is shown on the screen. Adjust the outgroup to the sequence 22 EFG_MYCGE(using “o” and the number of the sequence). By default, the first sequence is used to root the resulting tree for output. However, the root has no impact on the log-likelihood. Note that the natural root lies between EF-a/Tu and EF-2/G (Iwabe et al., ). Hence the output tree has to be rerooted using a phylogeny viewer. For further discussion of selecting a tree root, see UNIT. Choose parameter estimation to be performed approximately (with “e”) using neighbor-joining trees (with “x”). Parameters are estimated using tree topologies. These are either inferred by neighbor-joining or given as usertree (usertree evaluation; see ). With the quartet samples + NJoption the evolutionary parameters are estimated on random quartet samples, neighbor-joining trees are only used for rate parameters. Approximateestimation uses pairwise distances to fit the branch lengths of the tree topologies, while ML branch lengths are inferred in the exactestimation. 
Change the type of sequence data to amino acids (using “d”) if the automatically assigned type is not correct (Fig. ). Flowchart of substitution model options in the TREE-PUZZLE menu. Using the character composition of the alignment, TREE-PUZZLE tries to determine whether the data are nucleotide, protein, or binary data. Choose an appropriate model of sequence evolution to analyze the dataset. For the example alignment, choose the VT model by entering “m” five times (Fig. ). Several models for protein evolution are implemented in TREE-PUZZLE. While the models by Dayhoff et al. () and Jones et al. () are universal models created from different protein families, more specific models are available, e.g., the mtREV24 model by Adachi and Hasegawa () for mitochondrial protein sequences, whereas the VT (Müller and Vingron, ) and the WAG models (Whelan and Goldman, ) are suited to analyze distantly related sequences. The BLOSUM62 matrix (Henikoff and Henikoff, ; UNIT) was designed for database searches and thus should be used with caution for the analysis of evolutionary relationships. For DNA (Fig. ), the HKY (Hasegawa et al., ) and TN (Tamura and Nei, ) models are available. Those models can be restricted to simpler models like JC (Jukes and Cantor, ), K2P (Kimura, ), or F84 (Felsenstein, ) by setting substitution parameters accordingly (refer to the manual and UNITS & for further details). Additionally, the SH nucleotide doublet model (Schöniger and von Haeseler, ) and a binary model based on the model of Felsenstein () are implemented in TREE-PUZZLE. Flowchart of further substitution model parameters in the TREE-PUZZLE menu. Choose the gamma-distributed rate heterogeneity model by typing “w” (Fig. ). Flowchart of rate heterogeneity options in the TREE-PUZZLE menu. Positions in an alignment are known not to evolve at the same evolutionary rate, which is typically attributed to selective pressure or other functional constraints acting on positions of the sequence.
In such cases, the assumption of rate heterogeneity can improve the estimation of the branch lengths. Three different models of rate heterogeneity are implemented in TREE-PUZZLE. Besides gamma-distributed rates, there is the two-rates model that assumes a fraction of the positions to be invariable and a mixed model that considers the variable sites to evolve according to a gamma distribution. The amount of rate heterogeneity of the gamma-distributed rates is described by the shape parameter α, where α < 1 describes strong heterogeneity, while large values describe homogeneity (for more details, refer to Gu et al., ; Page and Holmes, ; UNITS & ). If tree reconstructions with and without the assumption of rate heterogeneity construct different trees, those trees can be compared as described in to find out whether the resulting tree topologies are significantly different. Set the list puzzling step trees option to unique topologies with the “j” key, to make TREE-PUZZLE write all (unique) intermediate tree topologies to file (EF.phy.ptorder or outptorder). When doing one's own analysis, it might be necessary to change other parameters. Many other parameters and options can be set manually. For instance, it is possible to specify the amino acid or nucleotide composition. Figures to summarize all options currently available in TREE-PUZZLE. More details are given in the manual. Flowchart of parameter estimation options in the TREE-PUZZLE menu. Start the analysis by typing “y”. TREE-PUZZLE will now perform a tree reconstruction. During its run, it will indicate which steps are performed: first the missing parameters are estimated, then all possible quartet maximum-likelihood trees are computed, which are subsequently used to compute intermediate quartet puzzling trees. Finally, the likelihood and the branch lengths of the consensus tree are computed (Fig. ). TREE-PUZZLE menu setting and screen output from tree reconstruction. Examine the puzzle report file.
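The effect of the gamma shape parameter on rate heterogeneity can be illustrated with a small simulation. The following is a minimal sketch, not TREE-PUZZLE's own code: it assumes site rates are drawn from a gamma distribution with mean 1 (shape α, scale 1/α), so that the variance of the rates is 1/α. The function names and the chosen α values are hypothetical and serve only as an illustration.

```python
import random
import statistics


def sample_site_rates(alpha, n_sites, seed=0):
    """Draw relative site rates from a gamma distribution with mean 1.

    Assumption for this sketch: shape = alpha and scale = 1 / alpha,
    so the expected rate is 1 and the variance is 1 / alpha.
    """
    rng = random.Random(seed)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]


def rate_variance(alpha):
    """Theoretical variance of mean-1 gamma-distributed rates."""
    return 1.0 / alpha


# Small alpha: strong heterogeneity; large alpha: nearly equal rates.
for alpha in (0.5, 1.0, 10.0):
    rates = sample_site_rates(alpha, 20000)
    print(
        "alpha =", alpha,
        "| sample mean =", round(statistics.mean(rates), 2),
        "| sample variance =", round(statistics.pvariance(rates), 2),
        "| theoretical variance =", rate_variance(alpha),
    )
```

With a small α, most sites receive rates close to zero while a few evolve very fast; with a large α, the rates cluster around 1, which approaches the homogeneous case.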
The report file is called EF.phy.puzzle if the name of the alignment file was entered on the command line when the program was executed (e.g., puzzle EF.phy). Otherwise, the report is called outfile. The puzzle report file presents the quality of the data as well as the reconstructed tree. Hence, it should be thoroughly examined (see below). Examine the reconstructed tree by viewing the tree file EF.phy.tree (or outtree, Fig. ) using a tree drawing program like TreeView or TreeTool (see UNIT and Internet Resources below). Phylogenetic tree reconstructed from the EF.phy dataset as described in . The tree is rooted by the duplication event between EF-2/G and EF-1α/Tu. If a program cannot read such trees, it may be necessary to remove the leading comment (bordered by square brackets). Likelihood mapping provides the opportunity to either check the content of phylogenetic information in an alignment or estimate the quartet support of relationships among groups of sequences. The former visualizes whether the data are suitable for phylogenetic analysis by measuring the resolution of the quartet topologies, i.e., trees of four sequences. This check should be run especially for large datasets, to avoid spending days or even weeks on a phylogenetic analysis of data that contain little phylogenetic information. For the latter method, one partitions a dataset into two to four clusters. Likelihood mapping visualizes which of the possible relationships between these clusters is most supported by the reconstructed quartet tree topologies (Fig. ). This method is also useful for reducing the runtime if the goal is to examine one particular bipartition of a tree in a large dataset. The EF data (Table ) will serve as an example. First, the suitability of the alignment for phylogenetic analysis is measured (step ). Second, the relationship of four subsets of the dataset (step ) is studied in more detail.
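To see why quartet-based analyses become expensive for large datasets, it helps to count the quartets. The sketch below is an illustration only: the number of distinct quartets that can be drawn from n sequences is the binomial coefficient C(n, 4). The function names and the sequence labels are hypothetical.

```python
import math
from itertools import combinations


def quartet_count(n_sequences):
    """Number of distinct four-sequence subsets (quartets) of n sequences."""
    return math.comb(n_sequences, 4)


def enumerate_quartets(names):
    """Yield every quartet of sequence names as a 4-tuple."""
    return combinations(names, 4)


# The count grows roughly with the fourth power of the number of sequences.
for n in (10, 50, 100, 500):
    print(n, "sequences ->", quartet_count(n), "quartets")

# Hypothetical labels, used only to show the enumeration.
for quartet in enumerate_quartets(["A", "B", "C", "D", "E"]):
    print(quartet)
```

Because each quartet has three possible unrooted topologies whose likelihoods must be evaluated, analyzing a random subset of quartets is a practical shortcut for large alignments.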
Necessary Resources. Hardware: TREE-PUZZLE runs on Windows and Macintosh computers as well as Unix/Linux systems, including workstation clusters and parallel computers using parallel computing. Software: TREE-PUZZLE package (see Support Protocols 1 to 3 for information on how to obtain TREE-PUZZLE). Files: Multiple sequence alignment file in standard PHYLIP format. The sample data set used here (EF.phy) is included with the TREE-PUZZLE software and on the Current Protocols Web site (http://www3.interscience.wiley.com/c_p/cpbi_sampledatafiles.htm). Obtain and install TREE-PUZZLE (see Support Protocols 1 to 3). Change to the data directory in the TREE-PUZZLE directory and start puzzle with the command puzzle EF.phy. Start puzzle in a terminal, e.g., MS DOS prompt (Windows) or xterm (Unix/Linux; APPENDIX & APPENDIX), using the command puzzle alignmentfile, where alignmentfile is the name of the file containing the alignment to be analyzed. If puzzle is invoked from a file manager or without a filename, it will search for a file called infile in the current directory. If infile does not exist, TREE-PUZZLE will ask for a filename. The alignmentfile has to be in the current directory, or the full path to its location must be given. Change the type of analysis to likelihood mapping (using the “b” key). Leave the sequences ungrouped for a general likelihood mapping analysis to test the dataset. Group the sequences into four clusters (using “g”).
Assign crenarchaeotic EF-2 to cluster a, bacterial EF-G to b, eucaryotic EF-2 to c, and all EF-1α/Tu sequences to cluster d (Table ). To analyze the phylogenetic content among clusters, define two to four disjoint sets of sequences from the alignment by assigning each sequence the name of the cluster a, b, c, or d (with fewer than four clusters, c and/or d are not valid). Assigning x will exclude a sequence from the analysis. Each sequence must be labeled a, b, (c, d), or x. A two-cluster analysis will check the quartet support for the bipartition into the two clusters, whereas a four-cluster analysis will infer the quartet support for each of the three possible relationships of the four clusters, namely (ab|cd), (ac|bd), or (ad|bc), where “|” denotes the inner branch that separates the groups (Fig. ). Change the type of sequence data (using “d”) if the automatically assigned type is wrong. TREE-PUZZLE should have set the data type correctly to amino acids for the example. Choose an appropriate model of evolution to analyze a dataset. For the example alignment, choose the VT model by entering “m” five times. Choose the rate heterogeneity model by typing “w”. Change other parameters, if necessary. For the example, leave the parameters unchanged. The number of quartets used in the analysis can be set with the n option. If the number of existing quartets is larger than the specified number, a random subset of all possible quartets is chosen by default, but the size of the sample is also adjustable. Start the analysis by typing “y”. TREE-PUZZLE will now perform a likelihood-mapping analysis. During the run, it will indicate which steps are performed: first the missing parameters are estimated, then the likelihood-mapping analysis is performed by evaluating quartet maximum-likelihood trees. For large datasets, a random subset of quartets is analyzed (Fig. ). TREE-PUZZLE menu setting and screen output from likelihood-mapping analysis. Examine the puzzle report file.
The report file is called EF.phy.puzzle if the alignment file was given on the command line (e.g., puzzle EF.phy), or outfile if the alignment file was entered manually. The puzzle report file presents the quality of the data as well as the results of the likelihood mapping. Hence, it should be thoroughly examined. Examine the likelihood-mapping diagram (Figs. , , and ) EF.phy.eps (or outlm.eps) using a PostScript browser like ghostscript/ghostview (see Internet Resources). How likelihood weights are plotted in a likelihood-mapping diagram. Left side: likelihood weight plotted in a three-dimensional coordinate system. Right side: the simplex and its areas and the corresponding quartet topologies. The gray triangles are identical, only viewed from different angles. Likelihood-mapping diagram visualizing the phylogenetic content of the EF.phy dataset performed as described in . Likelihood-mapping diagram visualizing the support for a Crenarchaeota-Eucaryota sister group in the EF-2/G genes of the EF.phy dataset as described in . A third type of analysis implemented in TREE-PUZZLE is the likelihood-based comparison of two or more tree topologies using the tests suggested by Kishino and Hasegawa (), Shimodaira and Hasegawa (), and the so-called expected likelihood weights (Strimmer and Rambaut, ). These tests compare different trees to obtain an approximate confidence set of trees. The example used here is a dataset together with a set of trees with different branching patterns, comprising the tree reconstructed in and two trees with the different possible relationships of Crenarchaeota, Bacteria, and Eucaryota (Fig. ). The three tree topologies used in the usertree comparison. (A) Tree 1: Eucaryota-Crenarchaeota sister groups, (B) Tree 2: Bacteria-Crenarchaeota sister groups, (C) Tree 3: Eucaryota-Bacteria sister groups. The tree topologies are used without branch lengths.
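The likelihood-mapping diagram described above places each quartet in a triangle (simplex) according to the likelihood weights of its three possible topologies. The following is a minimal sketch of that idea, not TREE-PUZZLE's implementation: it assumes each weight is the likelihood of one topology divided by the sum of all three likelihoods, and it uses an arbitrary placement of the triangle corners. The function names and the log-likelihood values are hypothetical.

```python
import math

# The three possible relationships of four clusters a, b, c, d.
TOPOLOGIES = ("(ab|cd)", "(ac|bd)", "(ad|bc)")

# Corners of an equilateral triangle, one per topology (arbitrary layout).
CORNERS = ((0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2))


def likelihood_weights(log_likelihoods):
    """Convert three log-likelihoods into weights that sum to 1.

    Subtracting the maximum before exponentiating avoids numerical
    underflow for very negative log-likelihoods.
    """
    best = max(log_likelihoods)
    scaled = [math.exp(value - best) for value in log_likelihoods]
    total = sum(scaled)
    return [value / total for value in scaled]


def simplex_point(weights):
    """Map three weights to a 2D point inside the triangle."""
    x = sum(w * corner[0] for w, corner in zip(weights, CORNERS))
    y = sum(w * corner[1] for w, corner in zip(weights, CORNERS))
    return x, y


def best_topology(weights):
    """Return the topology with the largest likelihood weight."""
    return TOPOLOGIES[weights.index(max(weights))]


# Hypothetical log-likelihoods for one quartet, for illustration only.
weights = likelihood_weights([-1000.0, -1002.0, -1005.0])
print(weights, simplex_point(weights), best_topology(weights))
```

A quartet whose point lies near a corner is well resolved in favor of one topology, while a point near the center indicates that the three topologies are about equally likely and the quartet carries little phylogenetic signal.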
Necessary Resources. Hardware: TREE-PUZZLE runs on Windows and Macintosh computers as well as Unix/Linux systems, including workstation clusters and parallel computers using parallel computing. Software: TREE-PUZZLE package (see Support Protocols 1 to 3 for information on how to obtain TREE-PUZZLE). Files: Multiple sequence alignment file in standard PHYLIP format. A tree file containing the usertrees in PHYLIP tree format as produced by many programs like PHYLIP, TREE-PUZZLE, etc. (trees can span several lines and contain comments; for more information see UNIT); see file EF.3trees on the Current Protocols Web site at the URL below. The sample data set used below is included with the TREE-PUZZLE software and on the Current Protocols Web site (http://www3.interscience.wiley.com/c_p/cpbi_sampledatafiles.htm). Obtain and install TREE-PUZZLE (see Support Protocols 1 to 3). Change to the data directory in the TREE-PUZZLE directory and start puzzle with the command puzzle EF.phy EF.3trees.
Start puzzle in a terminal, e.g., MS DOS prompt (Windows) or xterm (Unix/Linux; APPENDIX & APPENDIX), using the command puzzle alignmentfile usertreefile, where alignmentfile is the name of the file containing the alignment to be analyzed and usertreefile is the name of the file that contains the tree topologies for comparison. If puzzle is invoked from a file manager or without filenames, it will search for the files infile and intree in the current directory. If infile and/or intree does not exist, TREE-PUZZLE will ask for a filename. The alignmentfile and usertreefile have to be in the current directory, or the full paths to their respective locations must be given. Change the type of analysis to tree reconstruction (using the “b” key) and the tree search procedure to user defined trees (using the “k” key), if necessary. Adjust the outgroup if necessary (using “o”). By default, the first sequence is used to root the resulting tree for output. Change the type of sequence data (using “d”) if the automatically assigned type is wrong. TREE-PUZZLE should have set the data type correctly to amino acids for the example. Choose an appropriate model of evolution to analyze the dataset. For this example alignment, choose the VT model by entering “m” five times. Choose the rate heterogeneity model by typing “w”. Choose the neighbor-joining (NJ) tree as the means for the parameter estimation with the “x” key. Change other parameters, if necessary. For tree evaluation, TREE-PUZZLE uses the first usertree for the parameter estimation by default. This makes sense for the evaluation of single trees, but to test a set of trees as in this example, an NJ tree should be used to estimate the parameters. Start the analysis by typing “y”. TREE-PUZZLE will now evaluate and compare the tree topologies in the usertreefile (EF.3trees).
During its run, it will indicate which steps are performed: first, the missing parameters are estimated, then all trees in the usertreefile (EF.3trees) are evaluated and the results are written to the puzzle report file (Fig. ). TREE-PUZZLE menu setting and screen output from usertree evaluation. Examine the puzzle report file. The report file is called EF.3trees.puzzle if the alignment file was given on the command line, or outfile if the alignment file was entered manually. The puzzle report file presents the quality of the data as well as the results of the usertree evaluation (Fig. ). Hence, it should be thoroughly examined. The file EF.3trees.tree (or outtree) contains each tree from the usertreefile in NEWICK tree format with estimated branch lengths. The trees can be viewed with tree drawing programs like TreeView or TreeTool (see UNIT and Internet Resources). If a program cannot read such trees, it might be necessary to remove the leading comment (bordered by square brackets). Results of the comparison of four trees from the EF.phy dataset as described in .
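Removing the leading comment from a tree file, as suggested above, can be done by hand or with a few lines of code. The following is a minimal sketch under stated assumptions: it assumes the comment is a single bracketed block at the very start of the tree text and contains no nested brackets. The function name and the example tree string, including the comment content, are hypothetical.

```python
import re


def strip_leading_comment(newick_text):
    """Remove one leading [bracketed] comment from a NEWICK tree string.

    Only a comment at the very beginning (after optional whitespace) is
    removed; brackets elsewhere in the tree are left untouched.
    """
    return re.sub(r"^\s*\[[^\]]*\]\s*", "", newick_text, count=1)


# Hypothetical tree text, for illustration only.
example = "[ example comment ] (A:0.1,B:0.2,(C:0.3,D:0.4):0.5);"
print(strip_leading_comment(example))
```

To process a file such as outtree, read its contents, apply the function to each tree, and write the result to a new file so that the original output is preserved.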
- Published
- 2003
- Full Text
- View/download PDF