Benedict Paten, Mirana Ramialison, Jerico Revote, Louis T. Dang, Julian Stolper, Vincent Tano, Man Ho H. Chiu, Marie A. Bogoyevitch, Markus Tondl, Mark J. Drvodelic, Fernando J. Rossello, Hieu T. Nim, Michael P. Eichenlaub, Jeannette C. Hallab, David A. Jans, Florence Besse, James E. Hudson, Alex Tokolyi, Greg Quaife-Ryan, Enzo R. Porrello, Helen E. Cumming
Australian Regenerative Medicine Institute, Monash University, Clayton, 3800, VIC, Australia, eResearch, Santa Cruz Genomics Institute, University of California [Santa Cruz] (UCSC), University of California-University of California, Department of Biochemistry and Molecular Biology, The University of Melbourne, Parkville, VIC, Australia, Institut de Biologie Valrose (IBV), Université Nice Sophia Antipolis (... - 2019) (UNS), COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université Côte d'Azur (UCA)-Centre National de la Recherche Scientifique (CNRS), School of Biomedical Sciences, The University of Queensland, Brisbane, QLD, 4072, Australia, Hudson Institute of Medical Research [Clayton], Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC, Australia, Faculty of Information Technology, Monash University [Clayton], Murdoch Children's Research Institute (MCRI), Department of Physiology, School of Biomedical Sciences, The University of Melbourne, Parkville, VIC, Australia
This work was supported by an Australian Research Council Discovery Project grant (DP1049980), a National Health and Medical Research Council/Heart Foundation Career Development Fellowship (1049980), Sun Foundation to MR and UROP scholarships to LTD, MHHC, MJD and AT. The Australian Regenerative Medicine Institute is supported by grants from the State Government of Victoria and the Australian Government.
This research was supported by use of the Nectar Research Cloud, a collaborative Australian research platform supported by the National Collaborative Research Infrastructure Strategy (NCRIS).
Dang, Louis T, Tondl, Markus, Chiu, Man Ho H, Revote, Jerico, Ramialison, Mirana, University of California [Santa Cruz] (UC Santa Cruz), University of California (UC)-University of California (UC), Université Nice Sophia Antipolis (1965 - 2019) (UNS), COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA), and Bodescot, Myriam
Background: A strong focus of the post-genomic era is mining the non-coding regulatory genome to unravel the function of the regulatory elements that coordinate gene expression (Nat 489:57–74, 2012; Nat 507:462–70, 2014; Nat 507:455–61, 2014; Nat 518:317–30, 2015). Whole-genome approaches based on next-generation sequencing (NGS) have provided insight into the genomic location of regulatory elements across different cell types, organs and organisms. These technologies are now widespread and commonly used in laboratories from various fields of research. This highlights the need for fast and user-friendly software tools dedicated to extracting the cis-regulatory information contained in these regulatory regions, for instance their transcription factor binding site (TFBS) composition. Ideally, such tools should not require prior programming knowledge, so that they are accessible to all users.

Results: We present TrawlerWeb, a web-based version of the Trawler_standalone tool (Nat Methods 4:563–5, 2007; Nat Protoc 5:323–34, 2010), which identifies enriched motifs in DNA sequences obtained from next-generation sequencing experiments in order to predict their TFBS composition. TrawlerWeb is designed for online queries and offers the standard options common to web-based motif discovery tools. In addition, TrawlerWeb provides three new features: 1) it accepts BED files generated directly from NGS experiments as input, 2) it automatically generates an input-matched, biologically relevant background, and 3) it displays a conservation score for each instance of a motif found in the input sequences, which assists the researcher in prioritising the motifs to validate experimentally. Finally, to date, this web-based version of Trawler_standalone remains the fastest online de novo motif discovery tool among the popular web-based software it was compared with, while generating predictions with high accuracy.
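To make the idea of "enriched motifs" relative to a background concrete, the following is a minimal sketch that scores k-mers by how much more often they occur in a set of input sequences than a background set would predict. It is a generic illustration and not TrawlerWeb's actual scoring method: the function names (`count_kmers`, `enrichment_z_scores`), the binomial z-score, the pseudocount and the toy sequences are all assumptions introduced here.

```python
import math
from collections import Counter


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of DNA strings."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # Skip windows containing N or any other non-ACGT symbol.
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def enrichment_z_scores(foreground, background, k=6, pseudocount=1.0):
    """Return a z-score per k-mer seen in the foreground sequences.

    Assumption (illustrative only): each k-mer occurrence in the
    foreground is treated as a binomial draw whose success probability
    is estimated from the background frequency of that k-mer.
    """
    fg = count_kmers(foreground, k)
    bg = count_kmers(background, k)
    fg_total = sum(fg.values())
    bg_total = sum(bg.values())
    n_possible = 4 ** k  # number of distinct DNA k-mers

    scores = {}
    for kmer, observed in fg.items():
        # The pseudocount keeps the probability above zero for k-mers
        # that never occur in the background.
        p = (bg[kmer] + pseudocount) / (bg_total + pseudocount * n_possible)
        expected = fg_total * p
        sd = math.sqrt(fg_total * p * (1.0 - p))
        scores[kmer] = (observed - expected) / sd
    return scores


if __name__ == "__main__":
    # Toy data: the hexamer GATAAG is planted in every input sequence.
    input_seqs = ["TTGATAAGTT", "CCGATAAGCC", "AAGATAAGAA"]
    background_seqs = ["ACGTACGTACGTACGT", "TTTTCCCCAAAAGGGG"]
    ranked = sorted(
        enrichment_z_scores(input_seqs, background_seqs).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    for kmer, z in ranked[:3]:
        print(kmer, round(z, 2))
```

In this toy setting the planted hexamer ranks first because it is the only k-mer seen more than once in the input. The choice of background dominates such a score, which is why an input-matched background matters in practice.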
Conclusions: TrawlerWeb provides users with a fast and easy-to-use web interface for de novo motif discovery. This will assist in rapidly analysing the NGS datasets that are now routinely generated. TrawlerWeb is freely available at: http://trawler.erc.monash.edu.au.

Electronic supplementary material: The online version of this article (10.1186/s12864-018-4630-0) contains supplementary material, which is available to authorized users.