
HOCOMOCO in 2024: a rebuild of the curated collection of binding models for human and mouse transcription factors.

Authors :
Vorontsov IE
Eliseeva IA
Zinkevich A
Nikonov M
Abramov S
Boytsov A
Kamenets V
Kasianova A
Kolmykov S
Yevshin IS
Favorov A
Medvedeva YA
Jolma A
Kolpakov F
Makeev VJ
Kulakovskiy IV
Source :
Nucleic acids research [Nucleic Acids Res] 2024 Jan 05; Vol. 52 (D1), pp. D154-D163.
Publication Year :
2024

Abstract

We present a major update of the HOCOMOCO collection that provides DNA binding specificity patterns of 949 human transcription factors and 720 mouse orthologs. To make this release, we performed motif discovery in peak sets that originated from 14 183 ChIP-Seq experiments and reads from 2554 HT-SELEX experiments yielding more than 400 thousand candidate motifs. The candidate motifs were annotated according to their similarity to known motifs and the hierarchy of DNA-binding domains of the respective transcription factors. Next, the motifs underwent human expert curation to stratify distinct motif subtypes and remove non-informative patterns and common artifacts. Finally, the curated subset of 100 thousand motifs was supplied to the automated benchmarking to select the best-performing motifs for each transcription factor. The resulting HOCOMOCO v12 core collection contains 1443 verified position weight matrices, including distinct subtypes of DNA binding motifs for particular transcription factors. In addition to the core collection, HOCOMOCO v12 provides motif sets optimized for the recognition of binding sites in vivo and in vitro, and for annotation of regulatory sequence variants. HOCOMOCO is available at https://hocomoco12.autosome.org and https://hocomoco.autosome.org.
(© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.)
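
For readers unfamiliar with position weight matrices, the following minimal Python sketch shows how such a matrix can score candidate binding sites. It is an illustration only, not the HOCOMOCO pipeline: the count matrix `EXAMPLE_COUNTS`, the helper names, the pseudocount of 1, and the uniform background of 0.25 are all assumptions made for this example, and only the forward strand is scanned.

```python
import math

ALPHABET = "ACGT"

# Hypothetical nucleotide counts per motif position (columns: A, C, G, T).
# Invented for illustration; not taken from any HOCOMOCO model.
EXAMPLE_COUNTS = [
    [8, 0, 1, 1],
    [0, 9, 0, 1],
    [1, 0, 9, 0],
    [0, 1, 0, 9],
]


def counts_to_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert counts to log2-odds weights against a uniform background."""
    pwm = []
    for row in counts:
        total = sum(row) + pseudocount * len(ALPHABET)
        pwm.append({
            base: math.log2(((count + pseudocount) / total) / background)
            for base, count in zip(ALPHABET, row)
        })
    return pwm


def score_site(pwm, site):
    """Sum the per-position weights for a site of the same width as the matrix."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal the matrix width")
    return sum(column[base] for column, base in zip(pwm, site.upper()))


def best_hit(pwm, sequence):
    """Return (offset, score) of the highest-scoring forward-strand window."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the matrix")
    hits = [
        (score_site(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    best_score, best_offset = max(hits)
    return best_offset, best_score


pwm = counts_to_pwm(EXAMPLE_COUNTS)
offset, score = best_hit(pwm, "TTACGTAA")
```

With this toy matrix, the best window in `TTACGTAA` is the `ACGT` at offset 2. A practical scanner would also score the reverse complement and apply a score threshold, both omitted here for brevity.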

Details

Language :
English
ISSN :
1362-4962
Volume :
52
Issue :
D1
Database :
MEDLINE
Journal :
Nucleic acids research
Publication Type :
Academic Journal
Accession number :
37971293
Full Text :
https://doi.org/10.1093/nar/gkad1077