1. CACSE: Context Aware Clustering of Stellar Evolution
- Author
- Xu Teng, Adam Corpstein, Konstantinos Kovlakas, Goce Trajcevski, Scott Coughlin, Jeff J. Andrews, Tassos Fragos, Prabin Giri, Y. Qin, Aaron Dotter, Joel Holm, E. Zapartas, Ethan Vander Wiel, Juan G. Serra-Perez, Devina Misra, Nam Tran, Jaime Roman-Garja, Becker Mathie, Simone S. Bavera, Philip Payne, and Willis Knox
- Subjects
Set (abstract data type), DBSCAN, Computer science, Middleware (distributed applications), Context (language use), Data mining, Interval (mathematics), User interface, Python (programming language), computer.software_genre, Cluster analysis, computer, computer.programming_language
- Abstract
We present CACSE, a system for Context Aware Clustering of Stellar Evolution, for datasets describing the temporal evolution of stars. These datasets are multivariate time series, usually with a large number of attributes (e.g., ≥ 40). Typically, they are obtained by simulation and are relatively large (5 to 10 GB per given interval of values for various initial conditions). Investigating common evolutionary trends in these datasets often depends on the context: not all attributes are always of interest, and among the context-relevant attributes, some may have more impact than others. To enable such context-aware clustering, CACSE lets domain experts dynamically select the attributes that matter and assign them the desired weights/priorities. The system consists of a PostgreSQL database, Python-based middleware built on RESTful services and the Django framework, and a web-based user interface as the frontend. The user interface provides multiple interactive options, including the selection of datasets and of preferred attributes along with their corresponding weights. Subsequently, users can select a time instant or a time range to visualize the formed clusters. Thus, CACSE enables the detection of changes in the set of clusters (i.e., convoys) of stellar evolution tracks. The current version provides two of the most popular clustering algorithms: k-means and DBSCAN.
- Published
- 2021
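
The idea of context-aware clustering described in the abstract can be illustrated with a minimal sketch. It is not CACSE's implementation, which this record does not describe. It assumes that a context is a mapping from selected attribute names to weights, that unselected attributes are simply ignored, and that the weights scale the squared differences in a Euclidean distance. The attribute names (`log_L`, `log_Teff`, `mass`), the sample values, and the helper functions are all hypothetical, and the k-means below seeds its centroids with the first k points only to keep the example deterministic.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance over the selected attributes only (assumed metric)."""
    return math.sqrt(sum(w * (a[name] - b[name]) ** 2 for name, w in weights.items()))

def kmeans(points, weights, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (illustrative choice)."""
    centroids = [dict(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid under the weighted metric.
        labels = [
            min(range(k), key=lambda c: weighted_distance(p, centroids[c], weights))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members on the selected attributes.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = {
                    name: sum(m[name] for m in members) / len(members)
                    for name in weights
                }
    return labels

# Hypothetical snapshot: four stellar tracks sampled at one time instant.
snapshot = [
    {"log_L": 0.0, "log_Teff": 3.76, "mass": 1.0},
    {"log_L": 0.1, "log_Teff": 3.77, "mass": 8.0},
    {"log_L": 3.0, "log_Teff": 4.30, "mass": 1.1},
    {"log_L": 3.1, "log_Teff": 4.31, "mass": 8.2},
]

# Two contexts over the same data: luminosity and temperature, or mass alone.
by_hr = kmeans(snapshot, {"log_L": 1.0, "log_Teff": 1.0}, k=2)
by_mass = kmeans(snapshot, {"mass": 1.0}, k=2)
print(by_hr)    # [0, 0, 1, 1]
print(by_mass)  # [0, 1, 0, 1]
```

The same four tracks group differently depending on which attributes are selected, which is the behavior the abstract refers to as context-aware. Repeating the clustering at successive time instants and comparing the label sets would be one way to observe changes in cluster membership over a time range.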