A Nonrelational Data Warehouse for the Analysis of Field and Laboratory Data From Multiple Heterogeneous Photovoltaic Test Sites
- Source :
- IEEE Journal of Photovoltaics; January 2017, Vol. 7 Issue: 1 p230-236, 7p
- Publication Year :
- 2017
Abstract
- A nonrelational, distributed computing, data warehouse, and analytics environment (Energy-CRADLE) was developed for the analysis of field and laboratory data from multiple heterogeneous photovoltaic (PV) test sites. This data informatics and analytics infrastructure was designed to process diverse formats of PV performance data and climatic telemetry time-series data collected from a PV outdoor test network, i.e., the Solar Durability and Lifetime Extension global SunFarm network, as well as point-in-time laboratory spectral and image measurements of PV material samples. Using Hadoop/HBase for the distributed data warehouse, Energy-CRADLE does not have a predefined data table schema, which enables ingestion of data in diverse and changing formats. For easy data ingestion and data retrieval, Energy-CRADLE utilizes Hadoop streaming to enable Python MapReduce and provides a graphical user interface, i.e., py-CRADLE. By developing the Hadoop distributed computing platform and the HBase NoSQL database schema for solar energy, Energy-CRADLE exemplifies an integrated, scalable, secure, and user-friendly data informatics and analytics system for PV researchers. An example of Energy-CRADLE enabled scalable, data-driven, analytics is presented, where machine learning is used for anomaly detection across 2.2 million real-world current-voltage (I-V) curves of PV modules in three distinct Köppen-Geiger climatic zones.
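The abstract mentions Hadoop streaming as the route to Python MapReduce. As a rough illustration of that general pattern only (this is not Energy-CRADLE code), the sketch below averages a reading per test site. The record layout `site_id,timestamp,power_w` and the function names `map_lines`, `reduce_lines`, and `main` are assumptions made for the example. Hadoop streaming passes records as text lines over stdin/stdout, with key and value separated by a tab and the reducer input sorted by key.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit 'site_id<TAB>power_w' for each well-formed input record."""
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != 3:
            continue  # skip malformed rows instead of failing the whole job
        site_id, _timestamp, power_w = parts  # hypothetical record layout
        try:
            value = float(power_w)
        except ValueError:
            continue  # skip header rows and non-numeric readings
        yield f"{site_id}\t{value}"


def reduce_lines(lines):
    """Reducer: average the values per key; input must be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for site_id, group in groupby(pairs, key=lambda kv: kv[0]):
        values = [float(v) for _, v in group]
        yield f"{site_id}\t{sum(values) / len(values)}"


def main(argv):
    # One script serves as mapper or reducer depending on its first argument.
    steps = {"map": map_lines, "reduce": reduce_lines}
    if len(argv) == 2 and argv[1] in steps:
        for out in steps[argv[1]](sys.stdin):
            print(out)


if __name__ == "__main__":
    main(sys.argv)
```

Locally, the shuffle step can be imitated by sorting the mapper output before handing it to the reducer, for example `list(reduce_lines(sorted(map_lines(rows))))`, which makes the logic easy to check before submitting a streaming job.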
Details
- Language :
- English
- ISSN :
- 2156-3381 and 2156-3403
- Volume :
- 7
- Issue :
- 1
- Database :
- Supplemental Index
- Journal :
- IEEE Journal of Photovoltaics
- Publication Type :
- Periodical
- Accession number :
- ejs40949138
- Full Text :
- https://doi.org/10.1109/JPHOTOV.2016.2626919