CLOUD BASED DISTRIBUTED COMPUTING PLATFORM FOR MULTIMODAL ENERGY DATA STREAMS
- Publication Year :
- 2014
Abstract
- By VENKAT YASHWANTH GUNAPATI. Solar energy is a vital component of renewable resources and one of the most desirable energy sources in industrial and domestic ventures. Recent studies indicate a lack of sufficient qualification testing on PV modules. Developing a real-world test environment would help in better understanding degradation mechanisms. This can be achieved by analyzing solar irradiance, rain, weather, etc. from various PV power systems across the world. Energy Common Research Analytics and Data Lifecycle Environment (CRADLE), a DOE-funded Bay Area PV Consortium project, is a real-time, distributed data management system that executes daily data capture from worldwide PV systems and processes the raw data into a distributed data store, which provides data access through an ontology-driven web interface. It provides a robust and scalable system to tackle the big data challenges posed by data generated on a minute-by-minute basis by PV systems. This thesis describes the informatics component of Energy CRADLE, which is responsible for the acquisition, processing, and projection of around 120 GB of multimodal stream data produced by PV systems per year. It uses a real-time Hadoop MapReduce infrastructure to process the stream data into HBase, a distributed non-relational database. It integrates an ontology-guided query interface, which incorporates a transaction processing unit for the distributed data store. The whole system is built as a service on vSphere, VMware's private cloud architecture, which provides a scalable and secure platform for distributed computing. The Energy CRADLE interface allows researchers to query required data sets into various research platforms and helps in filtering and cross-comparing data across various data sources. It also provides a user interface to monitor day-to-day data processing and storage.
This system has been under implementation in the SDLE (Solar Durability and Lifetime Extension) lab since November 2013, and we have processed around 70 GB of data for selected data sources. Preliminary results show that data processing performance has improved significantly compared to processing with a traditional RDBMS, without sacrificing interface usability. Even with increasing data sources, Energy CRADLE's distributed architecture will maintain and process data with high performance.
- Subjects :
- Computer Science
Details
- Language :
- English
- Database :
- OpenDissertations
- Publication Type :
- Dissertation/Thesis
- Accession number :
- ddu.oai.etd.ohiolink.edu.case1399373847