Journal article
Risk-aware intermediate dataset backup strategy in cloud-based data intensive workflows
Future Generation Computer Systems, Vol.55, pp.524-533
2016
Abstract
Data-intensive workflows are generally compute-intensive as well, and they generate large volumes of data during execution. Some of this data should therefore be saved to avoid expensive re-execution of tasks when exceptions occur. However, cloud-based data storage services come at a cost. In this paper, we introduce a risk evaluation model tailored to the workflow structure to measure and balance the overhead of backup storage against the cost of regenerating data after a failure, making service selection and execution more efficient and robust. The proposed method computes and compares the potential loss with and without data backup to achieve this trade-off between the overhead of intermediate dataset backup and task re-execution after exceptions. We also design a utility function based on the model and apply a genetic algorithm to find an optimized schedule. The results show that the robustness of the schedule is increased while the possible risk of failure is minimized, especially when the volume of generated data is not large in comparison with the input.
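The comparison described in the abstract (potential loss with and without a backup) can be illustrated with a minimal sketch. This is not the paper's risk model: the expected-loss formulas, the `Dataset` fields, and the parameters `storage_price_per_gb` and `restore_cost_per_gb` are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of one intermediate dataset."""
    name: str
    size_gb: float       # volume of the intermediate dataset
    regen_cost: float    # assumed cost of re-executing the producing tasks
    failure_prob: float  # assumed probability that the dataset is lost


def expected_loss_without_backup(d: Dataset) -> float:
    # Without a backup, a failure forces the producing tasks to run again.
    return d.failure_prob * d.regen_cost


def expected_loss_with_backup(d: Dataset,
                              storage_price_per_gb: float,
                              restore_cost_per_gb: float = 0.0) -> float:
    # With a backup, storage is always paid; restoring is paid only on failure.
    storage = d.size_gb * storage_price_per_gb
    restore = d.failure_prob * d.size_gb * restore_cost_per_gb
    return storage + restore


def should_back_up(d: Dataset,
                   storage_price_per_gb: float,
                   restore_cost_per_gb: float = 0.0) -> bool:
    # Back up only when doing so lowers the expected loss.
    with_backup = expected_loss_with_backup(
        d, storage_price_per_gb, restore_cost_per_gb)
    return with_backup < expected_loss_without_backup(d)


small = Dataset("small", size_gb=1.0, regen_cost=100.0, failure_prob=0.1)
large = Dataset("large", size_gb=1000.0, regen_cost=100.0, failure_prob=0.1)
print(should_back_up(small, storage_price_per_gb=0.5))
print(should_back_up(large, storage_price_per_gb=0.5))
```

Under these assumed numbers, the small dataset is worth backing up and the large one is not, which is in line with the abstract's remark that the approach pays off most when the generated data is not large. A scheduler such as the genetic algorithm mentioned above would evaluate a decision of this kind across the whole workflow rather than one dataset at a time.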
Details
- Title
- Risk-aware intermediate dataset backup strategy in cloud-based data intensive workflows
- Authors
- Mingzhong Wang (Author) - Beijing Institute of Technology, China; Liehuang Zhu (Author) - Beijing Institute of Technology, China; Zijian Zhang (Author) - Beijing Institute of Technology, China
- Publication details
- Future Generation Computer Systems, Vol.55, pp.524-533
- Publisher
- Elsevier BV, North-Holland
- Date published
- 2016
- DOI
- 10.1016/j.future.2014.08.009
- ISSN
- 0167-739X
- Organisation Unit
- University of the Sunshine Coast, Queensland; USC Business School - Legacy; School of Science, Technology and Engineering
- Language
- English
- Record Identifier
- 99449382902621
- Output Type
- Journal article
InCites Highlights
These are selected metrics from InCites Benchmarking & Analytics tool, related to this output
- Web Of Science research areas
- Computer Science, Theory & Methods