Efficient Data Placement Research Articles

In-network caching is considered to be a vital part of the Internet for future applications (e.g., Internet of Things). One proposal that has attracted interest in recent years, Named Data Networking (NDN), aims to facilitate in-network caching by locating content by name. However, the efficiency of in-network caching has been questioned by experts. Data correlation among caches builds strong dependencies between caches at the edge and in the core. That dependency makes analyzing network performance difficult.

This paper proposes CCndnS (Content Caching strategy for NDN with Skip), a caching policy to break the dependencies among caches, thus facilitating the design of an efficient data placement algorithm. Specifically, each cache, regardless of its location in the network, should receive an independent set of requests; otherwise, only misses from downstream caches make their way to the upstream caches, i.e., a filtering effect that induces a correlation among the caches.

CCndnS breaks a file into smaller segments and spreads them along the path between requester and publisher, so that the head of the file (the first segment) is cached at the edge router close to the requester and the tail is cached far from the requester, towards the content provider. Requests for a segment skip searching caches in its path, to search only the cache with the segment of interest. That reduces the number of futile checks on caches, and thus the delay from memory accesses. This mechanism also decouples the caches, so there is a simple analytical model for cache performance in the network.

We illustrate an application of the model to enforce a Service Level Agreement (SLA) between a content provider and the caching system proposed in this paper. The model can be used for cache provisioning for two purposes: (1) To specify the cache size to be reserved for specific contents to reach some desired performance. For instance, if the client of an SLA requires a 50% cache hit rate for its content at each router, the model can be used to determine the cache size that needs to be reserved to reach the 50% hit rate. (2) To calculate the effect of such reservations on other contents that use the routers covered by the SLA.

The design, analysis, and application are tested with extensive simulations.
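The segment placement and skip behaviour described in the abstract can be illustrated with a small sketch. The abstract does not state the exact placement rule, so the proportional mapping below is an assumption made only for illustration, and the names `segment_to_hop`, `probes_with_skip`, and `probes_without_skip` are hypothetical. Router position 0 stands for the edge router nearest the requester, and the highest position is the router nearest the publisher.

```python
def segment_to_hop(segment_index: int, num_segments: int, num_hops: int) -> int:
    """Map a file segment to a router position on the requester-to-publisher path.

    Position 0 is the edge router nearest the requester. The proportional
    split used here is an ASSUMPTION for illustration, not the rule from
    the CCndnS paper.
    """
    if num_hops < 1:
        raise ValueError("path must contain at least one router")
    if not 0 <= segment_index < num_segments:
        raise ValueError("segment_index out of range")
    # Head segments land near the edge, tail segments near the publisher.
    return segment_index * num_hops // num_segments


def probes_with_skip(segment_index: int, num_segments: int, num_hops: int) -> list[int]:
    """With skipping, a request checks only the cache expected to hold its segment."""
    return [segment_to_hop(segment_index, num_segments, num_hops)]


def probes_without_skip(num_hops: int) -> list[int]:
    """Baseline worst case: every cache on the path is checked in order."""
    return list(range(num_hops))


if __name__ == "__main__":
    # A file of 8 segments on a path of 4 routers.
    placement = [segment_to_hop(i, 8, 4) for i in range(8)]
    print("segment -> router:", placement)
    print("probes for segment 5 with skip:", probes_with_skip(5, 8, 4))
    print("probes for segment 5 without skip:", probes_without_skip(4))
```

Under this assumed mapping, each router sees requests only for its own range of segments rather than the misses filtered through downstream caches, which is the decoupling effect the abstract describes.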

We are happy to present this special issue of the scientific journal Scalable Computing: Practice and Experience. For this special issue on Infrastructures and Algorithms for Scalable Computing (Volume 19, No 3, June 2018), we selected four papers out of nine submitted, all of which went through peer review according to the journal policy. All papers represent novel results in the fields of distributed algorithms and infrastructures for scalable computing. The first paper presents a novel approach for efficient data placement, which improves the performance of workflow execution in distributed datacenters. The greedy heuristic algorithm, which is based on a network flow optimization framework, minimizes the total storage cost, including efforts to move and store the data from different source locations and dependencies. The second paper evaluates the significance of different clustering techniques, viz. k-means, Hierarchical Agglomerative Clustering, and Markov Clustering, in grouping-aware data placement for data-intensive applications with interest locality. The evaluation in Azure reported that the Markov Clustering-based data placement strategy improves local map execution and reduces execution time compared to Hadoop's Default Data Placement Strategy and the other evaluated clustering techniques. The effect is more pronounced for data-intensive applications that have interest locality. The third paper presents an experimental evaluation of OpenMP thread-mapping strategies in different hardware environments (Intel Xeon Phi coprocessor and hybrid CPU-MIC platforms). The paper identifies the choice of thread affinity, number of threads, and execution mode that gives the best performance for LU factorization. In the fourth paper, the authors study the amount of memory occupied by sparse matrices split up into same-size blocks. The paper considers and statistically evaluates four popular storage formats and combinations among them. The conclusion is that block-based storage formats may significantly reduce memory footprints of sparse matrices arising from a wide range of application domains. We use this opportunity to thank all contributors to this Special Issue: all authors who submitted the results of their latest research and all reviewers for their valuable comments and suggestions for improvement. We would like to express our special gratitude to the Editor-in-Chief, Professor Dana Petcu, for her constant support during the whole process of this Special Issue.

Articles published on Efficient Data Placement

A Learning-based Data Placement Framework for Low Latency in Data Center Networks

Multi-Task Learning for Electricity Price Forecasting and Resource Management in Cloud Based Industrial IoT Systems

Enhancing Scalability through Extended Hybrid Switching with Mixture Path Deployment (EHS-MPD): A Novel Framework

Unified Holistic Memory Management Supporting Multiple Big Data Processing Frameworks over Hybrid Memories

Efficient XML data placement schemes over multiple mobile wireless broadcast channels

A Novel Data Placement and Retrieval Service for Cooperative Edge Clouds

Counterintuitive Characteristics of Optimal Distributed LRU Caching Over Unreliable Channels

Fuzzy Theory-Based Data Placement for Scientific Workflows in Hybrid Cloud Environments

Artificial Intelligence for Wireless Caching: Schemes, Performance, and Challenges

Decoupling NDN caches via CCndnS: Design, analysis, and application

Electricity Price Forecasting for Cloud Computing Using an Enhanced Machine Learning Model

Graph-based model and algorithm for minimising big data movement in a cloud environment

Special Issue on Infrastructures and Algorithms for Scalable Computing

Exact and Heuristic Data Workflow Placement Algorithms for Big Data Computing in Cloud Datacenters

Cloudkit

Byzantine fault-tolerant and semantic-driven consensus protocol

Efficient location-aware data placement for data-intensive applications in geo-distributed scientific data centers

Analyzing Enterprise Storage Workloads With Graph Modeling and Clustering

Exploiting CMS data popularity to model the evolution of data management for Run-2 and beyond
