Publications & Reports - Document Abstract

Zdenek Kouba, Ondrej Tománek, Lukáš Kencl
Evaluation of Datacenter Network Topology Influence on Hadoop MapReduce Performance
2016 5th IEEE International Conference on Cloud Networking (CloudNet)
Pisa, Italy

Hadoop MapReduce has become the de facto standard for Big Data processing within Cloud datacenters. However, little is known about the influence of datacenter network topology on Hadoop performance, or about the suitability of various topologies for different workload distributions. By extending the publicly available simulator CloudSim, we simulate six well-known or recently proposed topologies (Hierarchical, FatTree, DCell, CamCube, BCube, MapReduce) and evaluate Hadoop MapReduce performance across varyingly distributed workloads. We conclude that while no topology is clearly optimal, the experimental CamCube topology exhibits the most promising results. However, which topology performs best depends on how the workload is divided: results differ greatly between topologies, and performance is generally weak under highly skewed workloads. This finding could lead to significant Hadoop MapReduce performance improvements by adjusting or selecting appropriate datacenter network topologies, potentially even at runtime, using Software Defined Networking.


R&D Centre (RDC) for Mobile Applications
Dept of Telecommunications Engineering
Faculty of Electrical Engineering
Czech Technical University in Prague
Technická 2, 166 27 Prague 6
Czech Republic

Tel.: (+420) 224 355 991
Fax.: (+420) 233 335 999