Design and Implementation of Distributed Cluster Supporting Dynamic Down-Scaling of the Cluster

류우석

doi:10.13067/JKIECS.2023.18.2.361

Apache Hadoop, a representative framework for distributed processing of big data, can improve parallel distributed processing performance by scaling a cluster up to thousands of nodes or more. Scaling a cluster down, however, is limited to permanently decommissioning faulty or degraded nodes, which makes it difficult to operate nodes flexibly in small clusters. In this paper, we discuss the problems that arise when nodes are removed from a Hadoop cluster and propose a dynamic down-scaling technique for managing the size of a distributed cluster elastically. We design and implement a modified Hadoop system and interfaces in which a node removed for temporary down-scaling is not decommissioned but temporarily released, so that it can be reconnected when needed. Experimental results verify that down-scaling is performed effectively without performance degradation.
