博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
系统设计 consistent hashing
阅读量:7080 次
发布时间:2019-06-28

本文共 4809 字,大约阅读时间需要 16 分钟。

System design interview- consistent hash

Tags: , , , , 

Consistent hashing is a simple yet powerful solution to a popular problem: how to locate a server in a distributed environment to store or get a value identified by a key, while handle server failures and network change?

A simple method is number the servers from 0 to N – 1. To store or retrieve a value V, we can use Hash(V) % N to get the id of the server. 

However, this method does not work well when 1) some servers failed in the network (e.g, N changes) or 2)  new machines are added to the server. The only solution is to rebuild the hash for all the servers, which is time and resource consuming  Given that failures are common with a network with thousands of machines, we need a better solution. 

Consistent hash

Consistent hash is the solution to the above described problem. In consistent hash,  the data and the servers are hashed to the same space. For instance, the same hash function is used to map a server and the data to a hashed key.  The hash space from 0 to N is treated as if it forms a circle- a hash ring.  Given a server with IPX, by applying the hash function, it is mapped to a position on the hash ring.  Then for a data value V, we again hash it using the same hash function to map it to some position on the hash ring. To locate the corresponding server we just move round the circle clockwise from this point until the closest server is located.  If there is no server found, a first server will be used. This is how to make the hash space as a ring. 

For example, in the following figure, A, B, C, D are servers hashed on the ring. K1 and K2 will be served by BK3 and K4will be served by C.

consistent hash figure

One problem of this method is that some servers may have disproportionately large hash space before them. This will cause these servers suffer greater load while the remainder servers might serve too few data.  A solution is to create a number of virtually replicated nodes at different positions on the hash ring for each server. For example, for server A with ip address IPA, we can map it to M positions on the hash ring using Hash(IPA + "0")Hash(IPA +"1"), …  Hash(IPA +"M - 1").

This can help the servers more evenly distributed on the hash ring.  Note that this is not related to server replication. Each virtual replicated position with Hash(IPA + "x") represents the same physical server with IP address IPA

In the following figure, server A has four virtual nodes on the ring.  We can see that the physical servers can handle data more proportionally. 

consistent hash virtual node

consistent hash virtual node

 

Implementation of Consistent hash

The following is the implementation of Consistent hash using Java. 

 

 

Here we use Java SortedMap to represent the hash ring. The keys for the sorted map are hashed values for servers or data points. The HashFunction is an interface defining the hash method to map data or servers to the hash ring. The numberOfReplications is used to control the number of virtual server nodes to be added to the ring. 

A few points to note:

  • The default hash function in java is not recommended as it is not mixed well. In practice, we need to provide our own hash function through the HashFunction interface. MD5 is recommended. 
  • For Server interface, I only define Name related methods here. In practice, you can add more methods to the Server interface. 

Reference:

转载于:https://www.cnblogs.com/jxr041100/p/8415915.html

你可能感兴趣的文章
ansible笔记(12):handlers的用法
查看>>
GPS文件处理
查看>>
Spring Boot 入门
查看>>
数据库excel导出
查看>>
MyBati__mapper 中取值(#{} 或${}) 以及 parameterType为(基本类型 或复杂类型)
查看>>
在Ubuntu上为Android系统内置Java应用程序测试Application Frameworks层的硬件服务
查看>>
docker的安装
查看>>
设计原则—依赖倒转原则
查看>>
让IE10等支持classList2.0(转)
查看>>
hausaufgabe--python 31 - Pickle
查看>>
流程控制--while
查看>>
max() min()
查看>>
python print 控制台输出中文
查看>>
[C++] Deep copy ,Shallow copy, copy constructor,"="
查看>>
render httprequest
查看>>
lsattr, chattr
查看>>
redis key设置过期时间
查看>>
0514JS基础:操作document对象、事件、this
查看>>
Gtest:源码解析
查看>>
【杂题总汇】HDU2018多校赛第九场 Rikka with Nash Equilibrium
查看>>