谈谈dubbo集群容错
从继承图可以看出有以下几种策略:
- Failover Cluster(失败转移) 当调用提供者的服务器发生错误时,再试下一个服务器。 用于读操作。重试有延时。设置重试次数
具体代码实现
public Result doInvoke(Invocation invocation, final List<Invoker<T>> invokers, LoadBalance loadbalance) {
List<Invoker<T>> invoked = new ArrayList<Invoker<T>>(copyinvokers.size());
Set<String> providers = new HashSet<String>(len);
for (int i = 0; i < len; i++) {//5
if (i > 0) {
checkWhetherDestroyed();
copyinvokers = list(invocation);
checkInvokers(copyinvokers, invocation);
}
Invoker<T> invoker = select(loadbalance, invocation, copyinvokers, invoked);//1
invoked.add(invoker);//2
RpcContext.getContext().setInvokers((List) invoked);
try {
Result result = invoker.invoke(invocation);//3
if (le != null && logger.isWarnEnabled()) {logger.warn
return result;
} catch (RpcException e) {//4
if (e.isBiz()) { throw e;
le = e;
} catch (Throwable e) {
le = new RpcException(e.getMessage(), e);
} finally
providers.add(invoker.getUrl().getAddress());
}
}//for-end
throw new RpcException(le.getCode()//6
}
复制代码
实现思想:利用for循环,成功直接return。这样就可以多次调用。
每一次调用先找一个服务器,加入到已调用结合。开始调用。拿到结果。如果出错,catch住,分析异常类型。当超出规定次数,直接抛异常。
Failfast Cluster (快速失败) 只发起一次调用,失败立即报错。一般对应于非幂等性的写操作。比如新增。
public Result doInvoke(Invocation invocation, List
> invokers, LoadBalance loadbalance) throws RpcException { checkInvokers(invokers, invocation);
Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
try {
return invoker.invoke(invocation);
} catch (Throwable e) {
if (e instanceof RpcException && ((RpcException) e).isBiz()) { // biz exception.
throw (RpcException) e;
}
throw new RpcException(e instanceof RpcException ? ((RpcException) e).getCode() : 0,
}
}
复制代码
实现思想:直接调用。处理异常。简单
Failback Cluster (失败自动恢复) 失败自动恢复,后台记录失败请求,定时重发。比如消息通知
protected Result doInvoke(Invocation invocation, List
> invokers, LoadBalance loadbalance) throws RpcException { try {
checkInvokers(invokers, invocation);
Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
return invoker.invoke(invocation);
} catch (Throwable e) {
logger.error(
addFailed(invocation, this);
return new RpcResult(); // ignore
}
}
private void addFailed(Invocation invocation, AbstractClusterInvoker<?> router) {
if (retryFuture == null) {
synchronized (this) {
if (retryFuture == null) {
retryFuture = scheduledExecutorService.scheduleWithFixedDelay(new Runnable() {
@Override
public void run() {
// collect retry statistics
try {
retryFailed();
} catch (Throwable t) { // Defensive fault tolerance
logger.error("Unexpected error occur at collect statistic", t);
}
}
}, RETRY_FAILED_PERIOD, RETRY_FAILED_PERIOD, TimeUnit.MILLISECONDS);
}
}
}
failed.put(invocation, router);
}
复制代码
实现思想:先直接调用。失败后,放到map直接保存。然后用定时线程池去重试map里的信息。
FailSafe Cluster (失败安全) 失败了,直接忽略。比如写入日志啥的
public Result doInvoke(Invocation invocation, List
> invokers, LoadBalance loadbalance) throws RpcException { try {
checkInvokers(invokers, invocation);
Invoker<T> invoker = select(loadbalance, invocation, invokers, null);
return invoker.invoke(invocation);
} catch (Throwable e) {
logger.error("Failsafe ignore exception: " + e.getMessage(), e);
return new RpcResult(); // ignore
}
}
复制代码
Forking Cluster (并行调用) 同时调用多个服务器,有一个成功,就返回。用于实时性较高的情况
public Result doInvoke(final Invocation invocation, List
> invokers, LoadBalance loadbalance) throws RpcException { try {
checkInvokers(invokers, invocation);
final List<Invoker<T>> selected;
final int forks = getUrl().getParameter(Constants.FORKS_KEY, Constants.DEFAULT_FORKS);
final int timeout = getUrl().getParameter(Constants.TIMEOUT_KEY, Constants.DEFAULT_TIMEOUT);
if (forks <= 0 || forks >= invokers.size()) {
selected = invokers;
} else {
selected = new ArrayList<>();
for (int i = 0; i < forks; i++) {
// TODO. Add some comment here, refer chinese version for more details.
Invoker<T> invoker = select(loadbalance, invocation, invokers, selected);
if (!selected.contains(invoker)) {
//Avoid add the same invoker several times.
selected.add(invoker);
}
}
}
RpcContext.getContext().setInvokers((List) selected);
final AtomicInteger count = new AtomicInteger();
final BlockingQueue<Object> ref = new LinkedBlockingQueue<>();
for (final Invoker<T> invoker : selected) {
executor.execute(new Runnable() {
@Override
public void run() {
try {
Result result = invoker.invoke(invocation);
ref.offer(result);
} catch (Throwable e) {
int value = count.incrementAndGet();
if (value >= selected.size()) {
ref.offer(e);
}
}
}
});
}
try {
Object ret = ref.poll(timeout, TimeUnit.MILLISECONDS);
if (ret instanceof Throwable) {
Throwable e = (Throwable) ret;
throw new RpcException(e instanceof RpcException ? ((RpcException) e).getCode() : 0, "Failed to forking invoke provider " + selected + ", but no luck to perform the invocation. Last error is: " + e.getMessage(), e.getCause() != null ? e.getCause() : e);
}
return (Result) ret;
} catch (InterruptedException e) {
throw new RpcException("Failed to forking invoke provider " + selected + ", but no luck to perform the invocation. Last error is: " + e.getMessage(), e);
}
} finally {
// clear attachments which is binding to current thread.
RpcContext.getContext().clearAttachments();
}
}
复制代码
实现思想:利用list保存多个服务器。用线程池同时提交调用。将结果放到LinkedBlockingQueue。然后从队列里取结果。
Broadcast Cluster (广播调用) 广播调用所有提供者,琢一调用。任意一台报错,则报错。用于通知服务提供者更新资源啥的。
public Result doInvoke(final Invocation invocation, List
> invokers, LoadBalance loadbalance) throws RpcException { checkInvokers(invokers, invocation);
RpcContext.getContext().setInvokers((List) invokers);
RpcException exception = null;
Result result = null;
for (Invoker<T> invoker : invokers) {
try {
result = invoker.invoke(invocation);
} catch (RpcException e) {
exception = e;
logger.warn(e.getMessage(), e);
} catch (Throwable e) {
exception = new RpcException(e.getMessage(), e);
logger.warn(e.getMessage(), e);
}
}
if (exception != null) {
throw exception;
}
return result;
}
复制代码
实现思想:for循环挨个调用。
转载于//juejin.im/post/5c04ea1051882508851b7c53
还没有评论,来说两句吧...