What are the key takeaways from reading the Nacos 2.X source code?

2026-05-22 16:00

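Preface

Nacos is a high-performance component from Alibaba, built for the microservice era, that combines a service registry and a configuration center. Why does Nacos perform so well? The main reasons are:

1: Data is synchronized between Nacos nodes with Distro, a protocol developed in-house at Alibaba.

2: Thread pools and asynchronous processing are used heavily to improve API response times.

3: 2.X uses gRPC long-lived connections, which resolves the high server CPU usage that 1.X suffered from because clients had to send heartbeat packets continuously.

2.X is also a major upgrade over 1.X, at both the architecture level and the code level. If you are able to upgrade to 2.X, upgrade the client and the server together; that is the only way to get the full benefit of the 2.X architecture. The comparison between 1.X and 2.X is as follows: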

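|              | 1.X                                            | 2.X                                                       |
| Connection   | HTTP short connections                         | gRPC, plus HTTP short connections (for 1.X compatibility) |
| Push         | UDP                                            | gRPC                                                      |
| Health check | Periodic heartbeat over HTTP short connections | gRPC long-lived connection (lightweight heartbeat)        |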

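For a performance comparison of Nacos 1.X and 2.X, see: Nacos 2.0 升级前后性能对比压测-阿里云开发者社区 (aliyun.com)

The Nacos architecture diagram here is borrowed from the Alibaba Cloud community:

Below, based on Nacos 2.0.4, we look at the code level for the reasons behind this performance. Before reading on, it helps to be familiar with design patterns (template method, delegation, proxy, singleton, factory, strategy), asynchronous programming, and gRPC.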

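Startup

First, the Nacos structure diagram: Nacos uses Namespaces to isolate environments. Within a namespace, related services can be placed in the same Group, and each group can contain multiple Services. For disaster recovery, a service can be split into several Clusters deployed in different regions or data centers. Under each cluster are the individual Instances, that is, the microservice processes we develop.

Because much of Nacos is handled asynchronously, the code often cannot be read as a straight-line flow; you will jump back and forth while reading, and event-driven asynchronous code is considerably harder to follow. We start with the Nacos startup process. From here on I only paste the key code and omit the rest.

Next, the events and listener logic involved in handling requests.

The class com.alibaba.nacos.core.remote.RequestHandlerRegistry listens for ContextRefreshedEvent, so the logic below runs automatically once Spring Boot has started.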
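As a rough illustration of the hierarchy just described, the nesting can be modeled as maps keyed by namespace, group, service and cluster. This is only a sketch: NamingModelSketch and InstanceEntry are hypothetical names invented here, not Nacos classes, and the real server indexes its data differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of Namespace > Group > Service > Cluster > Instance.
// None of these types exist in Nacos; they only illustrate the nesting.
final class NamingModelSketch {

    /** One running process, identified by ip:port inside a cluster. */
    static final class InstanceEntry {
        final String ip;
        final int port;
        final String cluster;

        InstanceEntry(String ip, int port, String cluster) {
            this.ip = ip;
            this.port = port;
            this.cluster = cluster;
        }
    }

    // namespace -> group -> service -> cluster -> instances
    private final Map<String, Map<String, Map<String, Map<String, List<InstanceEntry>>>>> tree =
            new LinkedHashMap<>();

    /** Files the instance under its namespace, group, service and cluster. */
    void register(String namespace, String group, String service, InstanceEntry instance) {
        tree.computeIfAbsent(namespace, k -> new LinkedHashMap<>())
                .computeIfAbsent(group, k -> new LinkedHashMap<>())
                .computeIfAbsent(service, k -> new LinkedHashMap<>())
                .computeIfAbsent(instance.cluster, k -> new ArrayList<>())
                .add(instance);
    }

    /** Returns the instances of one cluster, or an empty list if any level is missing. */
    List<InstanceEntry> instances(String namespace, String group, String service, String cluster) {
        return tree.getOrDefault(namespace, Collections.emptyMap())
                .getOrDefault(group, Collections.emptyMap())
                .getOrDefault(service, Collections.emptyMap())
                .getOrDefault(cluster, Collections.emptyList());
    }
}
```

The same service name can therefore exist independently in two namespaces, which is what makes the namespace an isolation boundary.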

/**
 * RequestHandlerRegistry.
 * Registered as an event listener: once Spring has finished initializing, it loads the
 * com.alibaba.nacos.core.remote.RequestHandler beans.
 *
 * @author liuzunfei
 * @version $Id: RequestHandlerRegistry.java, v 0.1 2020年07月13日 8:24 PM liuzunfei Exp $
 */
@Service
public class RequestHandlerRegistry implements ApplicationListener<ContextRefreshedEvent> {

    /**
     * Registered request handlers, keyed by the simple name of the request type.
     */
    Map<String, RequestHandler> registryHandlers = new HashMap<String, RequestHandler>();

    @Autowired
    private TpsMonitorManager tpsMonitorManager;

    /**
     * Get Request Handler By request Type.
     *
     * @param requestType see definitions of sub constants classes of RequestTypeConstants
     * @return request handler.
     */
    public RequestHandler getByRequestType(String requestType) {
        return registryHandlers.get(requestType);
    }

    /**
     * Loads every subclass of com.alibaba.nacos.core.remote.RequestHandler into registryHandlers
     * for later request dispatch. This can be seen as an application of the strategy pattern.
     *
     * @param event event
     */
    @Override
    public void onApplicationEvent(ContextRefreshedEvent event) {
        Map<String, RequestHandler> beansOfType = event.getApplicationContext().getBeansOfType(RequestHandler.class);
        Collection<RequestHandler> values = beansOfType.values();
        for (RequestHandler requestHandler : values) {

            Class<?> clazz = requestHandler.getClass();
            boolean skip = false;
            // Walk up until the direct superclass is RequestHandler; skip beans that never reach it.
            while (!clazz.getSuperclass().equals(RequestHandler.class)) {
                if (clazz.getSuperclass().equals(Object.class)) {
                    skip = true;
                    break;
                }
                clazz = clazz.getSuperclass();
            }
            if (skip) {
                continue;
            }

            try {
                Method method = clazz.getMethod("handle", Request.class, RequestMeta.class);
                // Handlers that need TPS monitoring are registered with tpsMonitorManager.
                if (method.isAnnotationPresent(TpsControl.class) && TpsControlConfig.isTpsControlEnabled()) {
                    TpsControl tpsControl = method.getAnnotation(TpsControl.class);
                    String pointName = tpsControl.pointName();
                    TpsMonitorPoint tpsMonitorPoint = new TpsMonitorPoint(pointName);
                    tpsMonitorManager.registerTpsControlPoint(tpsMonitorPoint);
                }
            } catch (Exception e) {
                //ignore.
            }
            // The first type argument of RequestHandler<T, S> is the request type.
            Class tClass = (Class) ((ParameterizedType) clazz.getGenericSuperclass()).getActualTypeArguments()[0];
            // Add the handler to the registry.
            registryHandlers.putIfAbsent(tClass.getSimpleName(), requestHandler);
        }
    }
}
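The core trick in the registry above, reading the request type from the handler's generic superclass and using its simple name as the lookup key, can be tried without Spring. The following is a minimal sketch under that assumption; HandlerRegistrySketch, Handler, PingRequest and the other names are hypothetical stand-ins, not Nacos types.

```java
import java.lang.reflect.ParameterizedType;
import java.util.HashMap;
import java.util.Map;

// Hypothetical, Spring-free sketch of "register handlers by their generic request type".
final class HandlerRegistrySketch {

    static class Request { }

    static class PingRequest extends Request { }

    static class EchoRequest extends Request { }

    abstract static class Handler<T extends Request> {
        abstract String handle(T request);
    }

    static class PingHandler extends Handler<PingRequest> {
        @Override
        String handle(PingRequest request) {
            return "pong";
        }
    }

    static class EchoHandler extends Handler<EchoRequest> {
        @Override
        String handle(EchoRequest request) {
            return "echo";
        }
    }

    private final Map<String, Handler<?>> handlers = new HashMap<>();

    /** Registers a handler whose direct superclass is Handler<T>, keyed by T's simple name. */
    void register(Handler<?> handler) {
        // "extends Handler<PingRequest>" is recorded in the class file, so it survives erasure.
        ParameterizedType superType = (ParameterizedType) handler.getClass().getGenericSuperclass();
        Class<?> requestType = (Class<?>) superType.getActualTypeArguments()[0];
        handlers.putIfAbsent(requestType.getSimpleName(), handler);
    }

    /** Strategy lookup: the request type name selects the handler, or null if none is registered. */
    Handler<?> getByRequestType(String requestType) {
        return handlers.get(requestType);
    }
}
```

Because the key is the simple class name, a client only has to send the request type as a string for the server to pick the right strategy.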

com.alibaba.nacos.core.remote.RequestHandler has many implementations; their names give a rough idea of what each one does.

| RequestHandler subclass | Purpose |
| com.alibaba.nacos.config.server.remote.ConfigChangeClusterSyncRequestHandler | Syncs configuration changes between nodes |
| com.alibaba.nacos.config.server.remote.ConfigChangeBatchListenRequestHandler | Handles configuration change listeners |
| com.alibaba.nacos.config.server.remote.ConfigPublishRequestHandler | Handles configuration publish requests |
| com.alibaba.nacos.config.server.remote.ConfigQueryRequestHandler | Handles configuration query requests |
| com.alibaba.nacos.config.server.remote.ConfigRemoveRequestHandler | Handles configuration removal requests |
| com.alibaba.nacos.naming.remote.rpc.handler.DistroDataRequestHandler | Distro consistency handler (data sync between nodes) |
| com.alibaba.nacos.core.remote.HealthCheckRequestHandler | Handles health checks |
| com.alibaba.nacos.naming.remote.rpc.handler.InstanceRequestHandler | Handles instance registration and removal |
| com.alibaba.nacos.core.remote.core.ServerLoaderInfoRequestHandler | Handles server load information requests |
| com.alibaba.nacos.naming.remote.rpc.handler.ServiceListRequestHandler | Handles service list requests |
| com.alibaba.nacos.naming.remote.rpc.handler.ServiceQueryRequestHandler | Handles service queries |

Service registration flow

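An Acceptor started in io.grpc.stub.StreamObserver#onNext listens for gRPC connections from clients. When a client connects, the connection is registered through connectionManager.register(connectionId, connection), and a connection event is then published through the client connection event listener registry: clientConnectionEventListenerRegistry.notifyClientConnected(connection). The listeners implement the actual connection setup logic, and the registration logic only runs after the connection has been established.

com.alibaba.nacos.core.remote.ClientConnectionEventListenerRegistry: registry of listeners for client connection events. The listeners known so far all extend com.alibaba.nacos.core.remote.ClientConnectionEventListener:
// The body is currently empty; it may be reserved for future extension
com.alibaba.nacos.config.server.remote.ConfigConnectionEventListener
// Manages client connections: connect, disconnect, verify that a connection is still valid, and so on; internally a thread pool periodically removes invalid connections
com.alibaba.nacos.naming.core.v2.client.manager.impl.ConnectionBasedClientManager
// Initializes and cleans up gRPC ack callbacks
com.alibaba.nacos.core.remote.core.RpcAckCallbackInitorOrCleaner

Next we use the service registration request handled by com.alibaba.nacos.naming.remote.rpc.handler.InstanceRequestHandler to walk through how this class processes a request.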
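The register-then-notify sequence described above is an ordinary observer arrangement. A minimal sketch follows, assuming a plain in-memory map of connections; ConnectionEventsSketch and ConnectionListener are hypothetical names, and the real Nacos classes carry far more state (and notify listeners differently).

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: register a connection, then tell every listener about it.
final class ConnectionEventsSketch {

    /** What a listener can react to; each implementation decides what "connected" means for it. */
    interface ConnectionListener {
        void clientConnected(String connectionId);

        void clientDisconnected(String connectionId);
    }

    private final Map<String, Object> connections = new ConcurrentHashMap<>();

    private final List<ConnectionListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(ConnectionListener listener) {
        listeners.add(listener);
    }

    /** Stores the connection and notifies listeners; returns false if the id is already known. */
    boolean register(String connectionId, Object connection) {
        if (connections.putIfAbsent(connectionId, connection) != null) {
            return false;
        }
        for (ConnectionListener each : listeners) {
            each.clientConnected(connectionId);
        }
        return true;
    }

    /** Removes the connection and notifies listeners only if it was actually registered. */
    void unregister(String connectionId) {
        if (connections.remove(connectionId) != null) {
            for (ConnectionListener each : listeners) {
                each.clientDisconnected(connectionId);
            }
        }
    }
}
```

The registry itself never needs to know what a listener does with the event, which is why new listeners can be added without touching the connection code.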

// Defines the request processing flow.
public abstract class RequestHandler<T extends Request, S extends Response> {

    @Autowired
    private RequestFilters requestFilters;

    public Response handleRequest(T request, RequestMeta meta) throws NacosException {
        for (AbstractRequestFilter filter : requestFilters.filters) {
            try {
                Response filterResult = filter.filter(request, meta, this.getClass());
                if (filterResult != null && !filterResult.isSuccess()) {
                    return filterResult;
                }
            } catch (Throwable throwable) {
                Loggers.REMOTE.error("filter error", throwable);
            }
        }
        return handle(request, meta);
    }

    public abstract S handle(T request, RequestMeta meta) throws NacosException;
}

@Component
public class InstanceRequestHandler extends RequestHandler<InstanceRequest, InstanceResponse> {

    private final EphemeralClientOperationServiceImpl clientOperationService;

    public InstanceRequestHandler(EphemeralClientOperationServiceImpl clientOperationService) {
        this.clientOperationService = clientOperationService;
    }

    @Override
    @Secured(action = ActionTypes.WRITE, parser = NamingResourceParser.class)
    public InstanceResponse handle(InstanceRequest request, RequestMeta meta) throws NacosException {
        Service service = Service
                .newService(request.getNamespace(), request.getGroupName(), request.getServiceName(), true);
        switch (request.getType()) {
            case NamingRemoteConstants.REGISTER_INSTANCE:
                return registerInstance(service, request, meta);
            case NamingRemoteConstants.DE_REGISTER_INSTANCE:
                return deregisterInstance(service, request, meta);
            default:
                throw new NacosException(NacosException.INVALID_PARAM,
                        String.format("Unsupported request type %s", request.getType()));
        }
    }

    /**
     * Delegates instance registration to
     * com.alibaba.nacos.naming.remote.rpc.handler.InstanceRequestHandler#clientOperationService.
     *
     * @param service service
     * @param request request
     * @param meta    meta
     * @return com.alibaba.nacos.api.naming.remote.response.InstanceResponse
     */
    private InstanceResponse registerInstance(Service service, InstanceRequest request, RequestMeta meta) {
        clientOperationService.registerInstance(service, request.getInstance(), meta.getConnectionId());
        return new InstanceResponse(NamingRemoteConstants.REGISTER_INSTANCE);
    }

    private InstanceResponse deregisterInstance(Service service, InstanceRequest request, RequestMeta meta) {
        clientOperationService.deregisterInstance(service, request.getInstance(), meta.getConnectionId());
        return new InstanceResponse(NamingRemoteConstants.DE_REGISTER_INSTANCE);
    }
}

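As you can see, InstanceRequestHandler extends RequestHandler. The parent class fixes the request processing flow in handleRequest and leaves the concrete handling to subclasses, which is a textbook use of the template method pattern. The subclass then branches on request.getType() into registering and deregistering an instance, and delegates both to com.alibaba.nacos.naming.core.v2.service.impl.EphemeralClientOperationServiceImpl.

Nacos instances are either ephemeral or persistent, and ephemeral is the default. The handler injects the EphemeralClientOperationServiceImpl bean directly rather than going through ClientOperationServiceProxy, because persistent instances are not handled here.

After all these steps we finally reach the code that actually registers the instance.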
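To make the template method shape concrete, here is a stripped-down sketch in which requests are plain strings. TemplateHandlerSketch, Filter and the other names are hypothetical, and the filter contract (return a rejection message, or null to let the request through) is an assumption chosen for brevity rather than the Nacos filter API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the template method pattern used by the request handlers.
final class TemplateHandlerSketch {

    /** Returns a rejection message, or null to let the request through. */
    interface Filter {
        String reject(String request);
    }

    abstract static class BaseHandler {

        private final List<Filter> filters = new ArrayList<>();

        void addFilter(Filter filter) {
            filters.add(filter);
        }

        /** The template: filters always run first, then the subclass-specific step. */
        final String handleRequest(String request) {
            for (Filter each : filters) {
                String rejection = each.reject(request);
                if (rejection != null) {
                    return rejection;
                }
            }
            return handle(request);
        }

        /** The only step a subclass provides. */
        protected abstract String handle(String request);
    }

    static final class InstanceHandler extends BaseHandler {

        @Override
        protected String handle(String request) {
            switch (request) {
                case "registerInstance":
                    return "registered";
                case "deregisterInstance":
                    return "deregistered";
                default:
                    throw new IllegalArgumentException("Unsupported request type " + request);
            }
        }
    }
}
```

Marking handleRequest as final is a choice made in this sketch to stress that subclasses cannot skip the filters; the Nacos method shown above is not declared final.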

/**
 * Registers an instance.
 *
 * @param service  service
 * @param instance instance
 * @param clientId connectionId
 */
@Override
public void registerInstance(Service service, Instance instance, String clientId) {
    // Get the singleton Service for (namespace, group, name), storing this one if none exists yet.
    Service singleton = ServiceManager.getInstance().getSingleton(service);
    if (!singleton.isEphemeral()) {
        throw new NacosRuntimeException(NacosException.INVALID_PARAM,
                String.format("Current service %s is persistent service, can't register ephemeral instance.",
                        singleton.getGroupedServiceName()));
    }
    // Look up the client.
    Client client = clientManager.getClient(clientId);
    if (!clientIsLegal(client, clientId)) {
        return;
    }
    // Build the publish info for the instance.
    InstancePublishInfo instanceInfo = getPublishInfo(instance);
    // Cache the Service and instanceInfo in the connected client, which publishes a client change event.
    client.addServiceInstance(singleton, instanceInfo);
    // Update the last-updated time.
    client.setLastUpdatedTime();
    // Publish the service registration event.
    NotifyCenter.publishEvent(new ClientOperationEvent.ClientRegisterServiceEvent(singleton, clientId));
    // Publish the metadata update event (metadataId => ip:port:clusterName).
    NotifyCenter
            .publishEvent(new MetadataEvent.InstanceMetadataEvent(singleton, instanceInfo.getMetadataId(), false));
}

The registration flow therefore does the following: it obtains the singleton Service (creating it if it does not exist), looks up the Client by clientId, builds an InstancePublishInfo from the instance, adds the Service and InstancePublishInfo to the Client, and finally publishes two events. And then? Is that it? Where is the data stored? Where did the registration go? At first I had exactly these questions and could not tell where the data ended up. Later, starting from the endpoint /nacos/v1/ns/catalog/services that the console calls, I found that its data is read from a class called ServiceStorage. Working backwards from that answer, I found that after the events are published, a chain of operations ends in the execute engines, where the data is stored.

Nacos data storage: ServiceStorage.java

/**
 * Service storage.
 *
 * @author xiweng.yy
 */
@Component
public class ServiceStorage {

    /**
     * Index of the services registered by each client connection.
     */
    private final ClientServiceIndexesManager serviceIndexesManager;

    private final ClientManager clientManager;

    private final SwitchDomain switchDomain;

    private final NamingMetadataManager metadataManager;

    /**
     * Service data: Service -> ServiceInfo.
     */
    private final ConcurrentMap<Service, ServiceInfo> serviceDataIndexes;

    /**
     * Cluster index: Service -> Set(clusterName).
     */
    private final ConcurrentMap<Service, Set<String>> serviceClusterIndex;

    public ServiceStorage(ClientServiceIndexesManager serviceIndexesManager, ClientManagerDelegate clientManager,
            SwitchDomain switchDomain, NamingMetadataManager metadataManager) {
        this.serviceIndexesManager = serviceIndexesManager;
        this.clientManager = clientManager;
        this.switchDomain = switchDomain;
        this.metadataManager = metadataManager;
        this.serviceDataIndexes = new ConcurrentHashMap<>();
        this.serviceClusterIndex = new ConcurrentHashMap<>();
    }

    /**
     * Returns the clusters of the given service.
     *
     * @param service service
     * @return java.util.Set
     */
    public Set<String> getClusters(Service service) {
        return serviceClusterIndex.getOrDefault(service, new HashSet<>());
    }

    /**
     * Returns the data of the given service, from the cache if present.
     *
     * @param service service
     * @return com.alibaba.nacos.api.naming.pojo.ServiceInfo
     */
    public ServiceInfo getData(Service service) {
        return serviceDataIndexes.containsKey(service) ? serviceDataIndexes.get(service) : getPushData(service);
    }

    /**
     * If com.alibaba.nacos.naming.core.v2.ServiceManager does not contain the service, returns an empty
     * ServiceInfo; otherwise refreshes the cluster and instance information of the service.
     *
     * @param service service
     * @return com.alibaba.nacos.api.naming.pojo.ServiceInfo
     */
    public ServiceInfo getPushData(Service service) {
        ServiceInfo result = emptyServiceInfo(service);
        // Return early if ServiceManager does not know the service; otherwise refresh it.
        if (!ServiceManager.getInstance().containSingleton(service)) {
            return result;
        }
        // Collect the instances of the service (this also refreshes its cluster index).
        result.setHosts(getAllInstancesFromIndex(service));
        // Cache the refreshed service data.
        serviceDataIndexes.put(service, result);
        return result;
    }

    public void removeData(Service service) {
        serviceDataIndexes.remove(service);
        serviceClusterIndex.remove(service);
    }

    private ServiceInfo emptyServiceInfo(Service service) {
        ServiceInfo result = new ServiceInfo();
        result.setName(service.getName());
        result.setGroupName(service.getGroup());
        result.setLastRefTime(System.currentTimeMillis());
        result.setCacheMillis(switchDomain.getDefaultPushCacheMillis());
        return result;
    }

    /**
     * Collects all instances of the given Service and refreshes its cluster index.
     *
     * @param service service
     * @return java.util.List
     */
    private List<Instance> getAllInstancesFromIndex(Service service) {
        Set<Instance> result = new HashSet<>();
        Set<String> clusters = new HashSet<>();
        for (String each : serviceIndexesManager.getAllClientsRegisteredService(service)) {
            Optional<InstancePublishInfo> instancePublishInfo = getInstanceInfo(each, service);
            if (instancePublishInfo.isPresent()) {
                // Build the instance and apply its metadata.
                Instance instance = parseInstance(service, instancePublishInfo.get());
                result.add(instance);
                clusters.add(instance.getClusterName());
            }
        }
        // cache clusters of this service
        serviceClusterIndex.put(service, clusters);
        return new LinkedList<>(result);
    }

    private Optional<InstancePublishInfo> getInstanceInfo(String clientId, Service service) {
        Client client = clientManager.getClient(clientId);
        if (null == client) {
            return Optional.empty();
        }
        return Optional.ofNullable(client.getInstancePublishInfo(service));
    }

    private Instance parseInstance(Service service, InstancePublishInfo instanceInfo) {
        Instance result = InstanceUtil.parseToApiInstance(service, instanceInfo);
        Optional<InstanceMetadata> metadata = metadataManager
                .getInstanceMetadata(service, instanceInfo.getMetadataId());
        metadata.ifPresent(instanceMetadata -> InstanceUtil.updateInstanceMetadata(result, instanceMetadata));
        return result;
    }
}

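The most important methods are getData and getPushData; getData falls back to getPushData on a cache miss. When getPushData rebuilds the ServiceInfo, it calls getAllInstancesFromIndex, which collects all instances of the Service and refreshes its cluster set. In this way everything under a Service ends up cached in ServiceStorage.

Events related to registration in Nacos

Nacos events are divided into regular events and slow events, whose fully qualified class names are com.alibaba.nacos.common.notify.Event and com.alibaba.nacos.common.notify.SlowEvent. Publishers and subscribers likewise come in multi-event and single-event variants, and the notification center is com.alibaba.nacos.common.notify.NotifyCenter. I will not go into detail here; this section only introduces the event types related to instance registration, when they are triggered, and who listens for them. See the table below.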
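The relationship between getData and getPushData is a read-through cache with an explicit refresh. A minimal sketch of just that relationship follows; ServiceCacheSketch is a hypothetical name, service keys and instances are plain strings, and the loader function stands in for the index lookup that Nacos performs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: getData reads the cache, getPushData always rebuilds and re-caches.
final class ServiceCacheSketch {

    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

    // Stands in for "collect all instances of this service from the client indexes".
    private final Function<String, List<String>> loader;

    ServiceCacheSketch(Function<String, List<String>> loader) {
        this.loader = loader;
    }

    /** Returns cached data when present, otherwise rebuilds it. */
    List<String> getData(String service) {
        List<String> cached = cache.get(service);
        return cached != null ? cached : getPushData(service);
    }

    /** Rebuilds the data from the source of truth and overwrites the cache entry. */
    List<String> getPushData(String service) {
        List<String> fresh = new ArrayList<>(loader.apply(service));
        cache.put(service, fresh);
        return fresh;
    }
}
```

Readers such as a console query can use getData cheaply, while the push path calls getPushData so that subscribers always receive freshly rebuilt data.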

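| Event | Purpose | Triggered when | Listeners |
| com.alibaba.nacos.naming.core.v2.event.client.ClientOperationEvent.ClientRegisterServiceEvent | A client registers an instance | 1: the client actively sends a request to register an instance; 2: the consistency protocol notifies an update to the client state | com.alibaba.nacos.naming.core.v2.index.ClientServiceIndexesManager#handleClientOperation |
| com.alibaba.nacos.naming.core.v2.event.service.ServiceEvent.ServiceChangedEvent | An instance changes | 1: a client registers an instance; 2: a client removes a registered instance; 3: a client updates instance metadata; 4: client heartbeat handling (published only when the instance is unhealthy); 5: unhealthy-instance detection | 1: com.alibaba.nacos.naming.core.v2.upgrade.doublewrite.delay.DoubleWriteEventListener#onEvent; 2: com.alibaba.nacos.naming.push.v2.NamingSubscriberServiceV2Impl#onEvent |

Nacos execute engines

Once the events above have fired, a chain of logic eventually reaches the execute engines, which run the tasks. Two engines are involved here, one for double write and one for delayed push: com.alibaba.nacos.naming.core.v2.upgrade.doublewrite.delay.DoubleWriteDelayTaskEngine and com.alibaba.nacos.naming.push.v2.task.PushDelayTaskExecuteEngine.

Looking at the class diagram of the execute engines, both extend com.alibaba.nacos.common.task.engine.NacosDelayTaskExecuteEngine. They share the same parent class and only define their own execution logic.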

Double-write execute engine

@Component
public class DoubleWriteDelayTaskEngine extends NacosDelayTaskExecuteEngine {

    public DoubleWriteDelayTaskEngine() {
        // Engine name and logger.
        super(DoubleWriteDelayTaskEngine.class.getSimpleName(), Loggers.SRV_LOG);
        // Register the task processor for v1.
        addProcessor("v1", new ServiceChangeV1Task.ServiceChangeV1TaskProcessor());
        // Register the task processor for v2.
        addProcessor("v2", new ServiceChangeV2Task.ServiceChangeV2TaskProcessor());
    }

    @Override
    public NacosTaskProcessor getProcessor(Object key) {
        // The part of the key before the first ':' selects the processor.
        String actualKey = key.toString().split(":")[0];
        return super.getProcessor(actualKey);
    }
}

The constructor shows that the double-write engine registers two task processors, v1 and v2. Their purpose is a smooth version upgrade: once the cluster has been upgraded and is stable, double write can be turned off, as the Nacos upgrade guide also notes (Nacos 2.0 升级文档).
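The getProcessor override in the double-write engine routes a task by the prefix of its key. The sketch below isolates that routing, assuming keys of the form `v1:<rest>` or `v2:<rest>` (which is what the split on ":" implies); PrefixProcessorSketch is a hypothetical name and processors are reduced to Consumer<String>.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Hypothetical sketch: pick a processor from the version prefix of a task key.
final class PrefixProcessorSketch {

    private final Map<String, Consumer<String>> processors = new ConcurrentHashMap<>();

    void addProcessor(String version, Consumer<String> processor) {
        processors.put(version, processor);
    }

    /** Returns the processor registered for the text before the first ':', or null if none. */
    Consumer<String> getProcessor(String taskKey) {
        String actualKey = taskKey.split(":")[0];
        return processors.get(actualKey);
    }
}
```

Keeping the version in the key means a single engine can serve both data models during the upgrade window, and dropping one of them later only requires removing its processor.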

public class PushDelayTaskExecuteEngine extends NacosDelayTaskExecuteEngine {

    /**
     * Client manager.
     */
    private final ClientManager clientManager;

    /**
     * Client-to-service index manager.
     */
    private final ClientServiceIndexesManager indexesManager;

    /**
     * Data storage.
     */
    private final ServiceStorage serviceStorage;

    /**
     * Metadata manager.
     */
    private final NamingMetadataManager metadataManager;

    /**
     * Push executor.
     */
    private final PushExecutor pushExecutor;

    private final SwitchDomain switchDomain;

    public PushDelayTaskExecuteEngine(ClientManager clientManager, ClientServiceIndexesManager indexesManager,
            ServiceStorage serviceStorage, NamingMetadataManager metadataManager, PushExecutor pushExecutor,
            SwitchDomain switchDomain) {
        super(PushDelayTaskExecuteEngine.class.getSimpleName(), Loggers.PUSH);
        this.clientManager = clientManager;
        this.indexesManager = indexesManager;
        this.serviceStorage = serviceStorage;
        this.metadataManager = metadataManager;
        this.pushExecutor = pushExecutor;
        this.switchDomain = switchDomain;
        // Install a custom default task processor.
        setDefaultTaskProcessor(new PushDelayTaskProcessor(this));
    }

    public ClientManager getClientManager() {
        return clientManager;
    }

    public ClientServiceIndexesManager getIndexesManager() {
        return indexesManager;
    }

    public ServiceStorage getServiceStorage() {
        return serviceStorage;
    }

    public NamingMetadataManager getMetadataManager() {
        return metadataManager;
    }

    public PushExecutor getPushExecutor() {
        return pushExecutor;
    }

    @Override
    protected void processTasks() {
        if (!switchDomain.isPushEnabled()) {
            return;
        }
        super.processTasks();
    }

    /**
     * Custom default processor.
     */
    private static class PushDelayTaskProcessor implements NacosTaskProcessor {

        private final PushDelayTaskExecuteEngine executeEngine;

        public PushDelayTaskProcessor(PushDelayTaskExecuteEngine executeEngine) {
            this.executeEngine = executeEngine;
        }

        @Override
        public boolean process(NacosTask task) {
            PushDelayTask pushDelayTask = (PushDelayTask) task;
            Service service = pushDelayTask.getService();
            // Dispatch the task.
            NamingExecuteTaskDispatcher.getInstance()
                    .dispatchAndExecuteTask(service, new PushExecuteTask(service, executeEngine, pushDelayTask));
            return true;
        }
    }
}

Nacos task dispatcher

public class NamingExecuteTaskDispatcher {

    private static final NamingExecuteTaskDispatcher INSTANCE = new NamingExecuteTaskDispatcher();

    private final NacosExecuteTaskExecuteEngine executeEngine;

    private NamingExecuteTaskDispatcher() {
        // The Nacos task execute engine.
        executeEngine = new NacosExecuteTaskExecuteEngine(EnvUtil.FUNCTION_MODE_NAMING, Loggers.SRV_LOG);
    }

    public static NamingExecuteTaskDispatcher getInstance() {
        return INSTANCE;
    }

    /**
     * Adds a task to the execute engine.
     *
     * @param dispatchTag decides which worker the task is assigned to
     * @param task        the task
     */
    public void dispatchAndExecuteTask(Object dispatchTag, AbstractExecuteTask task) {
        executeEngine.addTask(dispatchTag, task);
    }

    public String workersStatus() {
        return executeEngine.workersStatus();
    }
}

Here the task is added to a Nacos task queue and handed to the Nacos task execute engine for execution.

NacosExecuteTaskExecuteEngine.java

public class NacosExecuteTaskExecuteEngine extends AbstractNacosTaskExecuteEngine<AbstractExecuteTask> {

    private final TaskExecuteWorker[] executeWorkers;

    public NacosExecuteTaskExecuteEngine(String name, Logger logger) {
        // Thread count derived from the CPU core count: the smallest power of two
        // that is greater than or equal to cores * threadMultiple.
        this(name, logger, ThreadUtils.getSuitableThreadCount(1));
    }

    public NacosExecuteTaskExecuteEngine(String name, Logger logger, int dispatchWorkerCount) {
        super(logger);
        executeWorkers = new TaskExecuteWorker[dispatchWorkerCount];
        for (int mod = 0; mod < dispatchWorkerCount; ++mod) {
            executeWorkers[mod] = new TaskExecuteWorker(name, mod, dispatchWorkerCount, getEngineLog());
        }
    }

    @Override
    public int size() {
        int result = 0;
        for (TaskExecuteWorker each : executeWorkers) {
            result += each.pendingTaskCount();
        }
        return result;
    }

    @Override
    public boolean isEmpty() {
        return 0 == size();
    }

    @Override
    public void addTask(Object tag, AbstractExecuteTask task) {
        NacosTaskProcessor processor = getProcessor(tag);
        // If a custom processor is registered for this tag, let it run the task.
        if (null != processor) {
            processor.process(task);
            return;
        }
        // Pick the worker for this task.
        TaskExecuteWorker worker = getWorker(tag);
        // Hand the task to the worker.
        worker.process(task);
    }

    /**
     * Decides which worker runs the task, based on the tag.
     *
     * @param tag tag
     * @return worker
     */
    private TaskExecuteWorker getWorker(Object tag) {
        // Masking the sign bit guarantees idx is in [0, workersCount).
        int idx = (tag.hashCode() & Integer.MAX_VALUE) % workersCount();
        return executeWorkers[idx];
    }

    private int workersCount() {
        return executeWorkers.length;
    }

    @Override
    public AbstractExecuteTask removeTask(Object key) {
        throw new UnsupportedOperationException("ExecuteTaskEngine do not support remove task");
    }

    @Override
    public Collection<Object> getAllTaskKeys() {
        throw new UnsupportedOperationException("ExecuteTaskEngine do not support get all task keys");
    }

    @Override
    public void shutdown() throws NacosException {
        for (TaskExecuteWorker each : executeWorkers) {
            each.shutdown();
        }
    }

    /**
     * Get workers status.
     *
     * @return workers status string
     */
    public String workersStatus() {
        StringBuilder sb = new StringBuilder();
        for (TaskExecuteWorker worker : executeWorkers) {
            sb.append(worker.status()).append('\n');
        }
        return sb.toString();
    }
}

The main job of this class is to size the worker pool from the CPU core count and to pick the worker that runs each task. addTask selects the worker and then calls worker.process(task). As you would expect, each worker wraps a thread that loops, taking tasks from a blocking queue and running them.
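The three ideas in this engine (a power-of-two worker count, hash-based worker selection, and a single thread draining a blocking queue) fit in a short standalone sketch. HashDispatchSketch and its members are hypothetical names; unlike the Nacos worker, which blocks on queue.put, this sketch uses the non-blocking offer so that submit does not throw a checked exception.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of "size the pool, hash the tag, one queue-draining thread per worker".
final class HashDispatchSketch {

    /** Smallest power of two that is greater than or equal to cores * multiple. */
    static int suitableThreadCount(int cores, int multiple) {
        int target = cores * multiple;
        int count = 1;
        while (count < target) {
            count <<= 1;
        }
        return count;
    }

    /** Maps any tag to an index in [0, workers); the mask clears the sign bit of the hash. */
    static int workerIndex(Object tag, int workers) {
        return (tag.hashCode() & Integer.MAX_VALUE) % workers;
    }

    /** One thread that takes tasks from a bounded queue and runs them in arrival order. */
    static final class Worker implements AutoCloseable {

        private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1 << 15);

        private volatile boolean closed;

        private final Thread thread;

        Worker(String name) {
            thread = new Thread(this::loop, name);
            thread.setDaemon(true);
            thread.start();
        }

        /** Returns false when the queue is full instead of blocking the caller. */
        boolean submit(Runnable task) {
            return queue.offer(task);
        }

        private void loop() {
            while (!closed) {
                try {
                    queue.take().run();
                } catch (InterruptedException e) {
                    return; // close() interrupts a blocked take()
                } catch (RuntimeException e) {
                    // A failing task must not kill the worker thread; a real worker would log here.
                }
            }
        }

        @Override
        public void close() {
            closed = true;
            thread.interrupt();
        }
    }
}
```

Since the same tag always hashes to the same worker, tasks for one service run on one thread in submission order, which avoids locking between tasks of the same service.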

TaskExecuteWorker.java

public final class TaskExecuteWorker implements NacosTaskProcessor, Closeable {

    /**
     * Max task queue size 32768.
     */
    private static final int QUEUE_CAPACITY = 1 << 15;

    private final Logger log;

    private final String name;

    /**
     * Blocking queue of pending tasks.
     */
    private final BlockingQueue<Runnable> queue;

    private final AtomicBoolean closed;

    public TaskExecuteWorker(final String name, final int mod, final int total) {
        this(name, mod, total, null);
    }

    public TaskExecuteWorker(final String name, final int mod, final int total, final Logger logger) {
        this.name = name + "_" + mod + "%" + total;
        this.queue = new ArrayBlockingQueue<Runnable>(QUEUE_CAPACITY);
        this.closed = new AtomicBoolean(false);
        this.log = null == logger ? LoggerFactory.getLogger(TaskExecuteWorker.class) : logger;
        // Start the worker thread.
        new InnerWorker(name).start();
    }

    public String getName() {
        return name;
    }

    /**
     * Puts the task into the queue.
     *
     * @param task task
     * @return always true; tasks that are not AbstractExecuteTask are ignored
     */
    @Override
    public boolean process(NacosTask task) {
        if (task instanceof AbstractExecuteTask) {
            putTask((Runnable) task);
        }
        return true;
    }

    private void putTask(Runnable task) {
        try {
            queue.put(task);
        } catch (InterruptedException ire) {
            log.error(ire.toString(), ire);
        }
    }

    public int pendingTaskCount() {
        return queue.size();
    }

    /**
     * Worker status.
     */
    public String status() {
        return name + ", pending tasks: " + pendingTaskCount();
    }

    @Override
    public void shutdown() throws NacosException {
        queue.clear();
        closed.compareAndSet(false, true);
    }

    /**
     * Task runner: loops, taking tasks from the blocking queue and executing them.
     */
    private class InnerWorker extends Thread {

        InnerWorker(String name) {
            setDaemon(false);
            setName(name);
        }

        @Override
        public void run() {
            while (!closed.get()) {
                try {
                    Runnable task = queue.take();
                    long begin = System.currentTimeMillis();
                    task.run();
                    long duration = System.currentTimeMillis() - begin;
                    if (duration > 1000L) {
                        log.warn("task {} takes {}ms", task, duration);
                    }
                } catch (Throwable e) {
                    log.error("[TASK-FAILED] " + e.toString(), e);
                }
            }
        }
    }
}

Summary

This article described the differences between Nacos 1.X and 2.X and the advantages of 2.X, and used the source code to explain where those advantages come from. The design is worth studying and is useful for our own code: it uses many design patterns (template method, observer, delegation, proxy, singleton, factory, strategy), decouples components through events (the observer pattern), and uses asynchronous programming to keep the program as responsive as possible. I believe that reading the source carefully will improve both our design skills and our multithreaded programming.

本文共计4752个文字,预计阅读时间需要20分钟。

Nacos2.X源码阅读有哪些关键点总结?

前言+Nacos是阿里巴巴出品的一个高性能微服务时代产物,其核心在于集中注册中心和配置中心。那么Nacos为什么这么高性能呢?总结以下几点;

1:基于阿里自研的distro协议进行数据传输,提高了数据同步的效率;

2:采用高效的数据结构,如跳表,优化了查找和存储性能;

3:采用异步处理机制,减少了同步调用带来的性能损耗;

4:支持集群部署,提高了系统的可用性和伸缩性。

前言

Nacos是一个Alibaba出品的高性能微服务时代产出的组件,集注册和配置中心为一体。那么Nacos为什么这么高性能呢?总结以下几点;

1:基于阿里自研的distro协议进行Nacos把不同节点的数据同步

2:大量使用线程池和异步的方式提高API的响应速度

3:2.X使用grpc长连接的方式取代了1.X需要一直发送心跳包导出服务器CPU占用较高的问题

同时2.X也对1.X做了重大的升级,无论是从架构层面还是代码层面都做了重大的升级,有条件升级为2.X的同学建议客户端可服务端一起升级,这样才能更大限度的发挥出2.X架构的优势。1.X和2.X的对比 如下:

1.X 2.X 连接方式 Http短连接 GRpc、Http短连接(兼容1.X) 推送方式 UDP GRpc 健康检测方式 Http短连接定时心跳包 Grpc长连接(轻量级心跳包)

关于Nacos1.X和2.X的性能对比请参考:Nacos 2.0 升级前后性能对比压测-阿里云开发者社区 (aliyun.com)

这里借用一下阿里云社区的Nacos架构图:

下面我们就基于Nacos2.0.4的代码层面分析一下为什么Nacos源码,看之前最好有以下基础,设计模式(模板,委托,代理,单例,工厂,策略)、异步编程方式,grpc。

启动

首先我们先看一下Nacos的结构图:Nacos通过Namespace(命名空间)进行环境的隔离,然后我们可以把根据服务之间的关联性来把不同的服务划分到不同的组(Group)之间,每一个组之间可以有多个服务(Service),同时为了容灾,我们可以把一个服务划分为不同的集群(Cluster)部署在不同的地区或机房,每一个具体的集群下就是我们一个个实例(Instance)了,也就是我们开发的微服务项目。

由于Nacos 中很多都是用异步方式来处理的,所以我们很多的时候不能直接采用流程的方式来阅读代码,阅读的时候会来回的跳转,异步事件的方式编程相对来说复杂了很多,首先我们先看一下Nacos的启动过程,后续我贴代码的时候只贴关键代码,其他就略去了,后续不在重复。

下面看一下处理请求的事件和监听的逻辑

该类com.alibaba.nacos.core.remote.RequestHandlerRegistry监听了ContextRefreshedEvent事件,那么SpringBoot启动之后就是自动执行我们需要处理的逻辑。

/** * RequestHandlerRegistry. * 当Spring初始化完成之后,加载com.alibaba.nacos.core.remote.RequestHandler,注册为事件监听器 * * @author liuzunfei * @version $Id: RequestHandlerRegistry.java, v 0.1 2020年07月13日 8:24 PM liuzunfei Exp $ */ @Service public class RequestHandlerRegistry implements ApplicationListener<ContextRefreshedEvent> { /** * 请求处理器集合 */ Map<String, RequestHandler> registryHandlers = new HashMap<String, RequestHandler>(); @Autowired private TpsMonitorManager tpsMonitorManager; /** * Get Request Handler By request Type. * * @param requestType see definitions of sub constants classes of RequestTypeConstants * @return request handler. */ public RequestHandler getByRequestType(String requestType) { return registryHandlers.get(requestType); } /** * 此监听器的主要作用就是加载com.alibaba.nacos.core.remote.RequestHandler的子类到registryHandlers, * 后续做请求处理使用,可以看做是策略模式的一个体现 * * @param event event */ @Override public void onApplicationEvent(ContextRefreshedEvent event) { Map<String, RequestHandler> beansOfType = event.getApplicationContext().getBeansOfType(RequestHandler.class); Collection<RequestHandler> values = beansOfType.values(); for (RequestHandler requestHandler : values) { Class<?> clazz = requestHandler.getClass(); boolean skip = false; while (!clazz.getSuperclass().equals(RequestHandler.class)) { if (clazz.getSuperclass().equals(Object.class)) { skip = true; break; } clazz = clazz.getSuperclass(); } if (skip) { continue; } try { Method method = clazz.getMethod("handle", Request.class, RequestMeta.class); //需要TPS监控的类加入到tpsMonitorManager集合中 if (method.isAnnotationPresent(TpsControl.class) && TpsControlConfig.isTpsControlEnabled()) { TpsControl tpsControl = method.getAnnotation(TpsControl.class); String pointName = tpsControl.pointName(); TpsMonitorPoint tpsMonitorPoint = new TpsMonitorPoint(pointName); tpsMonitorManager.registerTpsControlPoint(tpsMonitorPoint); } } catch (Exception e) { //ignore. 
} Class tClass = (Class) ((ParameterizedType) clazz.getGenericSuperclass()).getActualTypeArguments()[0]; //添加处理器到集合中 registryHandlers.putIfAbsent(tClass.getSimpleName(), requestHandler); } } }

我们可以看到com.alibaba.nacos.core.remote.RequestHandler的实现类有好多,我们大致可以从名称就可以看出每一类的作用

com.alibaba.nacos.core.remote.RequestHandler子类名称 作用 com.alibaba.nacos.config.server.remote.ConfigChangeBatchListenRequestHandler 节点之间配置互相同步的处理器 com.alibaba.nacos.config.server.remote.ConfigChangeBatchListenRequestHandler 配置改变监听处理器 com.alibaba.nacos.config.server.remote.ConfigPublishRequestHandler 配置发布监听处理器 com.alibaba.nacos.config.server.remote.ConfigQueryRequestHandler 配置查询请求处理器 com.alibaba.nacos.config.server.remote.ConfigRemoveRequestHandler 配置移除请求处理器 com.alibaba.nacos.naming.remote.rpc.handler.DistroDataRequestHandler distro一致性服务处理器(节点点同步数据) com.alibaba.nacos.core.remote.HealthCheckRequestHandler 健康检查处理器 com.alibaba.nacos.naming.remote.rpc.handler.InstanceRequestHandler 实例注册,移除处理器 com.alibaba.nacos.core.remote.core.ServerLoaderInfoRequestHandler 服务信息加载处理器 com.alibaba.nacos.naming.remote.rpc.handler.ServiceListRequestHandler 服务列表请求处理器 com.alibaba.nacos.naming.remote.rpc.handler.ServiceQueryRequestHandler 服务查询处理器 Service的注册流程

io.grpc.stub.StreamObserver#onNext中启动了一个Acceptor,用来监听客户端的GRpc连接,当有客户端连接的时候,就会通过connectionManager.register(connectionId, connection)注册实例,然后会通过客户端注册连接器发布连接事件clientConnectionEventListenerRegistry.notifyClientConnected(connection);然后就会由监听事件实现具体的建立连接的逻辑,建立完成之后才会进行注册逻辑的执行。

com.alibaba.nacos.core.remote.ClientConnectionEventListenerRegistry:客户端连接Naocs事件注册器 目前已知的注册器都继承了com.alibaba.nacos.core.remote.ClientConnectionEventListener //代码目前为空,可能是为以后扩展使用 com.alibaba.nacos.config.server.remote.ConfigConnectionEventListener //用来管理客户端的连接,可以进行连接,断开连接,验证连接是否有效等操作,其内部有一个线程池定时清除无效的连接 com.alibaba.nacos.naming.core.v2.client.manager.impl.ConnectionBasedClientManager //grpc回调初始化以及清理监听器 com.alibaba.nacos.core.remote.core.RpcAckCallbackInitorOrCleaner

下面我们就使用com.alibaba.nacos.naming.remote.rpc.handler.InstanceRequestHandler服务注册请求分析一下该类的处理流程。

//定义请求的处理流程 public abstract class RequestHandler<T extends Request, S extends Response> { @Autowired private RequestFilters requestFilters; public Response handleRequest(T request, RequestMeta meta) throws NacosException { for (AbstractRequestFilter filter : requestFilters.filters) { try { Response filterResult = filter.filter(request, meta, this.getClass()); if (filterResult != null && !filterResult.isSuccess()) { return filterResult; } } catch (Throwable throwable) { Loggers.REMOTE.error("filter error", throwable); } } return handle(request, meta); } public abstract S handle(T request, RequestMeta meta) throws NacosException; }

@Component public class InstanceRequestHandler extends RequestHandler<InstanceRequest, InstanceResponse> { private final EphemeralClientOperationServiceImpl clientOperationService; public InstanceRequestHandler(EphemeralClientOperationServiceImpl clientOperationService) { this.clientOperationService = clientOperationService; } @Override @Secured(action = ActionTypes.WRITE, parser = NamingResourceParser.class) public InstanceResponse handle(InstanceRequest request, RequestMeta meta) throws NacosException { Service service = Service .newService(request.getNamespace(), request.getGroupName(), request.getServiceName(), true); switch (request.getType()) { case NamingRemoteConstants.REGISTER_INSTANCE: return registerInstance(service, request, meta); case NamingRemoteConstants.DE_REGISTER_INSTANCE: return deregisterInstance(service, request, meta); default: throw new NacosException(NacosException.INVALID_PARAM, String.format("Unsupported request type %s", request.getType())); } } /** * 委托给com.alibaba.nacos.naming.remote.rpc.handler.InstanceRequestHandler#clientOperationService来进行实例注册 * * @param service service * @param request request * @param meta meta * @return com.alibaba.nacos.api.naming.remote.response.InstanceResponse */ private InstanceResponse registerInstance(Service service, InstanceRequest request, RequestMeta meta) { clientOperationService.registerInstance(service, request.getInstance(), meta.getConnectionId()); return new InstanceResponse(NamingRemoteConstants.REGISTER_INSTANCE); } private InstanceResponse deregisterInstance(Service service, InstanceRequest request, RequestMeta meta) { clientOperationService.deregisterInstance(service, request.getInstance(), meta.getConnectionId()); return new InstanceResponse(NamingRemoteConstants.DE_REGISTER_INSTANCE); } }

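As you can see, InstanceRequestHandler extends RequestHandler. The parent class fixes the request-processing flow in handleRequest and leaves the concrete handling to its subclasses, which is a textbook use of the template method pattern. The subclass then branches on request.getType() into registering and deregistering an instance, and delegates both operations to com.alibaba.nacos.naming.core.v2.service.impl.EphemeralClientOperationServiceImpl.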
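To make the template method idea concrete, here is a minimal sketch outside of Nacos. All names (SketchFilter, SketchHandler, SketchInstanceHandler) are hypothetical and the request is reduced to a plain string; it only mirrors the shape of handleRequest and handle described above, not the real classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical filter: returns a rejection reason, or empty to let the request pass.
interface SketchFilter<T> {
    Optional<String> filter(T request);
}

// Template: the base class fixes the flow (run filters first, then handle).
abstract class SketchHandler<T> {
    private final List<SketchFilter<T>> filters = new ArrayList<>();

    void addFilter(SketchFilter<T> filter) {
        filters.add(filter);
    }

    // final, so subclasses cannot change the order of the steps.
    final String handleRequest(T request) {
        for (SketchFilter<T> filter : filters) {
            Optional<String> rejected = filter.filter(request);
            if (rejected.isPresent()) {
                return "REJECTED: " + rejected.get();
            }
        }
        return handle(request);
    }

    // The only step a subclass supplies.
    protected abstract String handle(T request);
}

// Concrete handler: branches on the request type, like the handler above.
class SketchInstanceHandler extends SketchHandler<String> {
    @Override
    protected String handle(String requestType) {
        switch (requestType) {
            case "registerInstance":
                return "registered";
            case "deregisterInstance":
                return "deregistered";
            default:
                throw new IllegalArgumentException("Unsupported request type " + requestType);
        }
    }
}
```

Calling new SketchInstanceHandler().handleRequest("registerInstance") goes through the filter loop first and only then reaches the subclass, which is the same division of labour as in RequestHandler.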

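Nacos instances come in two kinds, ephemeral and persistent, and ephemeral is the default. The handler injects the EphemeralClientOperationServiceImpl bean directly instead of going through ClientOperationServiceProxy, because persistent instances are not handled on this path.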

After all these steps, we finally reach the code that actually registers the instance.

/**
 * Register an instance.
 *
 * @param service  service
 * @param instance instance
 * @param clientId connectionId
 */
@Override
public void registerInstance(Service service, Instance instance, String clientId) {
    // Get the singleton Service (namespace, group, name), registering it if it does not exist yet
    Service singleton = ServiceManager.getInstance().getSingleton(service);
    if (!singleton.isEphemeral()) {
        throw new NacosRuntimeException(NacosException.INVALID_PARAM,
                String.format("Current service %s is persistent service, can't register ephemeral instance.",
                        singleton.getGroupedServiceName()));
    }
    // Get the client
    Client client = clientManager.getClient(clientId);
    if (!clientIsLegal(client, clientId)) {
        return;
    }
    // Build the instance publish info
    InstancePublishInfo instanceInfo = getPublishInfo(instance);
    // Cache the Service and instanceInfo in the connected client, then publish a client-changed event
    client.addServiceInstance(singleton, instanceInfo);
    // Refresh the last-updated time
    client.setLastUpdatedTime();
    // Publish the register-service event
    NotifyCenter.publishEvent(new ClientOperationEvent.ClientRegisterServiceEvent(singleton, clientId));
    // Publish the metadata-update event (metadataId => ip:port:clusterName)
    NotifyCenter
            .publishEvent(new MetadataEvent.InstanceMetadataEvent(singleton, instanceInfo.getMetadataId(), false));
}

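The registration flow does the following: it fetches the singleton Service (creating it if it does not exist yet), looks up the Client by its clientId, builds an InstancePublishInfo, adds the Service and the InstancePublishInfo to that Client, and finally publishes two events. Then what? Is that all? Where is the data stored? Where did the registration go? At first I had exactly these questions. Later, by following the console endpoint /nacos/v1/ns/catalog/services, I found that its data is read from a class called ServiceStorage. Working backwards from that answer, I found that after the events are published, a chain of operations ends in the execute engine, which is where the data gets stored.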
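The reason the registration method looks so short is that everything else hangs off the published events. The following sketch shows that decoupling in miniature. SketchNotifyCenter and SketchRegisterEvent are hypothetical names, and the single-thread executor is an assumption made for brevity; it is not how the real NotifyCenter is built.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical mini notify center: publishers and subscribers only share the event type.
class SketchNotifyCenter {
    private final Map<Class<?>, List<Consumer<Object>>> subscribers = new ConcurrentHashMap<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    <E> void subscribe(Class<E> eventType, Consumer<E> subscriber) {
        subscribers.computeIfAbsent(eventType, k -> new CopyOnWriteArrayList<>())
                .add(event -> subscriber.accept(eventType.cast(event)));
    }

    // Returns immediately; subscribers run later on the executor thread.
    void publish(Object event) {
        List<Consumer<Object>> targets = subscribers.get(event.getClass());
        if (targets == null) {
            return;
        }
        for (Consumer<Object> target : targets) {
            executor.execute(() -> target.accept(event));
        }
    }

    // Stop accepting events and wait until the queued ones have been handled.
    boolean shutdownAndWait(long timeoutMillis) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}

// Hypothetical event carrying only what a subscriber needs.
class SketchRegisterEvent {
    final String serviceName;
    final String clientId;

    SketchRegisterEvent(String serviceName, String clientId) {
        this.serviceName = serviceName;
        this.clientId = clientId;
    }
}
```

A subscriber that maintains an index would call subscribe(SketchRegisterEvent.class, e -> index.put(e.serviceName, e.clientId)), while the registering code only calls publish(...) and never learns who is listening.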


Nacos data storage: ServiceStorage.java

/**
 * Service storage.
 *
 * @author xiweng.yy
 */
@Component
public class ServiceStorage {

    /**
     * Index of the services registered by client connections.
     */
    private final ClientServiceIndexesManager serviceIndexesManager;

    private final ClientManager clientManager;

    private final SwitchDomain switchDomain;

    private final NamingMetadataManager metadataManager;

    /**
     * Service information.
     */
    private final ConcurrentMap<Service, ServiceInfo> serviceDataIndexes;

    /**
     * Cluster index, key:value => Service:Set(ClusterName).
     */
    private final ConcurrentMap<Service, Set<String>> serviceClusterIndex;

    public ServiceStorage(ClientServiceIndexesManager serviceIndexesManager, ClientManagerDelegate clientManager,
            SwitchDomain switchDomain, NamingMetadataManager metadataManager) {
        this.serviceIndexesManager = serviceIndexesManager;
        this.clientManager = clientManager;
        this.switchDomain = switchDomain;
        this.metadataManager = metadataManager;
        this.serviceDataIndexes = new ConcurrentHashMap<>();
        this.serviceClusterIndex = new ConcurrentHashMap<>();
    }

    /**
     * Get the clusters of the given service.
     *
     * @param service service
     * @return java.util.Set
     */
    public Set<String> getClusters(Service service) {
        return serviceClusterIndex.getOrDefault(service, new HashSet<>());
    }

    /**
     * Get the data of the given service.
     *
     * @param service service
     * @return com.alibaba.nacos.api.naming.pojo.ServiceInfo
     */
    public ServiceInfo getData(Service service) {
        return serviceDataIndexes.containsKey(service) ? serviceDataIndexes.get(service) : getPushData(service);
    }

    /**
     * If com.alibaba.nacos.naming.core.v2.ServiceManager does not contain the service, return an empty result;
     * otherwise refresh the Cluster and Instance information of the Service.
     *
     * @param service service
     * @return com.alibaba.nacos.api.naming.pojo.ServiceInfo
     */
    public ServiceInfo getPushData(Service service) {
        ServiceInfo result = emptyServiceInfo(service);
        // Return directly if ServiceManager does not contain the service; otherwise refresh it
        if (!ServiceManager.getInstance().containSingleton(service)) {
            return result;
        }
        // Refresh the instances of the Service (this also refreshes its cluster index)
        result.setHosts(getAllInstancesFromIndex(service));
        // Cache the refreshed service info
        serviceDataIndexes.put(service, result);
        return result;
    }

    public void removeData(Service service) {
        serviceDataIndexes.remove(service);
        serviceClusterIndex.remove(service);
    }

    private ServiceInfo emptyServiceInfo(Service service) {
        ServiceInfo result = new ServiceInfo();
        result.setName(service.getName());
        result.setGroupName(service.getGroup());
        result.setLastRefTime(System.currentTimeMillis());
        result.setCacheMillis(switchDomain.getDefaultPushCacheMillis());
        return result;
    }

    /**
     * Get all instances of the given Service and refresh its cluster index.
     *
     * @param service service
     * @return java.util.List
     */
    private List<Instance> getAllInstancesFromIndex(Service service) {
        Set<Instance> result = new HashSet<>();
        Set<String> clusters = new HashSet<>();
        for (String each : serviceIndexesManager.getAllClientsRegisteredService(service)) {
            Optional<InstancePublishInfo> instancePublishInfo = getInstanceInfo(each, service);
            if (instancePublishInfo.isPresent()) {
                // Build the instance and apply its metadata
                Instance instance = parseInstance(service, instancePublishInfo.get());
                result.add(instance);
                clusters.add(instance.getClusterName());
            }
        }
        // cache clusters of this service
        serviceClusterIndex.put(service, clusters);
        return new LinkedList<>(result);
    }

    private Optional<InstancePublishInfo> getInstanceInfo(String clientId, Service service) {
        Client client = clientManager.getClient(clientId);
        if (null == client) {
            return Optional.empty();
        }
        return Optional.ofNullable(client.getInstancePublishInfo(service));
    }

    private Instance parseInstance(Service service, InstancePublishInfo instanceInfo) {
        Instance result = InstanceUtil.parseToApiInstance(service, instanceInfo);
        Optional<InstanceMetadata> metadata = metadataManager
                .getInstanceMetadata(service, instanceInfo.getMetadataId());
        metadata.ifPresent(instanceMetadata -> InstanceUtil.updateInstanceMetadata(result, instanceMetadata));
        return result;
    }
}

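The two key methods are getData and getPushData. getData falls back to getPushData when the service is not cached yet. getPushData calls getAllInstancesFromIndex to collect every instance of the Service and refresh its cluster index, so all information about the Service ends up cached in ServiceStorage.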
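The relationship between the two methods is easier to see in a stripped-down form. This sketch assumes services and instances are plain strings; SketchServiceStorage and its publish method are hypothetical stand-ins for the client indexes, and rebuildCount exists only to make the behaviour observable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class SketchServiceStorage {
    // Source of truth: instances currently published per service (stand-in for the client indexes).
    private final Map<String, List<String>> published = new ConcurrentHashMap<>();
    // Read cache: the last snapshot built for each service.
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    // Counts how often a snapshot was rebuilt.
    int rebuildCount = 0;

    void publish(String service, String instance) {
        published.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(instance);
    }

    // Read path: serve the cached snapshot, rebuild only on a cache miss.
    List<String> getData(String service) {
        List<String> cached = cache.get(service);
        return cached != null ? cached : getPushData(service);
    }

    // Refresh path: always rebuild from the source of truth and overwrite the cache.
    List<String> getPushData(String service) {
        List<String> source = published.get(service);
        if (source == null) {
            return new ArrayList<>(); // unknown service: empty result, nothing cached
        }
        List<String> snapshot = new ArrayList<>(source);
        cache.put(service, snapshot);
        rebuildCount++;
        return snapshot;
    }
}
```

One consequence of this shape is that getData can return a stale snapshot until something calls getPushData again, which is why a refresh has to be triggered whenever the underlying registrations change.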

Nacos registration-related events

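Nacos events are divided into regular events and slow events, whose fully qualified class names are com.alibaba.nacos.common.notify.Event and com.alibaba.nacos.common.notify.SlowEvent. Publishers and subscribers are likewise split into multi-event and single-event variants, and the notification center is com.alibaba.nacos.common.notify.NotifyCenter. I will not go into detail here. This section only introduces the event types related to instance registration, when they fire, and who listens to them; the details are listed below.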

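Event: com.alibaba.nacos.naming.core.v2.event.client.ClientOperationEvent.ClientRegisterServiceEvent
Purpose: client registers an instance
Fired when: 1) a client actively sends a request to register an instance; 2) the consistency protocol notifies the server to update the client state
Listener: com.alibaba.nacos.naming.core.v2.index.ClientServiceIndexesManager#handleClientOperation

Event: com.alibaba.nacos.naming.core.v2.event.service.ServiceEvent.ServiceChangedEvent
Purpose: instance changed
Fired when: 1) a client registers an instance; 2) a client removes a registered instance; 3) a client updates instance metadata; 4) a client heartbeat is processed (published only when the instance is currently unhealthy); 5) unhealthy instances are detected
Listeners: 1) com.alibaba.nacos.naming.core.v2.upgrade.doublewrite.delay.DoubleWriteEventListener#onEvent; 2) com.alibaba.nacos.naming.push.v2.NamingSubscriberServiceV2Impl#onEvent

Nacos execute engine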

After the events above fire, a chain of logic eventually reaches the execute engines, which run the tasks. Two engines are involved here (double write and delayed push): com.alibaba.nacos.naming.push.v2.task.PushDelayTaskExecuteEngine and com.alibaba.nacos.naming.core.v2.upgrade.doublewrite.delay.DoubleWriteDelayTaskEngine.

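Looking at the class hierarchy of the execute engines, both extend com.alibaba.nacos.common.task.engine.NacosDelayTaskExecuteEngine. They share the same parent class and differ only in the execution logic each one defines.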

Double-write execute engine

@Component
public class DoubleWriteDelayTaskEngine extends NacosDelayTaskExecuteEngine {

    public DoubleWriteDelayTaskEngine() {
        // Engine name and logger
        super(DoubleWriteDelayTaskEngine.class.getSimpleName(), Loggers.SRV_LOG);
        // Add the task processor for v1
        addProcessor("v1", new ServiceChangeV1Task.ServiceChangeV1TaskProcessor());
        // Add the task processor for v2
        addProcessor("v2", new ServiceChangeV2Task.ServiceChangeV2TaskProcessor());
    }

    @Override
    public NacosTaskProcessor getProcessor(Object key) {
        String actualKey = key.toString().split(":")[0];
        return super.getProcessor(actualKey);
    }
}

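The constructor shows that the double-write engine registers two task processors, v1 and v2, to ensure a smooth version upgrade. Once the cluster has been upgraded and is stable, double write can be turned off, as the Nacos upgrade documentation also mentions (Nacos 2.0 upgrade guide).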
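The overridden getProcessor is what ties a task to one of the two processors: only the part of the key before the first colon is used for the lookup. A small sketch of that routing follows; SketchProcessorRegistry is a hypothetical name, processors are modelled as plain functions, and the key layout "v2:some-service" is an assumption used for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class SketchProcessorRegistry {
    // Processors keyed by version prefix; modelled as functions for brevity.
    private final Map<String, Function<String, String>> processors = new ConcurrentHashMap<>();

    void addProcessor(String key, Function<String, String> processor) {
        processors.put(key, processor);
    }

    // Task keys are assumed to look like "v1:..." or "v2:..."; only the prefix selects the processor.
    Function<String, String> getProcessor(Object key) {
        String actualKey = key.toString().split(":")[0];
        return processors.get(actualKey);
    }
}
```

With this routing, many distinct task keys can exist at once (one per service and direction) while the engine still needs only two processors.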

public class PushDelayTaskExecuteEngine extends NacosDelayTaskExecuteEngine {

    /**
     * Client manager.
     */
    private final ClientManager clientManager;

    /**
     * Client service index manager.
     */
    private final ClientServiceIndexesManager indexesManager;

    /**
     * Data storage.
     */
    private final ServiceStorage serviceStorage;

    /**
     * Metadata manager.
     */
    private final NamingMetadataManager metadataManager;

    /**
     * Push executor.
     */
    private final PushExecutor pushExecutor;

    private final SwitchDomain switchDomain;

    public PushDelayTaskExecuteEngine(ClientManager clientManager, ClientServiceIndexesManager indexesManager,
            ServiceStorage serviceStorage, NamingMetadataManager metadataManager, PushExecutor pushExecutor,
            SwitchDomain switchDomain) {
        super(PushDelayTaskExecuteEngine.class.getSimpleName(), Loggers.PUSH);
        this.clientManager = clientManager;
        this.indexesManager = indexesManager;
        this.serviceStorage = serviceStorage;
        this.metadataManager = metadataManager;
        this.pushExecutor = pushExecutor;
        this.switchDomain = switchDomain;
        // Install a custom default task processor
        setDefaultTaskProcessor(new PushDelayTaskProcessor(this));
    }

    public ClientManager getClientManager() {
        return clientManager;
    }

    public ClientServiceIndexesManager getIndexesManager() {
        return indexesManager;
    }

    public ServiceStorage getServiceStorage() {
        return serviceStorage;
    }

    public NamingMetadataManager getMetadataManager() {
        return metadataManager;
    }

    public PushExecutor getPushExecutor() {
        return pushExecutor;
    }

    @Override
    protected void processTasks() {
        if (!switchDomain.isPushEnabled()) {
            return;
        }
        super.processTasks();
    }

    /**
     * Custom default processor.
     */
    private static class PushDelayTaskProcessor implements NacosTaskProcessor {

        private final PushDelayTaskExecuteEngine executeEngine;

        public PushDelayTaskProcessor(PushDelayTaskExecuteEngine executeEngine) {
            this.executeEngine = executeEngine;
        }

        @Override
        public boolean process(NacosTask task) {
            PushDelayTask pushDelayTask = (PushDelayTask) task;
            Service service = pushDelayTask.getService();
            // Dispatch the task
            NamingExecuteTaskDispatcher.getInstance()
                    .dispatchAndExecuteTask(service, new PushExecuteTask(service, executeEngine, pushDelayTask));
            return true;
        }
    }
}

Nacos task dispatcher

public class NamingExecuteTaskDispatcher {

    private static final NamingExecuteTaskDispatcher INSTANCE = new NamingExecuteTaskDispatcher();

    private final NacosExecuteTaskExecuteEngine executeEngine;

    private NamingExecuteTaskDispatcher() {
        // Nacos execute task engine
        executeEngine = new NacosExecuteTaskExecuteEngine(EnvUtil.FUNCTION_MODE_NAMING, Loggers.SRV_LOG);
    }

    public static NamingExecuteTaskDispatcher getInstance() {
        return INSTANCE;
    }

    /**
     * Add a task to the execute engine.
     *
     * @param dispatchTag decides which worker the task is assigned to
     * @param task        task
     */
    public void dispatchAndExecuteTask(Object dispatchTag, AbstractExecuteTask task) {
        executeEngine.addTask(dispatchTag, task);
    }

    public String workersStatus() {
        return executeEngine.workersStatus();
    }
}

Here the task is handed on to a Nacos task queue, so that all tasks are run by the Nacos execute task engine.


NacosExecuteTaskExecuteEngine.java

public class NacosExecuteTaskExecuteEngine extends AbstractNacosTaskExecuteEngine<AbstractExecuteTask> {

    private final TaskExecuteWorker[] executeWorkers;

    public NacosExecuteTaskExecuteEngine(String name, Logger logger) {
        // Thread count is derived from the CPU core count: the smallest power of two
        // that is greater than or equal to cores * threadMultiple
        this(name, logger, ThreadUtils.getSuitableThreadCount(1));
    }

    public NacosExecuteTaskExecuteEngine(String name, Logger logger, int dispatchWorkerCount) {
        super(logger);
        executeWorkers = new TaskExecuteWorker[dispatchWorkerCount];
        for (int mod = 0; mod < dispatchWorkerCount; ++mod) {
            executeWorkers[mod] = new TaskExecuteWorker(name, mod, dispatchWorkerCount, getEngineLog());
        }
    }

    @Override
    public int size() {
        int result = 0;
        for (TaskExecuteWorker each : executeWorkers) {
            result += each.pendingTaskCount();
        }
        return result;
    }

    @Override
    public boolean isEmpty() {
        return 0 == size();
    }

    @Override
    public void addTask(Object tag, AbstractExecuteTask task) {
        NacosTaskProcessor processor = getProcessor(tag);
        // If a custom processor is registered for this tag, let it handle the task
        if (null != processor) {
            processor.process(task);
            return;
        }
        // Pick the worker that will run the task
        TaskExecuteWorker worker = getWorker(tag);
        // Hand the task to the worker
        worker.process(task);
    }

    /**
     * Decide which worker runs the task based on the tag.
     *
     * @param tag tag
     * @return worker
     */
    private TaskExecuteWorker getWorker(Object tag) {
        // Guarantees that idx is in the range 0 to workersCount - 1
        int idx = (tag.hashCode() & Integer.MAX_VALUE) % workersCount();
        return executeWorkers[idx];
    }

    private int workersCount() {
        return executeWorkers.length;
    }

    @Override
    public AbstractExecuteTask removeTask(Object key) {
        throw new UnsupportedOperationException("ExecuteTaskEngine do not support remove task");
    }

    @Override
    public Collection<Object> getAllTaskKeys() {
        throw new UnsupportedOperationException("ExecuteTaskEngine do not support get all task keys");
    }

    @Override
    public void shutdown() throws NacosException {
        for (TaskExecuteWorker each : executeWorkers) {
            each.shutdown();
        }
    }

    /**
     * Get workers status.
     *
     * @return workers status string
     */
    public String workersStatus() {
        StringBuilder sb = new StringBuilder();
        for (TaskExecuteWorker worker : executeWorkers) {
            sb.append(worker.status()).append('\n');
        }
        return sb.toString();
    }
}

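The main job of this class is to compute the thread count from the number of CPU cores and to pick the worker for each task: addTask selects the worker and then calls worker.process(task). As you would expect, each worker is a thread that loops, taking tasks from a blocking queue and running them.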
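The three ideas in this paragraph (a power-of-two worker count, hash-based worker selection, and a thread draining a blocking queue) fit into a short sketch. SketchWorker and SketchExecuteEngine are hypothetical names; passing the core count as a parameter, the queue capacity of 1024, daemon threads and the use of offer instead of put are simplifications chosen here, not details of the Nacos classes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// One thread plus one bounded queue; tasks submitted to a worker run in order.
class SketchWorker {
    private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1024);
    private volatile boolean closed = false;
    private final Thread thread;

    SketchWorker(String name) {
        thread = new Thread(this::loop, name);
        thread.setDaemon(true);
        thread.start();
    }

    private void loop() {
        while (!closed) {
            try {
                queue.take().run(); // blocks until a task is available
            } catch (InterruptedException e) {
                return; // shutdown() interrupts the blocking take
            } catch (RuntimeException e) {
                // A failing task must not kill the worker thread.
                System.err.println("task failed: " + e);
            }
        }
    }

    // Returns false when the queue is full instead of blocking the caller.
    boolean submit(Runnable task) {
        return queue.offer(task);
    }

    void shutdown() {
        closed = true;
        thread.interrupt();
    }
}

class SketchExecuteEngine {
    private final SketchWorker[] workers;

    SketchExecuteEngine(int cores) {
        workers = new SketchWorker[suitableThreadCount(cores, 1)];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new SketchWorker("sketch-worker-" + i);
        }
    }

    // Smallest power of two that is >= cores * multiple.
    static int suitableThreadCount(int cores, int multiple) {
        int count = 1;
        while (count < cores * multiple) {
            count <<= 1;
        }
        return count;
    }

    // Masking with Integer.MAX_VALUE clears the sign bit, so the index is never negative.
    static int workerIndex(Object tag, int workerCount) {
        return (tag.hashCode() & Integer.MAX_VALUE) % workerCount;
    }

    // The same tag always maps to the same worker.
    boolean addTask(Object tag, Runnable task) {
        return workers[workerIndex(tag, workers.length)].submit(task);
    }

    void shutdown() {
        for (SketchWorker worker : workers) {
            worker.shutdown();
        }
    }
}
```

Because a given tag always lands on the same single-threaded worker, tasks for one service run one after another in submission order, while tasks for different services can run in parallel on other workers.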

TaskExecuteWorker.java

public final class TaskExecuteWorker implements NacosTaskProcessor, Closeable {

    /**
     * Max task queue size 32768.
     */
    private static final int QUEUE_CAPACITY = 1 << 15;

    private final Logger log;

    private final String name;

    /**
     * Blocking queue.
     */
    private final BlockingQueue<Runnable> queue;

    private final AtomicBoolean closed;

    public TaskExecuteWorker(final String name, final int mod, final int total) {
        this(name, mod, total, null);
    }

    public TaskExecuteWorker(final String name, final int mod, final int total, final Logger logger) {
        this.name = name + "_" + mod + "%" + total;
        this.queue = new ArrayBlockingQueue<Runnable>(QUEUE_CAPACITY);
        this.closed = new AtomicBoolean(false);
        this.log = null == logger ? LoggerFactory.getLogger(TaskExecuteWorker.class) : logger;
        // Start the worker thread
        new InnerWorker(name).start();
    }

    public String getName() {
        return name;
    }

    /**
     * Put the task into the queue.
     *
     * @param task task
     * @return true (the task is queued when it is an AbstractExecuteTask)
     */
    @Override
    public boolean process(NacosTask task) {
        if (task instanceof AbstractExecuteTask) {
            putTask((Runnable) task);
        }
        return true;
    }

    private void putTask(Runnable task) {
        try {
            queue.put(task);
        } catch (InterruptedException ire) {
            log.error(ire.toString(), ire);
        }
    }

    public int pendingTaskCount() {
        return queue.size();
    }

    /**
     * Worker status.
     */
    public String status() {
        return name + ", pending tasks: " + pendingTaskCount();
    }

    @Override
    public void shutdown() throws NacosException {
        queue.clear();
        closed.compareAndSet(false, true);
    }

    /**
     * Task runner: loops, taking tasks from the blocking queue and running them.
     */
    private class InnerWorker extends Thread {

        InnerWorker(String name) {
            setDaemon(false);
            setName(name);
        }

        @Override
        public void run() {
            while (!closed.get()) {
                try {
                    Runnable task = queue.take();
                    long begin = System.currentTimeMillis();
                    task.run();
                    long duration = System.currentTimeMillis() - begin;
                    if (duration > 1000L) {
                        log.warn("task {} takes {}ms", task, duration);
                    }
                } catch (Throwable e) {
                    log.error("[TASK-FAILED] " + e.toString(), e);
                }
            }
        }
    }
}

Summary

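This article described the differences between Nacos 1.X and 2.X and the advantages of 2.X over 1.X, and used the source code to show where those advantages come from. Reading it is a good way to learn how Nacos is designed, and it also helps with our own day-to-day coding. The code uses many design patterns (template method, observer, delegation, proxy, singleton, factory, strategy), decouples components through events (the observer pattern), and relies on asynchronous programming to keep the program as responsive as possible. I believe that reading the source carefully will noticeably improve how we design code and write multithreaded programs.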