This article is based on Sleuth 2.2.5.

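Sleuth is a distributed tracing tool. From the information it prints in the logs you can reconstruct the call chain of a request and see how long each service in the chain took. In production this helps with tracking down slow services, understanding the call relationships between services, and service governance.
When I first used Sleuth, I could get correct results by following online tutorials, but I had no real grasp of how it works or how to use it beyond the basics. So I tried reading the source. The code base is not large, but it is hard to follow. After some time with the source and some material found online, I have a partial understanding of the principles and of what some of the classes do, and this article writes that up.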

Contents

I. Concepts
II. Scenario
III. How it works
- 1. The spring.factories file
- 2. Log output
- 3. The TracingFilter filter
- 4. Intercepting RestTemplate
IV. Sampling trace data
V. Summary

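I. Concepts

First, a few concepts.
span: the basic unit of work in Sleuth. When a microservice receives a request it creates a span and generates a span id. The span id is a 64-bit random number, which Sleuth converts to a hexadecimal string and prints in the logs. The implementing class is RealSpan.
trace id: stays the same along the whole call chain, while each microservice the request passes through generates a new span id. The trace id therefore lets you find every microservice on the call chain. It is 64 bits by default, and spring.sleuth.traceId128=true switches it to 128 bits. In the first service of the chain, the span id and the trace id have the same value.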
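
To make the id format concrete, here is a minimal sketch of generating a 64-bit id and rendering it as the 16-character hexadecimal string that shows up in the logs. The class and method names (TraceIdSketch, toLowerHex, nextId) are made up for illustration and are not Sleuth or Brave APIs.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration class; not part of Sleuth or Brave.
final class TraceIdSketch {

    // Render a 64-bit id as 16 lower-case hex characters, zero padded.
    static String toLowerHex(long id) {
        return String.format("%016x", id);
    }

    // Pick a random non-zero 64-bit id.
    static long nextId() {
        long id;
        do {
            id = ThreadLocalRandom.current().nextLong();
        } while (id == 0L);
        return id;
    }

    public static void main(String[] args) {
        long rootSpanId = nextId();
        long traceId = rootSpanId;        // first service: trace id equals span id
        long downstreamSpanId = nextId(); // next service: new span id, same trace id
        System.out.println("trace=" + toLowerHex(traceId) + " span=" + toLowerHex(rootSpanId));
        System.out.println("trace=" + toLowerHex(traceId) + " span=" + toLowerHex(downstreamSpanId));
    }
}
```

The two printed lines share the trace value, which is what makes it possible to search the logs of several services for one request.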

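Sleuth does not trace every kind of call. At present it supports: rxjava, feign, quartz, RestTemplate, zuul, hystrix, grpc, kafka, Opentracing, redis, Reactor, circuitbreaker, and Spring's @Scheduled. Dubbo, which is widely used in China, is not instrumented by sleuth-core itself; tracing it requires adding Brave's separate Dubbo instrumentation.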

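II. Scenario

This article uses HTTP calls to explain how Sleuth works. In the scenario, a browser sends a start request, and the service handling it uses RestTemplate to call another service, end. The code: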

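import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.client.RestTemplate;

@RestController
public class TestController {
    private static final Logger log = LoggerFactory.getLogger(TestController.class);
    @Autowired
    private RestTemplate restTemplate;
    @RequestMapping("start")
    public String start(){
        log.info("start: request received");
        //call the end endpoint of the instance listening on port 8081
        restTemplate.getForObject("http://localhost:8081/end",String.class);
        log.info("start: request handled");
        return "1";
    }
    @RequestMapping("end")
    public String end(){
        log.info("end: request received");
        log.info("end: request handled");
        return "2";
    }
    @Bean
    public RestTemplate getRestTemplate(){
        return new RestTemplate();
    }
}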

The sections below follow this scenario through Sleuth's execution.

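III. How it works

1. The spring.factories file

When Spring Boot starts, it runs the auto-configuration classes. Sleuth's are listed in the spring.factories file of the spring-cloud-sleuth-core jar.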

org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
# The next three auto-configuration classes are the foundation and run in every scenario
org.springframework.cloud.sleuth.annotation.SleuthAnnotationAutoConfiguration,\
org.springframework.cloud.sleuth.autoconfig.TraceAutoConfiguration,\
org.springframework.cloud.sleuth.propagation.SleuthTagPropagationAutoConfiguration,\
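# Each auto-configuration class below targets a specific framework or middleware. For example,
# TraceWebClientAutoConfiguration creates interceptors for RestTemplate, WebClient and similar clients, which add the trace information to the headers of outgoing requests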
org.springframework.cloud.sleuth.instrument.web.TraceHttpAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.web.TraceWebServletAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.web.client.TraceWebClientAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.web.client.TraceWebAsyncClientAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.async.AsyncAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.async.AsyncCustomAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.async.AsyncDefaultAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.scheduling.TraceSchedulingAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.web.client.feign.TraceFeignClientAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.hystrix.SleuthHystrixAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.circuitbreaker.SleuthCircuitBreakerAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.rxjava.RxJavaAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.reactor.TraceReactorAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.web.TraceWebFluxAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.zuul.TraceZuulAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.rpc.TraceRpcAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.grpc.TraceGrpcAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.messaging.SleuthKafkaStreamsConfiguration,\
org.springframework.cloud.sleuth.instrument.messaging.TraceMessagingAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.messaging.TraceSpringIntegrationAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.messaging.TraceSpringMessagingAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.messaging.websocket.TraceWebSocketAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.opentracing.OpentracingAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.redis.TraceRedisAutoConfiguration,\
org.springframework.cloud.sleuth.instrument.quartz.TraceQuartzAutoConfiguration
# The TraceEnvironmentPostProcessor post-processor is related to log output
org.springframework.boot.env.EnvironmentPostProcessor=\
org.springframework.cloud.sleuth.autoconfig.TraceEnvironmentPostProcessor

2. Log output

Start with the post-processor TraceEnvironmentPostProcessor. It deals with log output: if the application does not set a log pattern, this class sets a default one. The class is fairly simple.

public void postProcessEnvironment(ConfigurableEnvironment environment,
            SpringApplication application) {
        Map<String, Object> map = new HashMap<String, Object>();
        // This doesn't work with all logging systems but it's a useful default so you see
        // traces in logs without having to configure it.
        //store the log pattern in the map
        //the log prefix has four parts: application name, trace id, span id, and whether the span is exported to Zipkin
        if (Boolean
                .parseBoolean(environment.getProperty("spring.sleuth.enabled", "true"))) {
            map.put("logging.pattern.level", "%5p [${spring.zipkin.service.name:"
                    + "${spring.application.name:}},%X{X-B3-TraceId:-},%X{X-B3-SpanId:-},%X{X-Span-Export:-}]");
        }
        addOrReplace(environment.getPropertySources(), map);
    }
    //the method below adds the log pattern to the default configuration
    //so that it is used when the application has not set a pattern itself
    private void addOrReplace(MutablePropertySources propertySources,
            Map<String, Object> map) {
        MapPropertySource target = null;
        if (propertySources.contains(PROPERTY_SOURCE_NAME)) {
            PropertySource<?> source = propertySources.get(PROPERTY_SOURCE_NAME);
            if (source instanceof MapPropertySource) {
                target = (MapPropertySource) source;
                for (String key : map.keySet()) {
                    if (!target.containsProperty(key)) {
                        target.getSource().put(key, map.get(key));
                    }
                }
            }
        }
        if (target == null) {
            target = new MapPropertySource(PROPERTY_SOURCE_NAME, map);
        }
        if (!propertySources.contains(PROPERTY_SOURCE_NAME)) {
            propertySources.addLast(target);
        }
    }
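
Why appending the defaults last lets an application setting win can be shown with a small sketch. It models property sources as an ordered list of maps that are searched front to back; this is a simplification of how a Spring Environment resolves keys, and DefaultPropertySketch with its methods is a hypothetical name, not Spring API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an ordered list of property sources.
final class DefaultPropertySketch {

    // Sources are searched front to back; the first one holding the key wins.
    static String resolve(List<Map<String, String>> sources, String key) {
        for (Map<String, String> source : sources) {
            if (source.containsKey(key)) {
                return source.get(key);
            }
        }
        return null;
    }

    // Application settings come first, the defaults are appended last.
    static List<Map<String, String>> withDefaults(Map<String, String> application,
                                                  Map<String, String> defaults) {
        List<Map<String, String>> sources = new ArrayList<>();
        sources.add(application);
        sources.add(defaults); // same role as propertySources.addLast(target)
        return sources;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("logging.pattern.level", "sleuth-default-pattern");
        Map<String, String> application = new HashMap<>();
        // Nothing configured by the application: the default is used.
        System.out.println(resolve(withDefaults(application, defaults), "logging.pattern.level"));
        // Once the application sets the key, its value takes precedence.
        application.put("logging.pattern.level", "%5p");
        System.out.println(resolve(withDefaults(application, defaults), "logging.pattern.level"));
    }
}
```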

With the log pattern set, look next at SleuthLogAutoConfiguration, which is also related to log output. Its Javadoc reads:

* {@link Configuration} that adds a {@link Slf4jScopeDecorator} that prints tracing information in the logs.

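That is, SleuthLogAutoConfiguration adds a Slf4jScopeDecorator to the configuration so that tracing information can be printed in the logs. The comment also shows that SleuthLogAutoConfiguration integrates only with Slf4j. The code: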

@Configuration(proxyBeanMethods = false)
    @ConditionalOnClass(MDC.class)
    @EnableConfigurationProperties(SleuthSlf4jProperties.class)
    public static class Slf4jConfiguration {
        @Bean
        @ConditionalOnProperty(value = "spring.sleuth.log.slf4j.enabled",
                matchIfMissing = true)
        static CurrentTraceContext.ScopeDecorator slf4jSpanDecorator(
                SleuthProperties sleuthProperties,
                SleuthSlf4jProperties sleuthSlf4jProperties) {
            return new Slf4jScopeDecorator(sleuthProperties, sleuthSlf4jProperties);
        }
    }

slf4jSpanDecorator creates a Slf4jScopeDecorator object and registers it in the Spring container.
Below is the Slf4jScopeDecorator constructor.

Slf4jScopeDecorator(SleuthProperties sleuthProperties,
            SleuthSlf4jProperties sleuthSlf4jProperties) {
        //the arguments of the four add calls below are the fields that can be printed in the logs
        CorrelationScopeDecorator.Builder builder = MDCScopeDecorator.newBuilder().clear()
                .add(SingleCorrelationField.create(BaggageFields.TRACE_ID))
                .add(SingleCorrelationField.create(BaggageFields.PARENT_ID))
                .add(SingleCorrelationField.create(BaggageFields.SPAN_ID))
                .add(SingleCorrelationField.newBuilder(BaggageFields.SAMPLED)
                        .name("spanExportable").build());
        Set<String> whitelist = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        whitelist.addAll(sleuthSlf4jProperties.getWhitelistedMdcKeys());
        //besides the four fields defined by Sleuth, the application can define its own keys to print or to send to Zipkin
        Set<String> retained = new LinkedHashSet<>();
        retained.addAll(sleuthProperties.getBaggageKeys());
        retained.addAll(sleuthProperties.getLocalKeys());
        retained.addAll(sleuthProperties.getPropagationKeys());
        retained.retainAll(whitelist);
        for (String name : retained) {
            builder.add(SingleCorrelationField.newBuilder(BaggageField.create(name))
                    .dirty().build());
        }
        this.delegate = builder.build();
    }
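
The whitelist step in the constructor relies on a detail that is easy to miss: retainAll checks membership against a TreeSet built with String.CASE_INSENSITIVE_ORDER, so configured keys are kept even when their case differs from the whitelist entry. A minimal sketch of just that step, with hypothetical names (WhitelistSketch, retained) and example keys that are assumptions:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of the case-insensitive whitelist filtering.
final class WhitelistSketch {

    // Keep only the configured keys that appear in the whitelist, ignoring case.
    static Set<String> retained(List<String> configuredKeys, List<String> whitelistedMdcKeys) {
        Set<String> whitelist = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        whitelist.addAll(whitelistedMdcKeys);
        Set<String> retained = new LinkedHashSet<>(configuredKeys);
        // contains() on the TreeSet uses the case-insensitive comparator.
        retained.retainAll(whitelist);
        return retained;
    }

    public static void main(String[] args) {
        Set<String> kept = retained(
                Arrays.asList("Country-Code", "user-id"),
                Arrays.asList("country-code"));
        System.out.println(kept);
    }
}
```

Only keys that survive this filter are registered as additional correlation fields and therefore end up in the MDC.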

At run time the class's decorateScope method is also called to put the values to be printed into the MDC. decorateScope is covered later in this article.

3. The TracingFilter filter

In a web environment, Spring Boot creates a TracingFilter automatically. This happens in TraceWebServletAutoConfiguration:

@Bean
    @ConditionalOnMissingBean
    public TracingFilter tracingFilter(HttpTracing tracing) {
        return (TracingFilter) TracingFilter.create(tracing);
    }

TracingFilter implements the javax.servlet.Filter interface. Spring registers it as a web filter in the web container, so it intercepts every incoming HTTP request. Here is the doFilter method (abridged):

public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
    throws IOException, ServletException {
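    //code omitted
    //The next line creates the Span object. If this is the first service in the call chain,
    //a trace id and a span id are generated. Otherwise the request headers are read and
    //the trace id and span id found there are set on the Span object.
    //The Span implementation class is RealSpan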
    Span span = handler.handleReceive(new HttpServletRequestWrapper(req));
    // Add attributes for explicit access to customization or span context
    request.setAttribute(SpanCustomizer.class.getName(), span.customizer());
    request.setAttribute(TraceContext.class.getName(), span.context());
    SendHandled sendHandled = new SendHandled();
    request.setAttribute(SendHandled.class.getName(), sendHandled);
    Throwable error = null;
    //newScope puts the Span's trace id, span id, etc. into slf4j's MDC
    Scope scope = currentTraceContext.newScope(span.context());
    try {
      // any downstream code can see Tracer.currentSpan() or use Tracer.currentSpanCustomizer()
      chain.doFilter(req, res);
    } catch (Throwable e) {
      error = e;
      throw e;
    } finally {
      // When async, even if we caught an exception, we don't have the final response: defer
      if (servlet.isAsync(req)) {
        servlet.handleAsync(handler, req, res, span);
      } else if (sendHandled.compareAndSet(false, true)){
        // we have a synchronous response or error: finish the span
        HttpServerResponse responseWrapper = HttpServletResponseWrapper.create(req, res, error);
        handler.handleSend(responseWrapper, span);
      }
      scope.close();
    }
  }

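In doFilter, when the Span object is created for the first service in the call chain, the span id is a random long and the trace id is set equal to the span id. For any later service, the trace id, span id, and parent span id from the headers are set directly on the new Span object.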
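
The receive step just described can be sketched in a few lines: start a new trace when the request carries no trace headers, otherwise reuse the ids from the headers. This is a simplified model and not Brave's extraction code; ServerReceiveSketch, Ids, and receive are hypothetical names, and the header map is assumed to use lower-case keys.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical model of the server-side "receive" decision.
final class ServerReceiveSketch {

    // Immutable holder for the ids of one span (hypothetical type).
    static final class Ids {
        final String traceId;
        final String spanId;
        final String parentId; // null for the first span in a trace

        Ids(String traceId, String spanId, String parentId) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentId = parentId;
        }
    }

    static String newId() {
        return String.format("%016x", ThreadLocalRandom.current().nextLong());
    }

    // No trace headers: start a new trace. Otherwise reuse the incoming ids.
    static Ids receive(Map<String, String> headers) {
        String traceId = headers.get("x-b3-traceid");
        String spanId = headers.get("x-b3-spanid");
        if (traceId == null || spanId == null) {
            String id = newId();
            return new Ids(id, id, null); // first service: trace id equals span id
        }
        return new Ids(traceId, spanId, headers.get("x-b3-parentspanid"));
    }

    public static void main(String[] args) {
        Ids first = receive(new HashMap<>());
        System.out.println(first.traceId + " " + first.spanId + " " + first.parentId);
    }
}
```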
Putting the trace id and span id into the MDC is where the Slf4jScopeDecorator.decorateScope method mentioned earlier is called.

public Scope decorateScope(TraceContext context, Scope scope) {
        return LEGACY_IDS.decorateScope(context, delegate.decorateScope(context, scope));
    }

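delegate.decorateScope reads, from its context argument, the values of the fields that were registered through the add calls in the Slf4jScopeDecorator constructor, and stores them in the MDC with MDC.put. Log statements can then print them.
The MDC holds these keys: X-B3-TraceId, X-B3-SpanId, X-Span-Export, X-B3-ParentSpanId, traceId, spanId, spanExportable, parentId.
At this point the span has been created and log output is set up, so every log line the application writes from here on can show the span id, trace id, and so on.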
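
The scope idea behind this can be sketched without slf4j: opening a scope writes the current ids into a per-thread map, and closing it restores whatever was there before. The sketch below uses a plain ThreadLocal map in place of the real MDC, and MdcScopeSketch, Scope, and newScope are hypothetical names rather than Brave's types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "put ids into the MDC, restore them on close".
final class MdcScopeSketch {

    // Per-thread logging context, playing the role of slf4j's MDC.
    private static final ThreadLocal<Map<String, String>> CONTEXT =
            ThreadLocal.withInitial(HashMap::new);

    static String get(String key) {
        return CONTEXT.get().get(key);
    }

    // Closing a scope restores the values that were present before it opened.
    interface Scope extends AutoCloseable {
        @Override
        void close();
    }

    static Scope newScope(String traceId, String spanId) {
        Map<String, String> current = CONTEXT.get();
        Map<String, String> previous = new HashMap<>(current); // snapshot for restore
        current.put("traceId", traceId);
        current.put("spanId", spanId);
        return () -> {
            current.clear();
            current.putAll(previous);
        };
    }

    public static void main(String[] args) {
        try (Scope server = newScope("t1", "s1")) {
            System.out.println("server span: " + get("spanId"));
            try (Scope client = newScope("t1", "s2")) {
                System.out.println("client span: " + get("spanId"));
            }
            System.out.println("back to: " + get("spanId"));
        }
    }
}
```

Nesting scopes this way is what lets a log line written during an outgoing call show the client span id while lines before and after it show the server span id.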

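4. Intercepting RestTemplate

Next, how Sleuth intercepts RestTemplate and adds the span id and related information to the headers of the HTTP request.
At startup Spring Boot runs the TraceWebClientAutoConfiguration auto-configuration, which contains an inner class TraceRestTemplateBeanPostProcessor: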

class TraceRestTemplateBeanPostProcessor implements BeanPostProcessor {
    //the Spring container
    private final BeanFactory beanFactory;
    TraceRestTemplateBeanPostProcessor(BeanFactory beanFactory) {
        this.beanFactory = beanFactory;
    }
    @Override
    public Object postProcessBeforeInitialization(Object bean, String beanName)
            throws BeansException {
        return bean;
    }
    @Override
    //this post-processor runs after a bean has been initialized
    public Object postProcessAfterInitialization(Object bean, String beanName)
            throws BeansException {
        if (bean instanceof RestTemplate) {
            //if the bean is a RestTemplate, process it further
            //the inject method is shown below
            RestTemplate rt = (RestTemplate) bean;
            new RestTemplateInterceptorInjector(interceptor()).inject(rt);
        }
        return bean;
    }
    //returns the interceptor that is added to the RestTemplate to intercept HTTP requests
    private LazyTracingClientHttpRequestInterceptor interceptor() {
        return new LazyTracingClientHttpRequestInterceptor(this.beanFactory);
    }
}

Below is the inject method of RestTemplateInterceptorInjector:

void inject(RestTemplate restTemplate) {
        if (hasTraceInterceptor(restTemplate)) {
            return;
        }
        List<ClientHttpRequestInterceptor> interceptors = new ArrayList<ClientHttpRequestInterceptor>(
                restTemplate.getInterceptors());
        interceptors.add(0, this.interceptor);
        //set the interceptors on the restTemplate object; the one added is LazyTracingClientHttpRequestInterceptor,
        //and it is the first interceptor to be called
        restTemplate.setInterceptors(interceptors);
    }
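
The two properties of inject that matter, "at most once" and "always first", can be shown with plain lists. In this sketch Interceptor, TracingInterceptor, and OtherInterceptor are hypothetical marker types standing in for Spring's ClientHttpRequestInterceptor implementations.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of idempotent, front-of-list interceptor injection.
final class InterceptorInjectSketch {

    interface Interceptor {}
    static final class TracingInterceptor implements Interceptor {}
    static final class OtherInterceptor implements Interceptor {}

    // Return a list with the tracing interceptor first, unless one is already present.
    static List<Interceptor> inject(List<Interceptor> existing, TracingInterceptor tracing) {
        for (Interceptor interceptor : existing) {
            if (interceptor instanceof TracingInterceptor) {
                return existing; // already instrumented, nothing to do
            }
        }
        List<Interceptor> result = new ArrayList<>(existing);
        result.add(0, tracing); // index 0: runs before every other interceptor
        return result;
    }

    public static void main(String[] args) {
        List<Interceptor> interceptors = new ArrayList<>();
        interceptors.add(new OtherInterceptor());
        List<Interceptor> once = inject(interceptors, new TracingInterceptor());
        List<Interceptor> twice = inject(once, new TracingInterceptor());
        System.out.println(once.size() + " " + twice.size());
    }
}
```

Being first means the trace headers are already on the request when any later interceptor sees it.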

With the interceptor in place, every HTTP request sent through the RestTemplate passes through it.
Here is how the interceptor works.

class LazyTracingClientHttpRequestInterceptor implements ClientHttpRequestInterceptor {
    private final BeanFactory beanFactory;
    private TracingClientHttpRequestInterceptor interceptor;
    LazyTracingClientHttpRequestInterceptor(BeanFactory beanFactory) {
        this.beanFactory = beanFactory;
    }
    @Override
    //an outgoing request is intercepted by this method first
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
            ClientHttpRequestExecution execution) throws IOException {
        //delegate to the intercept method of TracingClientHttpRequestInterceptor
        return interceptor().intercept(request, body, execution);
    }
    private TracingClientHttpRequestInterceptor interceptor() {
        if (this.interceptor == null) {
            this.interceptor = this.beanFactory
                    .getBean(TracingClientHttpRequestInterceptor.class);
        }
        return this.interceptor;
    }
}
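
The "lazy" part of this class is a deferred lookup: the real interceptor is fetched from the container on first use instead of at construction time, presumably so that the post-processor does not force the tracing beans to be created early. A minimal sketch of that pattern, where a Supplier stands in for beanFactory.getBean(...) and LazyLookupSketch is a hypothetical name:

```java
import java.util.function.Supplier;

// Hypothetical model of "look the delegate up on first use, then cache it".
final class LazyLookupSketch<T> {

    private final Supplier<T> lookup; // stands in for beanFactory.getBean(...)
    private T cached;
    int lookups; // counts how often the lookup ran, for demonstration only

    LazyLookupSketch(Supplier<T> lookup) {
        this.lookup = lookup;
    }

    // Like the original, this is not synchronized; a race only repeats the lookup.
    T get() {
        if (cached == null) {
            cached = lookup.get();
            lookups++;
        }
        return cached;
    }

    public static void main(String[] args) {
        LazyLookupSketch<String> lazy = new LazyLookupSketch<>(() -> "real interceptor");
        System.out.println(lazy.lookups); // nothing resolved yet
        lazy.get();
        lazy.get();
        System.out.println(lazy.lookups); // resolved exactly once
    }
}
```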

Below is the intercept method of TracingClientHttpRequestInterceptor:

@Override 
public ClientHttpResponse intercept(HttpRequest req, byte[] body,
    ClientHttpRequestExecution execution) throws IOException {
    HttpRequestWrapper request = new HttpRequestWrapper(req);
    //create a Span object with a newly generated span id and the same trace id,
    //and add the span id, trace id, etc. to the HTTP headers
    Span span = handler.handleSend(request);
    ClientHttpResponse response = null;
    Throwable error = null;
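    //newScope below updates the MDC: once currentTraceContext.newScope has run,
    //the span id in the MDC changes. The scope ws records the values from before and
    //after the change, and when the remote service returns, Sleuth calls the close
    //method of Multiple, which puts the earlier values back into the MDC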
    try (Scope ws = currentTraceContext.newScope(span.context())) {
        //call the remote service
      return response = execution.execute(req, body);
    } catch (Throwable e) {
      error = e;
      throw e;
    } finally {
      handler.handleReceive(new ClientHttpResponseWrapper(request, response, error), span);
    }
  }

On every request, the interceptor puts the following into the HTTP request headers:

x-b3-traceid = 7416922facfd03af    #trace id of the current call chain
x-b3-spanid = 73c6727b0a44195b     #span id
x-b3-parentspanid = 78bd09f345e0d7aa    #span id of the previous service in the call chain
x-b3-sampled = 1                   #sampling flag; 1 means the call data is sent to Zipkin
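
How these four values relate to the span that is current when the call is made can be sketched as follows: the trace id is copied, a fresh span id is generated for the client call, and the current span id becomes the parent. ClientSendSketch and childHeaders are hypothetical names, and the code models only the header values, not Brave's injection logic.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical model of the headers written for an outgoing call.
final class ClientSendSketch {

    static Map<String, String> childHeaders(String traceId, String currentSpanId, boolean sampled) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("x-b3-traceid", traceId); // unchanged along the whole chain
        headers.put("x-b3-spanid",
                String.format("%016x", ThreadLocalRandom.current().nextLong())); // new for this call
        headers.put("x-b3-parentspanid", currentSpanId); // the span that made the call
        headers.put("x-b3-sampled", sampled ? "1" : "0");
        return headers;
    }

    public static void main(String[] args) {
        Map<String, String> headers =
                childHeaders("7416922facfd03af", "78bd09f345e0d7aa", true);
        headers.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```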

Once the call completes, Sleuth removes the data it had added to the MDC.

IV. Sampling trace data

Sleuth is usually used together with Zipkin. Sending every collected trace to Zipkin is expensive, so Sleuth provides two parameters that make it send only a portion of the traces.

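#probability is the sampling probability: 1 sends everything to Zipkin, 0.5 sends 50%
spring.sleuth.sampler.probability
#rate is the number of traces collected per second; it suits endpoints with low traffic
#for example rate=50 means at most 50 traces per second are sent to Zipkin, whatever the traffic
spring.sleuth.sampler.rate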

Both parameters are handled by SamplerAutoConfiguration, which creates a different object depending on which one is configured: ProbabilityBasedSampler or RateLimitingSampler.
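
To show the difference between the two strategies, here is a simplified sketch of each. It is not the code of ProbabilityBasedSampler or RateLimitingSampler, whose internals differ; SamplerSketch, sampleByProbability, and RateLimiter are hypothetical names, and the clock is injected so the behaviour can be checked deterministically.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified versions of the two sampling strategies.
final class SamplerSketch {

    // Keep roughly the given fraction of traces, decided from the trace id
    // so that every span of one trace gets the same answer.
    static boolean sampleByProbability(long traceId, double probability) {
        if (probability <= 0.0) return false;
        if (probability >= 1.0) return true;
        long bucket = Math.floorMod(traceId, 10_000L);
        return bucket < (long) (probability * 10_000);
    }

    // Allow at most perSecond positive decisions in each one-second window.
    static final class RateLimiter {
        private final int perSecond;
        private final LongSupplier clockMillis;
        private long windowStart;
        private int usedInWindow;

        RateLimiter(int perSecond, LongSupplier clockMillis) {
            this.perSecond = perSecond;
            this.clockMillis = clockMillis;
            this.windowStart = clockMillis.getAsLong();
        }

        synchronized boolean trySample() {
            long now = clockMillis.getAsLong();
            if (now - windowStart >= 1000) { // a new window starts
                windowStart = now;
                usedInWindow = 0;
            }
            if (usedInWindow < perSecond) {
                usedInWindow++;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        RateLimiter limiter = new RateLimiter(50, System::currentTimeMillis);
        int sampled = 0;
        for (int i = 0; i < 1000; i++) {
            if (limiter.trySample()) sampled++;
        }
        System.out.println("sampled " + sampled + " of 1000 requests");
    }
}
```

A probability keeps the sampled share constant as traffic grows, so the volume sent to Zipkin grows with it; a rate caps the volume regardless of traffic.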

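V. Summary

The above covers how Sleuth works in a web environment.
First, Sleuth creates a TracingFilter that intercepts every incoming request. If the request headers carry no span information, it creates a Span object and generates the span id, trace id, and so on; if they do, it creates the Span object from the header values. It then puts the span id and trace id into slf4j's MDC.
When a request is sent with RestTemplate, the interceptor installed by RestTemplateInterceptorInjector (LazyTracingClientHttpRequestInterceptor, which delegates to TracingClientHttpRequestInterceptor) intercepts it and writes the newly generated span id, the trace id, and the other fields into the request headers. The receiving service can then parse the span information from the headers.
Other scenarios work on the same principle and are not covered here.
What we see in the logs is only a small part of what Sleuth collects. At run time it also records the duration of the call, the time the request was received, the HTTP method, the request path, and the IP and port of the request. All of this is stored in the Span object and sent to Zipkin.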

Preface

This is about how Spring Cloud Sleuth works, not how Zipkin works.

How tracing works

Spring Cloud Sleuth can trace 10 kinds of components: async, Hystrix, messaging, websocket, rxjava, scheduling, web (Spring MVC Controller, Servlet), webclient (Spring RestTemplate), Feign, and Zuul. The commonly used ones are described below.

Scheduled

This works through AOP on the @Scheduled annotation.
As TraceSchedulingAspect shows, every call to a method annotated with @Scheduled on a bean in the IoC container is handled by Sleuth.

Messaging

This is based on Spring Messaging's ChannelInterceptor.
See TraceChannelInterceptor/IntegrationTraceChannelInterceptor,
and MessagingSpanTextMapExtractor and MessagingSpanTextMapInjector.

Hystrix

This works by using HystrixPlugins to register a trace-aware plugin: a custom HystrixConcurrencyStrategy implementation, SleuthHystrixConcurrencyStrategy.
For details see TraceCommand and SleuthHystrixConcurrencyStrategy.

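Feign

Sleuth implements two Feign Client classes: TraceFeignClient, without Ribbon, and TraceLoadBalancerFeignClient, with Ribbon.
The logic in the TraceFeignAspect AOP aspect is: whenever something asks for a Client instance, intercept the call and return Sleuth's wrapped Client.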

Async

Covers the @Async annotation and ThreadPoolTaskExecutor and its subclasses.
For details see TraceAsyncAspect.
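
Async and Hystrix instrumentation share one underlying problem: the task runs on a different thread, where the caller's thread-local trace context is not visible. The usual remedy is to capture the context when the task is submitted and reinstall it around the task when it runs. The sketch below shows that idea with a plain ThreadLocal; ContextPropagationSketch, wrap, and traceSeenOnWorker are hypothetical names, not Sleuth classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical model of carrying a trace id across a thread boundary.
final class ContextPropagationSketch {

    static final ThreadLocal<String> CURRENT_TRACE = new ThreadLocal<>();

    // Capture the caller's trace id now; install it around the task when it later runs.
    static <T> Callable<T> wrap(Callable<T> task) {
        String captured = CURRENT_TRACE.get();
        return () -> {
            String previous = CURRENT_TRACE.get();
            CURRENT_TRACE.set(captured);
            try {
                return task.call();
            } finally {
                CURRENT_TRACE.set(previous); // leave the worker thread as we found it
            }
        };
    }

    // Run a task on a worker thread and report which trace id it saw there.
    static String traceSeenOnWorker(String traceId, boolean wrapped) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CURRENT_TRACE.set(traceId);
        try {
            Callable<String> task = () -> CURRENT_TRACE.get();
            Callable<String> toRun = wrapped ? wrap(task) : task;
            return pool.submit(toRun).get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CURRENT_TRACE.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("unwrapped task sees: " + traceSeenOnWorker("7416922facfd03af", false));
        System.out.println("wrapped task sees:   " + traceSeenOnWorker("7416922facfd03af", true));
    }
}
```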

RestTemplate

This uses the Interceptor mechanism of Spring's HTTP client. For details see TraceRestTemplateInterceptor.

Zuul

This uses Zuul's Filter mechanism (ZuulFilter).
Two filters are implemented: TracePreZuulFilter and TracePostZuulFilter.

Sample code

The sample code demonstrates tracing for the components described above. The project is structured as follows:

zipkin stream server
eureka server
Segment1[scheduled message -> message broker -> listener on the message broker -> feign+hystrix -> feign+hystrix]
-> Segment2[controller+async+webclient, controller2 (called by zuul)] -> Segment3[zuul]

For details, see the sample code:
github spring-cloud-sleuth-samples

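Notes:
The ${spring.sleuth.stream.group} setting of the zipkin stream server must be specified in external configuration, otherwise it has no effect.
spring.kafka.consumer.group-id=xxx: no effect in either internal or external configuration
spring.cloud.stream.bindings.sleuth.group=xxx: no effect in either internal or external configuration
spring.sleuth.stream.group=xxx: no effect in internal configuration, works in external configuration
For the reason, see StreamEnvironmentPostProcessor.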

https://blog.csdn.net/xichenguan/article/details/77448288

https://blog.csdn.net/weixin_38308374/article/details/108897599