Spring Boot SkyWalking Distributed Tracing

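Introduction

Apache SkyWalking is an open-source APM (application performance monitoring) system that originated in China. It provides distributed tracing, performance metrics, and service dependency analysis. This article walks through integrating SkyWalking with a Spring Boot application, from setup to production tuning.

SkyWalking basics

1. Architecture components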

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Agent     │────▶│     OAP     │────▶│   Storage   │
│  (probe)    │     │  (server)   │     │ (data store)│
└─────────────┘     └─────────────┘     └─────────────┘
                           ▲
                           │
                    ┌─────────────┐
                    │    UI       │
                    │ (dashboard) │
                    └─────────────┘

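2. Starting SkyWalking

# Download SkyWalking
wget https://archive.apache.org/dist/skywalking/9.7.0/apache-skywalking-apm-9.7.0.tar.gz

# Unpack (the archive may unpack to a differently named directory,
# for example apache-skywalking-apm-bin; adjust the cd accordingly)
tar -xzf apache-skywalking-apm-9.7.0.tar.gz
cd apache-skywalking-apm-9.7.0

# Start the OAP server
bin/oapService.sh

# Start the UI
bin/webappService.sh

Then open http://localhost:8080 in a browser.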

3. Starting with Docker

# docker-compose.yml
version: '3.8'

services:
  elasticsearch:
    image: elasticsearch:8.8.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ports:
      - "9200:9200"
    volumes:
      - es-data:/usr/share/elasticsearch/data

  oap:
    image: apache/skywalking-oap-server:9.7.0
    depends_on:
      - elasticsearch
    environment:
      SW_STORAGE: elasticsearch
      SW_STORAGE_ES_CLUSTER_NODES: elasticsearch:9200
    ports:
      - "11800:11800"
      - "12800:12800"

  ui:
    image: apache/skywalking-ui:9.7.0
    depends_on:
      - oap
    environment:
      SW_OAP_ADDRESS: http://oap:12800
    ports:
      - "8080:8080"

volumes:
  es-data:

Integrating with Spring Boot

1. Attaching the agent

Option 1: JVM startup arguments

java -javaagent:/path/to/skywalking-agent.jar \
     -Dskywalking.agent.service_name=demo-service \
     -Dskywalking.collector.backend_service=localhost:11800 \
     -jar demo.jar

Option 2: IntelliJ IDEA run configuration

VM Options:
-javaagent:/path/to/skywalking-agent.jar
-Dskywalking.agent.service_name=demo-service
-Dskywalking.collector.backend_service=localhost:11800

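2. Configuration file

# agent.config (Java properties format, in the agent's config/ directory)
agent.service_name=demo-service
agent.namespace=default

collector.backend_service=localhost:11800

logging.level=INFO
logging.dir=/var/log/skywalking
logging.file_name=skywalking-agent.log

# Intended to limit RestTemplate tracing to /api/**. This key is not among the
# commonly documented agent settings, so confirm that it exists in your agent
# version before enabling it.
# plugins.spring_rest_template.include_path_patterns=/api/**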

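3. Maven plugin

The configuration below copies the agent into the build output during the package phase. Confirm that the plugin coordinates resolve in your Maven repository before relying on it; otherwise take the agent from the separate SkyWalking Java agent distribution.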

<plugin>
    <groupId>org.apache.skywalking</groupId>
    <artifactId>sw-maven-plugin</artifactId>
    <version>9.7.0</version>
    <executions>
        <execution>
            <goals>
                <goal>copy-agent</goal>
            </goals>
            <phase>package</phase>
        </execution>
    </executions>
    <configuration>
        <agentDestination>${project.build.directory}/skywalking-agent</agentDestination>
    </configuration>
</plugin>

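Distributed tracing

1. Automatic tracing

The SkyWalking agent traces common components automatically through its bundled plugins, for example Spring MVC controllers, HTTP clients such as RestTemplate, JDBC drivers, and message-queue clients. These need no application code changes.

2. Custom tracing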

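import java.util.concurrent.CompletableFuture;

import lombok.RequiredArgsConstructor;
import org.apache.skywalking.apm.toolkit.trace.ActiveSpan;
import org.apache.skywalking.apm.toolkit.trace.SupplierWrapper;
import org.apache.skywalking.apm.toolkit.trace.Trace;
import org.springframework.stereotype.Component;

// Requires the org.apache.skywalking:apm-toolkit-trace dependency.
@Component
@RequiredArgsConstructor
public class OrderService {

    private final OrderRepository orderRepository;

    /**
     * Creates a local span for this method and attaches tags to it.
     */
    @Trace
    public Order createOrder(OrderCreateDTO dto) {
        // Tag values must be strings
        ActiveSpan.tag("order.type", String.valueOf(dto.getType()));
        ActiveSpan.tag("user.id", String.valueOf(dto.getUserId()));

        try {
            Order order = orderRepository.save(convert(dto));

            // Attach an info-level log event to the active span
            ActiveSpan.info("Order created: " + order.getId());

            return order;
        } catch (Exception e) {
            // Mark the span as failed and record the exception
            ActiveSpan.error(e);
            throw e;
        }
    }

    /**
     * Asynchronous tracing: SupplierWrapper captures the current trace
     * context and restores it on the worker thread.
     */
    @Trace
    public CompletableFuture<Order> createOrderAsync(OrderCreateDTO dto) {
        return CompletableFuture.supplyAsync(
            SupplierWrapper.of(() -> orderRepository.save(convert(dto)))
        );
    }
}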

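3. Trace context propagation

import java.io.IOException;

import jakarta.servlet.Filter; // use javax.servlet.* on Spring Boot 2.x
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.ServletRequest;
import jakarta.servlet.ServletResponse;
import jakarta.servlet.http.HttpServletResponse;
import org.apache.skywalking.apm.toolkit.trace.TraceContext;
import org.springframework.stereotype.Component;

/**
 * The agent's web plugins already read the upstream "sw8" header and continue
 * the trace, and agent-internal classes such as ContextManager cannot be
 * called from application code. This filter therefore only exposes the current
 * trace ID; the X-Trace-Id header name is an illustrative choice.
 */
@Component
public class TraceContextFilter implements Filter {

    @Override
    public void doFilter(
        ServletRequest request,
        ServletResponse response,
        FilterChain chain
    ) throws IOException, ServletException {

        // ID of the trace the agent continued from the upstream sw8 header
        String traceId = TraceContext.traceId();

        if (response instanceof HttpServletResponse) {
            ((HttpServletResponse) response).setHeader("X-Trace-Id", traceId);
        }

        chain.doFilter(request, response);
    }
}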
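
For reference, the sw8 header that carries the trace context between services is a single string of eight fields joined by "-": a sampling flag (0 or 1), the trace ID, the parent segment ID, the parent span ID (an integer), the parent service, the parent service instance, the parent endpoint, and the target address. All fields except the flag and the span ID are Base64-encoded. The sketch below decodes such a value for debugging purposes only; the class and field names are hypothetical and this is not the agent's own parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Decoded view of an sw8 header value (names are this sketch's own). */
final class Sw8Header {
    final boolean sampled;
    final String traceId;
    final String parentSegmentId;
    final int parentSpanId;
    final String parentService;
    final String parentInstance;
    final String parentEndpoint;
    final String targetAddress;

    private Sw8Header(String[] f) {
        this.sampled = "1".equals(f[0]);
        this.traceId = decode(f[1]);
        this.parentSegmentId = decode(f[2]);
        this.parentSpanId = Integer.parseInt(f[3]);
        this.parentService = decode(f[4]);
        this.parentInstance = decode(f[5]);
        this.parentEndpoint = decode(f[6]);
        this.targetAddress = decode(f[7]);
    }

    /** Splits the header on "-" (not part of the standard Base64 alphabet) and decodes each field. */
    static Sw8Header parse(String headerValue) {
        if (headerValue == null) {
            throw new IllegalArgumentException("missing sw8 header");
        }
        String[] fields = headerValue.split("-", -1);
        if (fields.length != 8) {
            throw new IllegalArgumentException("expected 8 fields, got " + fields.length);
        }
        return new Sw8Header(fields);
    }

    private static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }
}
```

Calling Sw8Header.parse(request.getHeader("sw8")) inside a controller or filter is enough to print which upstream service and endpoint a request came from.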

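Performance monitoring

1. Metrics

SkyWalking collects service, instance, and endpoint metrics automatically, including response time, throughput, success rate (SLA), and JVM metrics such as heap usage and GC activity.

2. Slow query monitoring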

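# application.yml (Spring Boot): the agent's JDBC plugin traces SQL automatically
spring:
  datasource:
    url: jdbc:mysql://localhost:3306/demo
    username: root
    password: 123456

# config/application.yml (OAP): the slow-statement threshold is set on the
# OAP side, in milliseconds, per database type. Here: 100 ms for all types.
agent-analyzer:
  default:
    slowDBAccessThreshold: ${SW_SLOW_DB_THRESHOLD:default:100}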
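
To make the threshold semantics concrete: the OAP setting slowDBAccessThreshold takes a comma-separated list of type:milliseconds pairs such as default:200,mongodb:100, where the default entry applies to every database type without its own entry. The sketch below parses that format and classifies a statement latency. The class name, the method names, and the strict greater-than comparison are assumptions made for illustration, not OAP's implementation.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Parses "default:200,mongodb:100" style thresholds (illustrative sketch). */
final class SlowDbThresholds {
    private final Map<String, Integer> millisByType;

    private SlowDbThresholds(Map<String, Integer> millisByType) {
        this.millisByType = millisByType;
    }

    static SlowDbThresholds parse(String config) {
        Map<String, Integer> parsed = new HashMap<>();
        for (String entry : config.split(",")) {
            String[] kv = entry.trim().split(":");
            if (kv.length != 2) {
                throw new IllegalArgumentException("bad entry: " + entry);
            }
            parsed.put(kv[0].trim().toLowerCase(Locale.ROOT), Integer.parseInt(kv[1].trim()));
        }
        if (!parsed.containsKey("default")) {
            throw new IllegalArgumentException("a default threshold is required");
        }
        return new SlowDbThresholds(parsed);
    }

    /** True when the latency exceeds the threshold for this database type. */
    boolean isSlow(String dbType, long latencyMillis) {
        int threshold = millisByType.getOrDefault(
            dbType.toLowerCase(Locale.ROOT), millisByType.get("default"));
        return latencyMillis > threshold;
    }
}
```

With default:200,mongodb:100, a 150 ms MySQL statement stays under the default threshold, while a 150 ms MongoDB command counts as slow.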

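3. Custom metrics

import org.apache.skywalking.apm.toolkit.meter.Counter;
import org.apache.skywalking.apm.toolkit.meter.MeterFactory;
import org.springframework.stereotype.Component;

// Requires the org.apache.skywalking:apm-toolkit-meter dependency.
@Component
public class CustomMetrics {

    /**
     * Records a business metric: the accumulated order amount,
     * tagged by order type and status.
     */
    public void recordOrderMetrics(Order order) {
        Counter counter = MeterFactory.counter("order_amount")
            .tag("type", String.valueOf(order.getType()))
            .tag("status", String.valueOf(order.getStatus()))
            .mode(Counter.Mode.INCREMENT)
            .build();

        // The agent reports the meter to OAP; it only becomes visible there
        // once a matching meter-analyzer rule is configured.
        // Assumes getAmount() returns a Number such as BigDecimal.
        counter.increment(order.getAmount().doubleValue());
    }
}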

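Alarm configuration

1. Alarm rules

# config/alarm-settings.yml (OAP 9.7.0 format: rules are keyed by a name ending in _rule)
rules:
  # Slow service response: above 1000 ms in at least 3 of the last 10 minutes
  service_resp_time_rule:
    expression: sum(service_resp_time > 1000) >= 3
    period: 10

  # Low success rate: service_sla is stored as percent x 100, so 9000 means 90%
  service_sla_rule:
    expression: sum(service_sla < 9000) >= 2
    period: 10

  # Slow endpoint response
  endpoint_resp_time_rule:
    expression: sum(endpoint_resp_time > 500) >= 3
    period: 10

  # Slow database access
  database_access_resp_time_rule:
    expression: sum(database_access_resp_time > 1000) >= 2
    period: 10

hooks:
  webhook:
    default:
      is-default: true
      urls:
        - http://localhost:8080/webhook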
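
A common shape for these rules in SkyWalking 9.x is to count how many of the per-minute values inside the period breach a threshold, and to raise the alarm once that count reaches a minimum. The sketch below shows that evaluation over a plain array; the class name, the method, and its parameters are hypothetical, and it illustrates the idea rather than the OAP alarm engine.

```java
/** Evaluates "at least minBreaches of the last `period` values exceed threshold". */
final class AlarmRuleSketch {

    static boolean shouldAlarm(long[] perMinuteValues, int period, long threshold, int minBreaches) {
        // Only the most recent `period` values take part in the check
        int from = Math.max(0, perMinuteValues.length - period);
        int breaches = 0;
        for (int i = from; i < perMinuteValues.length; i++) {
            if (perMinuteValues[i] > threshold) {
                breaches++;
            }
        }
        return breaches >= minBreaches;
    }
}
```

For response times of 200, 1500, 300, 1200 and 1100 ms, a threshold of 1000 ms and a minimum of 3 breaches, the rule fires with a 10-minute period but not with a 2-minute one, because only two breaches fall inside the shorter window.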

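2. Webhook handler

@Slf4j
@RestController
@RequestMapping("/webhook")
@RequiredArgsConstructor
public class SkyWalkingWebhook {

    private final NotificationService notificationService;

    // SkyWalking posts a JSON array of alarm messages, so bind to a list
    @PostMapping
    public ResponseEntity<Void> handleAlert(@RequestBody List<Alert> alerts) {
        for (Alert alert : alerts) {
            log.warn("Alarm received: {}", alert);

            // Send a notification
            notificationService.sendDingTalk(alert);
        }

        return ResponseEntity.ok().build();
    }
}

@Data
public class Alert {
    private String scope;
    private String name;
    private String ruleName;
    private String alarmMessage;
    private Long startTime;
}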

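3. Notification integration

@Slf4j
@Service
public class NotificationService {

    /**
     * Send an email notification.
     */
    public void sendEmail(Alert alert) {
        // Implement email delivery here
    }

    /**
     * Send a DingTalk notification.
     */
    public void sendDingTalk(Alert alert) {
        // A real robot webhook URL also carries an access_token query parameter
        DingTalkClient client = new DefaultDingTalkClient(
            "https://oapi.dingtalk.com/robot/send"
        );

        OapiRobotSendRequest request = new OapiRobotSendRequest();
        request.setMsgtype("markdown");

        // Markdown is a nested class of OapiRobotSendRequest
        OapiRobotSendRequest.Markdown markdown = new OapiRobotSendRequest.Markdown();
        markdown.setTitle("SkyWalking alarm");
        markdown.setText(formatAlert(alert));

        request.setMarkdown(markdown);

        try {
            client.execute(request);
        } catch (ApiException e) {
            // execute() declares the checked ApiException
            log.error("Failed to send DingTalk notification", e);
        }
    }

    private String formatAlert(Alert alert) {
        return String.format(
            "## SkyWalking alarm\n\n" +
            "- **Scope**: %s\n" +
            "- **Name**: %s\n" +
            "- **Time**: %s\n" +
            "- **Details**: %s",
            alert.getScope(),
            alert.getName(),
            new Date(alert.getStartTime()),
            alert.getAlarmMessage()
        );
    }
}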

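Service dependency analysis

1. Topology map

SkyWalking builds the service topology map automatically from trace data. The map shows the services, the databases and middleware they call, and the direction of the calls between them.

2. Dependency metrics

3. Bottleneck analysis

Combining the topology map with the metrics on each service and call makes it possible to locate slow services and slow downstream dependencies quickly.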

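Best practices

1. Agent configuration

# agent.config (Java properties format)
# Service name
agent.service_name=${SW_AGENT_NAME:demo-service}

# Namespace
agent.namespace=${SW_AGENT_NAMESPACE:default}

# Traces sampled per 3 seconds. Zero or a negative value turns the limit off
# (every trace is kept); set a positive value in production to reduce overhead.
# The placeholder syntax is ${ENV:default}, so ":-1" means a default of -1.
agent.sample_n_per_3_secs=${SW_AGENT_SAMPLE:-1}

# Request suffixes to ignore
agent.ignore_suffix=${SW_AGENT_IGNORE_SUFFIX:.css,.js,.html,.png}

# OAP address
collector.backend_service=${SW_AGENT_COLLECTOR_BACKEND_SERVICES:localhost:11800}

# Log level
logging.level=${SW_LOGGING_LEVEL:INFO}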
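
The ${NAME:default} placeholders in this file take the value of the environment variable NAME when it is set and fall back to the text after the first colon otherwise. The sketch below reproduces that substitution over a caller-supplied map so the behaviour can be checked without touching the real environment. The class name and the regular expression are assumptions for illustration and not the agent's own resolver.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Resolves ${NAME:default} placeholders against a map of variables. */
final class PlaceholderResolver {
    // Group 1 is the variable name, group 2 the default (which may contain ':')
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\$\\{([A-Za-z0-9_.]+):([^}]*)\\}");

    static String resolve(String raw, Map<String, String> variables) {
        Matcher m = PLACEHOLDER.matcher(raw);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = variables.get(m.group(1));
            String replacement = value != null ? value : m.group(2);
            // quoteReplacement keeps '$' and '\' in values from being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Passing System.getenv() as the map mirrors the real lookup. Note that a shell-style default such as ${SW_AGENT_SAMPLE:-10} resolves to the literal value -10, because everything after the first colon is the default.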

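2. Performance tuning

# agent.config for production
# Limit sampling to 10 traces per 3 seconds
agent.sample_n_per_3_secs=10

# Disable plugins you do not need (comma-separated; the names must match
# the plugin names shipped with your agent version)
plugin.exclude_plugins=spring-cloud-gateway,grpc

# The agent always reports asynchronously through an in-memory buffer;
# enlarge the buffer under heavy load
buffer.buffer_size=1000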
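
The sampling setting means "keep at most N traces in every 3-second window". A minimal sketch of that idea follows, with the clock passed in explicitly so the behaviour is easy to verify. The class and method names are hypothetical and the agent's real sampler may differ in detail.

```java
/** Keeps at most `limit` samples per 3-second window (illustrative sketch). */
final class WindowSampler {
    private static final long WINDOW_MILLIS = 3_000L;

    private final int limit;
    private long windowStart;
    private int taken;

    WindowSampler(int limit, long nowMillis) {
        this.limit = limit;
        this.windowStart = nowMillis;
    }

    /** Returns true when the trace starting at nowMillis should be kept. */
    synchronized boolean trySample(long nowMillis) {
        if (limit <= 0) {
            return true; // zero or negative: sampling limit is off, keep everything
        }
        if (nowMillis - windowStart >= WINDOW_MILLIS) {
            // A new window begins: reset the counter
            windowStart = nowMillis;
            taken = 0;
        }
        if (taken < limit) {
            taken++;
            return true;
        }
        return false;
    }
}
```

With a limit of 10, the eleventh request inside the same 3-second window is not traced; the counter resets once the window rolls over.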

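3. Storage tuning

# OAP config/application.yml
core:
  default:
    # Retention in days for records (traces, logs) and for metrics
    recordDataTTL: 7
    metricsDataTTL: 7

storage:
  selector: elasticsearch
  elasticsearch:
    # Number of index shards
    indexShardsNumber: 3

    # Number of index replicas
    indexReplicasNumber: 1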

4. Environment isolation

# Development (agent.config)
agent.namespace=dev
agent.service_name=demo-service-dev

# Testing (agent.config)
agent.namespace=test
agent.service_name=demo-service-test

# Production (agent.config)
agent.namespace=prod
agent.service_name=demo-service-prod

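Summary

Key points of SkyWalking tracing: attach the agent with -javaagent, rely on automatic instrumentation and add custom spans where needed, watch metrics and slow queries, configure alarm rules with a webhook, and tune sampling and storage for production.

SkyWalking is a powerful tool for monitoring Spring Boot applications.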

