@zstack-robot-2
Collaborator

Resolves: ZSV-9792

Change-Id: I6b65736e646e7163777a7872667077756771726d

sync from gitlab !8697

@coderabbitai

coderabbitai bot commented Nov 12, 2025

Walkthrough

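In the parallel snapshot creation path, this change collects the snapshots that were created successfully and cleans them up (rolls back) on failure; it refactors volume snapshot group deletion into queue-driven, per-group asynchronous cleanup; and it adds a completeness check for delete-snapshot-group requests to the snapshot API interceptor.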

Changes

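Cohort / File(s) Summary
Ceph snapshot creation failure cleanup
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
In the parallel TakeSnapshot flow, introduces a thread-safe inventories list that collects each successfully created VolumeSnapshotInventory; if any snapshot creation fails, iterates over the list and sends a VolumeSnapshotDeletionMsg to clean up the partially created snapshots before returning through the error path.
External primary storage snapshot cleanup
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
The same synchronized-list collection and rollback logic is added to the external primary storage's parallel snapshot creation; on failure it sends deletion messages to clean up the snapshots already created.
Group deletion refactoring
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
Changes per-group deletion from direct synchronous deletion to two-phase, queue-driven asynchronous processing: first collects the non-null group UUIDs, then loads each VolumeSnapshotGroupVO and enqueues ungroupAfterDeleted per group; the archive metadata and the DB record are removed only when every snapshot in the group has been deleted. Adds an early-return optimization.
API deletion validation
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
Adds validation for APIDeleteVolumeSnapshotGroupMsg: loads the target group and enumerates all groups under the same VM; if another group is found to be incomplete (it has references already marked as deleted while the group is incomplete), the operation is rejected with an ApiMessageInterceptionException. A new private validation method is added and wired into the interception flow.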
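
The collect-then-roll-back idea summarized for the two storage factories can be sketched outside ZStack as follows. This is a minimal illustration using plain `CompletableFuture` instead of ZStack's `While` and `CloudBus`; the class `SnapshotRollbackSketch`, the method `takeAll`, and the `creator`/`deleter` callbacks are hypothetical names, and snapshots are represented by plain strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch, not ZStack code.
class SnapshotRollbackSketch {
    /**
     * Creates one snapshot per volume in parallel. Returns the created snapshots
     * when every creation succeeds; otherwise deletes the snapshots that were
     * created and returns an empty list.
     */
    static List<String> takeAll(List<String> volumes,
                                Function<String, String> creator,
                                Consumer<String> deleter) {
        // Both lists are written from parallel tasks, so they must be thread-safe.
        List<String> created = Collections.synchronizedList(new ArrayList<>());
        List<Throwable> errors = Collections.synchronizedList(new ArrayList<>());

        List<CompletableFuture<Void>> tasks = new ArrayList<>();
        for (String volume : volumes) {
            tasks.add(CompletableFuture.runAsync(() -> {
                try {
                    created.add(creator.apply(volume));
                } catch (RuntimeException e) {
                    errors.add(e);
                }
            }));
        }
        // Wait for every task before deciding whether to roll back.
        CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0])).join();

        if (errors.isEmpty()) {
            return new ArrayList<>(created);
        }
        // Partial failure: clean up whatever was created (copy first, then iterate).
        new ArrayList<>(created).forEach(deleter);
        return Collections.emptyList();
    }
}
```

The rollback runs only after all tasks have finished, so a snapshot that succeeds late is still cleaned up.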

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Factory
    participant SnapshotSvc
    participant DB

    Client->>Factory: takeSnapshots()
    Factory->>Factory: init synchronized inventories
    par Parallel snapshots
        Factory->>SnapshotSvc: createSnapshot(idA)
        SnapshotSvc-->>Factory: success (inventory A)
        Factory->>Factory: inventories.add(A)
    and
        Factory->>SnapshotSvc: createSnapshot(idB)
        SnapshotSvc-->>Factory: failure
    end

    alt all success
        Factory-->>Client: return inventories
    else some failure
        Factory->>Factory: for each inventory in inventories
        Factory->>SnapshotSvc: VolumeSnapshotDeletionMsg(inventory)
        SnapshotSvc->>DB: delete snapshot records
        Factory-->>Client: return error
    end
sequenceDiagram
    participant API as VolumeSnapshotApiInterceptor
    participant DB
    participant Base as VolumeSnapshotTreeBase
    participant Queue
    participant VIDM

    API->>DB: load target GroupVO (APIDeleteVolumeSnapshotGroupMsg)
    API->>DB: enumerate other groups for same VM
    alt some group incomplete
        API-->>Caller: throw ApiMessageInterceptionException (reject)
    else all complete
        API-->>Caller: continue
        Caller->>Base: perform group deletion
        Base->>Base: collect non-null groupUuids
        Base->>Queue: enqueue ungroupAfterDeleted per group
        loop per group
            Queue->>Base: ungroupAfterDeleted(groupVO)
            alt allSnapshotsDeleted
                Base->>VIDM: delete archive metadata
                Base->>DB: remove VolumeSnapshotGroupVO
            else
                Base-->>Queue: log & complete
            end
        end
    end

Estimated code review effort

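🎯 3 (Moderate) | ⏱️ ~20 minutes

Areas needing extra attention:

  • Thread safety and visibility of inventories in the parallel snapshot flow (whether the synchronized list covers every concurrent path).
  • Correctness of the parameters on the cleanup message (VolumeSnapshotDeletionMsg): snapshot/tree/volume UUIDs, scope, and direction, plus handling of possible partial failures.
  • Scheduling and exception-handling paths of the queued tasks in VolumeSnapshotTreeBase, to make sure no group record is missed and no metadata is deleted twice.
  • Edge cases of the new validation in VolumeSnapshotApiInterceptor (concurrent deletion/marking flows) and the accuracy of its error message.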

Poem

🐰🌿
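I hop across stacks and queues, softly counting the list,
The successful keep their names, the failed are gently combed away,
The queue paces slowly, the groups settle quietly,
A little carrot celebrates: the database is sunny again.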

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
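Title check ✅ Passed The PR title clearly and accurately describes the main change: when snapshot group creation fails, the snapshots that were created successfully are deleted. It is fully relevant to the code changes.
Description check ✅ Passed The PR description includes the linked issue number (ZSV-9792) and the Change-Id; it is brief but relates to the code changes.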

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (1)

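540-604: Concurrent error collection and a possible NPE: use While's error aggregation, and add a check for a missing snapshot row

  • When multiple CloudBus callbacks run concurrently, adding elements directly to errList.getCauses() is a thread-safety risk; use whileCompletion.addError(...) and, in done(...), decide failure from the errorCodeList that is passed in.
  • Q.New(...).find() may return null, in which case the subsequent vo.getStatus() triggers an NPE; check for null before use and treat it as a failure path.
  • Avoid naming the variable in the cleanup loop msg, which shadows the outer message; renaming it improves readability.

Suggested changes: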

-        List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());
-        ErrorCodeList errList = new ErrorCodeList();
+        List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());

         new While<>(storageSnapshots).all((struct, whileCompletion) -> {
-            VolumeSnapshotVO vo = Q.New(VolumeSnapshotVO.class).eq(VolumeSnapshotVO_.uuid, struct.getResourceUuid()).find();
-            if (vo.getStatus().equals(VolumeSnapshotStatus.Ready)) {
+            VolumeSnapshotVO vo = Q.New(VolumeSnapshotVO.class)
+                    .eq(VolumeSnapshotVO_.uuid, struct.getResourceUuid()).find();
+            if (vo == null) {
+                whileCompletion.addError(operr("snapshot[%s] not found when creating on storage", struct.getResourceUuid()));
+                whileCompletion.done();
+                return;
+            }
+            if (VolumeSnapshotStatus.Ready.equals(vo.getStatus())) {
                 logger.warn(String.format("snapshot %s on volume %s is ready, no need to create again!",
                         vo.getUuid(), vo.getVolumeUuid()));
                 whileCompletion.done();
                 return;
             }
             ...
             bus.send(tmsg, new CloudBusCallBack(msg) {
                 @Override
                 public void run(MessageReply reply) {
                     if (!reply.isSuccess()) {
-                        errList.getCauses().add(reply.getError());
+                        whileCompletion.addError(reply.getError());
                         whileCompletion.done();
                         return;
                     }
                     TakeSnapshotReply treply = reply.castReply();
                     if (!treply.isSuccess()) {
-                        errList.getCauses().add(reply.getError());
+                        whileCompletion.addError(reply.getError());
                         whileCompletion.done();
                         return;
                     }
                     ...
                     inventories.add(treply.getInventory());
                     whileCompletion.done();
                 }
             });
-        }).run(new WhileDoneCompletion(completion) {
+        }).run(new WhileDoneCompletion(completion) {
             @Override
             public void done(ErrorCodeList errorCodeList) {
-                if (!errList.getCauses().isEmpty()) {
-                    completion.fail(errList.getCauses().get(0));
+                if (!errorCodeList.getCauses().isEmpty()) {
+                    completion.fail(errorCodeList.getCauses().get(0));
 
-                    inventories.forEach(snapshot -> {
-                        VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
-                        msg.setSnapshotUuid(snapshot.getUuid());
-                        msg.setTreeUuid(snapshot.getTreeUuid());
-                        msg.setVolumeUuid(snapshot.getVolumeUuid());
-                        msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
-                        msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
-                        bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
-                        bus.send(msg);
+                    inventories.forEach(snapshot -> {
+                        VolumeSnapshotDeletionMsg delMsg = new VolumeSnapshotDeletionMsg();
+                        delMsg.setSnapshotUuid(snapshot.getUuid());
+                        delMsg.setTreeUuid(snapshot.getTreeUuid());
+                        delMsg.setVolumeUuid(snapshot.getVolumeUuid());
+                        delMsg.setScope(DeleteVolumeSnapshotScope.Single.toString());
+                        delMsg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
+                        bus.makeTargetServiceIdByResourceUuid(delMsg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
+                        bus.send(delMsg);
                     });
                     return;
                 }
                 completion.success();
             }
         });
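
The first point above, collecting errors from callbacks that run concurrently, can be illustrated in isolation. The sketch below is hypothetical and does not use ZStack's `While` or `ErrorCodeList`; it assumes the callbacks may run on different threads and shows a thread-safe collection receiving one error per callback. The class name `ConcurrentErrorCollection` and the pool size are illustrative choices.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not ZStack code.
class ConcurrentErrorCollection {
    /** Simulates `callbacks` failing callbacks and returns the collected errors. */
    static List<String> collectErrors(int callbacks) {
        // A synchronized wrapper makes concurrent add() calls safe;
        // a bare ArrayList gives no such guarantee.
        List<String> errors = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < callbacks; i++) {
            final int id = i;
            pool.execute(() -> errors.add("callback-" + id + " failed"));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("callbacks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new ArrayList<>(errors);
    }
}
```

With the synchronized wrapper, every error is recorded exactly once regardless of how the callbacks interleave.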
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (1)

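745-808: Same issues as the external primary storage: concurrent error collection and a possible NPE; please fix them consistently

  • errList is added to directly from concurrent callbacks, which is not thread-safe; use whileCompletion.addError(...) and use errorCodeList in done(...).
  • Q.New(...).find() may return null, in which case the subsequent vo.getStatus() triggers an NPE; check for null and take the failure path.
  • The local variable name msg in the cleanup loop shadows the outer one; consider renaming it to delMsg.

Reference revision: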

-        List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());
-        ErrorCodeList errList = new ErrorCodeList();
+        List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());
         new While<>(cephStructs).all((struct, whileCompletion) -> {
             VolumeSnapshotVO vo = Q.New(VolumeSnapshotVO.class).eq(VolumeSnapshotVO_.uuid, struct.getResourceUuid()).find();
-            if (vo.getStatus().equals(VolumeSnapshotStatus.Ready)) {
+            if (vo == null) {
+                whileCompletion.addError(operr("snapshot[%s] not found when creating on ceph", struct.getResourceUuid()));
+                whileCompletion.done();
+                return;
+            }
+            if (VolumeSnapshotStatus.Ready.equals(vo.getStatus())) {
                 ...
             }
             ...
                 public void run(MessageReply reply) {
                     if (!reply.isSuccess()) {
-                        errList.getCauses().add(reply.getError());
+                        whileCompletion.addError(reply.getError());
                         whileCompletion.done();
                         return;
                     }
                     TakeSnapshotReply treply = reply.castReply();
                     if (!treply.isSuccess()) {
-                        errList.getCauses().add(reply.getError());
+                        whileCompletion.addError(reply.getError());
                         whileCompletion.done();
                         return;
                     }
                     ...
                     inventories.add(treply.getInventory());
                     whileCompletion.done();
                 }
             });
-        }).run(new WhileDoneCompletion(completion) {
+        }).run(new WhileDoneCompletion(completion) {
             @Override
             public void done(ErrorCodeList errorCodeList) {
-                if (!errList.getCauses().isEmpty()) {
-                    completion.fail(errList.getCauses().get(0));
+                if (!errorCodeList.getCauses().isEmpty()) {
+                    completion.fail(errorCodeList.getCauses().get(0));
 
-                    inventories.forEach(snapshot -> {
-                        VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
-                        msg.setSnapshotUuid(snapshot.getUuid());
-                        msg.setTreeUuid(snapshot.getTreeUuid());
-                        msg.setVolumeUuid(snapshot.getVolumeUuid());
-                        msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
-                        msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
-                        bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
-                        bus.send(msg);
+                    inventories.forEach(snapshot -> {
+                        VolumeSnapshotDeletionMsg delMsg = new VolumeSnapshotDeletionMsg();
+                        delMsg.setSnapshotUuid(snapshot.getUuid());
+                        delMsg.setTreeUuid(snapshot.getTreeUuid());
+                        delMsg.setVolumeUuid(snapshot.getVolumeUuid());
+                        delMsg.setScope(DeleteVolumeSnapshotScope.Single.toString());
+                        delMsg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
+                        bus.makeTargetServiceIdByResourceUuid(delMsg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
+                        bus.send(delMsg);
                     });
                     return;
                 }
                 completion.success();
             }
         });
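
Both revisions also flip the status comparison so that the constant is the receiver of `equals`. A small stand-alone sketch of why that ordering tolerates a null status (the `Status` enum and method names here are stand-ins, not ZStack types):

```java
// Hypothetical sketch, not ZStack code.
class NullSafeStatusCheck {
    enum Status { Creating, Ready }

    // Throws NullPointerException when status is null, because status is the receiver.
    static boolean unsafeIsReady(Status status) {
        return status.equals(Status.Ready);
    }

    // Returns false when status is null, because the receiver is a non-null constant.
    static boolean safeIsReady(Status status) {
        return Status.Ready.equals(status);
    }
}
```

This protects only against a null status; the separate `vo == null` check is still needed for a missing row.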
🧹 Nitpick comments (3)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (1)

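369-371: Boolean operator style: prefer || over |

shareable | vol.isShareable() works correctly, but it does not short-circuit and does not follow common style; shareable || vol.isShareable() is recommended for readability.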
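
The difference between the two operators is only in evaluation, not in the boolean result. A minimal sketch (the class and method names are hypothetical) that counts how often the right-hand operand is evaluated:

```java
// Hypothetical sketch, not ZStack code.
class ShortCircuitDemo {
    // Counts how many times the right-hand operand was evaluated.
    static int evaluations = 0;

    static boolean rightOperand() {
        evaluations++;
        return false;
    }

    // '|' on booleans always evaluates both operands.
    static int evaluationsWithBitwiseOr(boolean left) {
        evaluations = 0;
        boolean ignored = left | rightOperand();
        return evaluations;
    }

    // '||' skips the right operand when the left one is already true.
    static int evaluationsWithLogicalOr(boolean left) {
        evaluations = 0;
        boolean ignored = left || rightOperand();
        return evaluations;
    }
}
```

When the right-hand side has no side effects, as a simple getter would not, both forms produce the same value; the suggestion is about style and avoiding unnecessary evaluation.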

storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (2)

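2092-2109: Avoid duplicate queue tasks: deduplicate groupUuids

The code currently collects the groupUuid values and enqueues them one by one; if the same group appears more than once it is enqueued repeatedly, which adds needless overhead and may trigger duplicate deletion. Consider distinct():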

-        List<String> groupUuids = snapshots.stream().map(VolumeSnapshotInventory::getGroupUuid).filter(Objects::nonNull).collect(Collectors.toList());
+        List<String> groupUuids = snapshots.stream()
+                .map(VolumeSnapshotInventory::getGroupUuid)
+                .filter(Objects::nonNull)
+                .distinct()
+                .collect(Collectors.toList());
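
A stand-alone sketch of the same pipeline on plain strings, to show what `distinct()` changes (the class `GroupUuidDedup` and its method are hypothetical names):

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical sketch, not ZStack code.
class GroupUuidDedup {
    /** Drops nulls and duplicates while keeping first-seen order. */
    static List<String> distinctGroupUuids(List<String> groupUuids) {
        return groupUuids.stream()
                .filter(Objects::nonNull)
                .distinct()
                .collect(Collectors.toList());
    }
}
```

For an input of `["g1", null, "g2", "g1"]` this yields `["g1", "g2"]`, so each group would be enqueued once.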

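2111-2122: Defend against a null collection: guard groupVO.getVolumeSnapshotRefs() before accessing it

Some ORM mappings may return a null list; to avoid an NPE, convert it to an empty collection before calling allMatch: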

-        if (!groupVO.getVolumeSnapshotRefs().stream().allMatch(VolumeSnapshotGroupRefVO::isSnapshotDeleted)) {
+        List<VolumeSnapshotGroupRefVO> refs = Optional.ofNullable(groupVO.getVolumeSnapshotRefs())
+                .orElse(Collections.emptyList());
+        if (!refs.stream().allMatch(VolumeSnapshotGroupRefVO::isSnapshotDeleted)) {
             logger.debug(String.format("skipping ungroup operation for snapshot group[uuid:%s]: " +
                     "no group meet deletion criteria (due to remaining volume snapshots).", groupVO.getUuid()));
             completion.done();
             return;
         }
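
One consequence of this guard is worth checking: `allMatch` returns true on an empty stream, so a null or empty reference list counts as "all snapshots deleted" and the code would proceed past the skip branch. The sketch below models the refs as a list of "deleted" flags; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not ZStack code.
class AllDeletedCheck {
    /** Returns true when every flag is true; also true for a null or empty list. */
    static boolean allDeleted(List<Boolean> deletedFlags) {
        List<Boolean> refs = Optional.ofNullable(deletedFlags)
                .orElse(Collections.emptyList());
        return refs.stream().allMatch(Boolean::booleanValue);
    }
}
```

Whether an empty group should be treated as deletable is a design decision for the authors; the sketch only makes the behavior visible.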
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 40382ed and b031cdb.

📒 Files selected for processing (3)
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (3 hunks)
  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (3 hunks)
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.java

⚙️ CodeRabbit configuration file

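**/*.java: 1. API design requirements

  • API naming:
    • API names must be unique; duplicates are not allowed.
    • API message classes must extend APIMessage; their return classes must extend APIReply or APIEvent and be annotated with @RestResponse.
    • API messages must carry the @RestRequest annotation and follow these rules:
      • path:
        • Use the plural form for resources.
        • When the path references a message class variable, use the {variableName} format.
      • HTTP method mapping:
        • Query → HttpMethod.GET
        • Update → HttpMethod.PUT
        • Create → HttpMethod.POST
        • Delete → HttpMethod.DELETE
    • API classes must implement the __example__ method so that API documentation can be generated, and the corresponding Groovy API Template and API Markdown files must be generated.

2. Naming and formatting conventions

  • Class names:

    • Use UpperCamelCase.
    • Special cases:
      • VO/AO/EO type classes are exempt.
      • Abstract classes use an Abstract or Base prefix/suffix.
      • Exception classes should end with Exception.
      • Test classes must end with TestCase.
  • Method names, parameter names, member variables, and local variables:

    • Use lowerCamelCase.
  • Constant naming:

    • All uppercase, with words separated by underscores.
    • Names must be clear; avoid vague or inaccurate names.
  • Package names:

    • Always lowercase, separated by dots; each part should be an English word with natural meaning (see the structure of the Spring framework).
  • Naming details:

    • Avoid members or local variables with the same name in parent and child classes or within the same code block, to prevent confusion.
    • Abbreviations:
      • Unnecessary abbreviations are not allowed, for example AbsSchedulerJob, condi, Fu. Use full words to improve readability.

3. Writing self-explanatory code

  • Expressing intent:

    • Avoid boolean parameters that make the meaning unclear. For example:
      • For stopAgent(boolean ignoreError), consider splitting it into separate functions (such as stopAgentIgnoreError()) or using an enum to express the operation type.
    • Names should express intent with full words wherever possible and reflect the data type or purpose (for example, in constant and variable names, put the type word at the end).
  • Comments:

    • Code should be as self-explanatory as possible; explanations shorter than two lines can be written directly in the code.
    • Longer comments must be proofread carefully and updated along with the code so that they stay correct.
    • Interface methods should not carry redundant modifiers (such as public) and must have valid Javadoc comments.

4. Flow control and structural optimization

  • Use of if...else:

    • Minimize the use of if...else structures. Recommendations:
      • Limit nesting to at most two levels, and do not use an else branch in the inner level.
      • Return early (Early Return): end the handling logic of a condition as soon as possible or extract it into a separate method.
      • Use Java Streams or lambda expressions instead of lengthy loops and conditionals.
  • Conditions:

    • An if condition should not be too long or too complex; where necessary, extract it into a descriptive boolean variable.
  • Code block length:

    • A single if block should not exceed one screen, to improve readability and maintainability.

5. Exception handling and logging

  • Principles for catching exceptions:
    • For RuntimeExceptions that can be avoided by a pre-check (such as NullPointerException, IndexOutOfBoundsException), handling them with try-catch is not recommended.
    • Catch exceptions only for genuinely unexpected situations; do not use exception logic as normal flow control.
    • Where necessary, rethrow the exception so that the upper business layer can convert it into a user-friendly error message.
    • Use try-with-resources to manage resources, make sure resources are closed correctly in finally blocks, and avoid returning values in finally.
      ...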

Files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
🧠 Learnings (8)
📓 Common learnings
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:1218-1224
Timestamp: 2025-08-24T06:33:10.771Z
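Learning: The ZStack team responds quickly to capacity-management issues: when a mismatch between requested and released capacity was found in the Pull snapshot flow, developers fixed it promptly to keep primary storage capacity accounting accurate.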
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2360
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:449-490
Timestamp: 2025-08-04T04:48:19.103Z
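Learning: During cherry-pick operations, the ZStack project strictly follows a no-extra-changes policy even when performance optimization opportunities are found (such as a memory optimization for batch-saving IP addresses), prioritizing the completeness and consistency of the cherry-pick.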
📚 Learning: 2025-08-25T03:52:37.301Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2504
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:775-819
Timestamp: 2025-08-25T03:52:37.301Z
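Learning: In the ZStack project's VolumeSnapshotTreeBase class, a VolumeVO obtained through dbf.findByUuid() needs a null check, because the method may return null when no matching record is found and using the result directly can cause a NullPointerException.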

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-22T06:31:57.406Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2489
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:179-179
Timestamp: 2025-08-22T06:31:57.406Z
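Learning: In the ZStack project's VolumeSnapshotAO class, the team decided not to add an Index annotation to the treeUuid field, even though this may affect query performance. The team prioritizes code stability over performance optimization.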

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2024-06-10T19:31:27.994Z
Learnt from: AlanJager
Repo: MatheMatrix/zstack PR: 175
File: storage/src/main/java/org/zstack/storage/volume/VolumeBase.java:31-38
Timestamp: 2024-06-10T19:31:27.994Z
Learning: The user has clarified that the `MemorySnapshotGroupExtensionPoint` has been removed and its implementation has been moved to `VolumeSnapshotCreationExtensionPoint`.

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-14T06:56:19.585Z
Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2435
File: storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java:47-47
Timestamp: 2025-08-14T06:56:19.585Z
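Learning: In VolumeSnapshotGroupBase.java, the VmInstanceResourceMetadataManager injection and the SKIP_RESOURCE_ROLLBACK marker are unused in the current version, but these imports are intentional in a large refactoring PR, kept for code consistency or in preparation for later features.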

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
📚 Learning: 2025-08-24T07:56:17.047Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotCascadeExtension.java:71-73
Timestamp: 2025-08-24T07:56:17.047Z
Learning: In VolumeSnapshotCascadeExtension, the createActionForChildResource method always wraps the context in VolumeSnapshotDeletionStructs when creating child actions, so no backward compatibility with List<VolumeSnapshotInventory> is needed in handleDeletionCleanup.

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
📚 Learning: 2025-08-14T06:48:00.549Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2435
File: header/src/main/java/org/zstack/header/storage/snapshot/group/APICheckMemorySnapshotGroupConflictMsg.java:3-7
Timestamp: 2025-08-14T06:48:00.549Z
Learning: In the ZStack project, API message classes may not require explicit imports for VolumeSnapshotGroupVO references in APIParam annotations or for the uuid() helper method used in __example__() methods, even when these classes/methods are defined in other packages. The build system or framework appears to handle these references automatically.

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-25T03:55:07.988Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2504
File: plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java:2046-2078
Timestamp: 2025-08-25T03:55:07.988Z
Learning: The OFFLINE_SNAPSHOT_COMMIT path for NFS primary storage snapshot commit operations is properly handled in the test infrastructure via NfsPrimaryStorageSpec.groovy, which includes both simulator and VFS hook implementations for testing the offline snapshot commit functionality.

Applied to files:

  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
🧬 Code graph analysis (1)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1)
core/src/main/java/org/zstack/core/db/Q.java (1)
  • Q (16-193)

…ccessfully created snapshots

Resolves: ZSV-9792

Change-Id: I6b65736e646e7163777a7872667077756771726d
@MatheMatrix MatheMatrix force-pushed the sync/tao.gan/ZSV-9792@@2 branch from b031cdb to 632d824 Compare November 13, 2025 01:36
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (1)

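224-247: Consider refactoring to improve readability and follow the coding guidelines

The current implementation is logically correct, but the following can be improved:

  1. Avoid using an array as a mutable holder (Line 235): final String[] ungroupUuid = new String[1] is an anti-pattern in Java. Prefer a clearer approach, such as an early return or an AtomicReference.

  2. Simplify the nested loops (Lines 236-243): the nested forEach calls hurt readability. Consider the Stream API's anyMatch, or an early return/break, to simplify the logic.

  3. Follow the flow-control requirements in the coding guidelines: nesting should be kept to a minimum and the Early Return pattern preferred.

Consider a refactoring along these lines: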

 private void validate(APIDeleteVolumeSnapshotGroupMsg msg) {
     VolumeSnapshotGroupVO groupVO = dbf.findByUuid(msg.getUuid(), VolumeSnapshotGroupVO.class);
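-    // Get all memory snapshots of the current VM
-    // Check whether the memory snapshots are complete
-    // 1 Complete: deletion allowed
-    // 2 Incomplete: deletion not allowed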
+    
     List<VolumeSnapshotGroupVO> groups = Q.New(VolumeSnapshotGroupVO.class)
             .eq(VolumeSnapshotGroupVO_.vmInstanceUuid, groupVO.getVmInstanceUuid())
             .orderByAsc(VolumeSnapshotGroupVO_.createDate)
             .list();
 
-    final String[] ungroupUuid = new String[1];
-    groups.forEach(group -> {
-        Set<VolumeSnapshotGroupRefVO> volumeSnapshotRefs = group.getVolumeSnapshotRefs();
-        volumeSnapshotRefs.forEach(ref -> {
-            if (ref.isSnapshotDeleted()) {
-                ungroupUuid[0] = group.getUuid();
-            }
-        });
-    });
-    if (ungroupUuid[0] != null) {
-        throw new ApiMessageInterceptionException(argerr("volume snapshot group[uuid:%s] is not complete, cannot delete volume snapshot group", ungroupUuid[0]));
+    for (VolumeSnapshotGroupVO group : groups) {
+        boolean hasDeletedSnapshot = group.getVolumeSnapshotRefs().stream()
+                .anyMatch(VolumeSnapshotGroupRefVO::isSnapshotDeleted);
+        if (hasDeletedSnapshot) {
+            throw new ApiMessageInterceptionException(argerr(
+                "volume snapshot group[uuid:%s] is not complete, cannot delete volume snapshot group", 
+                group.getUuid()));
+        }
     }
 }
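
The refactoring can be tried out on plain data. In the sketch below, each group is modeled as a UUID mapped to the "snapshot deleted" flags of its refs; the class `IncompleteGroupFinder` and its method are hypothetical. One behavioral detail is visible in the diff itself: the array-holder version overwrites `ungroupUuid[0]` on every match and therefore reports the last matching group, whereas the loop version throws on the first one. Only the UUID in the error message differs.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not ZStack code.
class IncompleteGroupFinder {
    /**
     * Returns the UUID of the first group (in map iteration order) that has
     * at least one ref flagged as deleted, or an empty Optional if none has.
     */
    static Optional<String> firstIncomplete(Map<String, List<Boolean>> groups) {
        return groups.entrySet().stream()
                .filter(e -> e.getValue().stream().anyMatch(Boolean::booleanValue))
                .map(Map.Entry::getKey)
                .findFirst();
    }
}
```

Passing a `LinkedHashMap` keeps the groups in insertion order, which stands in for the `orderByAsc(createDate)` ordering of the query.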
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b031cdb and 632d824.

📒 Files selected for processing (4)
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (3 hunks)
  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (3 hunks)
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (3 hunks)
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.java

Files:

  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
🧠 Learnings (11)
📓 Common learnings
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:1218-1224
Timestamp: 2025-08-24T06:33:10.771Z
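Learning: The ZStack team responds quickly to capacity-management issues: when a mismatch between requested and released capacity was found in the Pull snapshot flow, developers fixed it promptly to keep primary storage capacity accounting accurate.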
Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2435
File: storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java:47-47
Timestamp: 2025-08-14T06:56:19.585Z
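Learning: In VolumeSnapshotGroupBase.java, the VmInstanceResourceMetadataManager injection and the SKIP_RESOURCE_ROLLBACK marker are unused in the current version, but these imports are intentional in a large refactoring PR, kept for code consistency or in preparation for later features.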
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2360
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:449-490
Timestamp: 2025-08-04T04:48:19.103Z
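Learning: During cherry-pick operations, the ZStack project strictly follows a no-extra-changes policy even when performance optimization opportunities are found (such as a memory optimization for batch-saving IP addresses), prioritizing the completeness and consistency of the cherry-pick.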
📚 Learning: 2025-08-25T03:55:07.988Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2504
File: plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java:2046-2078
Timestamp: 2025-08-25T03:55:07.988Z
Learning: The OFFLINE_SNAPSHOT_COMMIT path for NFS primary storage snapshot commit operations is properly handled in the test infrastructure via NfsPrimaryStorageSpec.groovy, which includes both simulator and VFS hook implementations for testing the offline snapshot commit functionality.

Applied to files:

  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
📚 Learning: 2025-08-24T07:56:17.047Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotCascadeExtension.java:71-73
Timestamp: 2025-08-24T07:56:17.047Z
Learning: In VolumeSnapshotCascadeExtension, the createActionForChildResource method always wraps the context in VolumeSnapshotDeletionStructs when creating child actions, so no backward compatibility with List<VolumeSnapshotInventory> is needed in handleDeletionCleanup.

Applied to files:

  • storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-08-14T06:56:19.585Z
Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2435
File: storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java:47-47
Timestamp: 2025-08-14T06:56:19.585Z
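Learning: In VolumeSnapshotGroupBase.java, the VmInstanceResourceMetadataManager injection and the SKIP_RESOURCE_ROLLBACK marker are unused in the current version, but these imports are intentional in a large refactoring PR, kept for code consistency or in preparation for later features.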

Applied to files:

  • plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2024-06-10T19:31:27.994Z
Learnt from: AlanJager
Repo: MatheMatrix/zstack PR: 175
File: storage/src/main/java/org/zstack/storage/volume/VolumeBase.java:31-38
Timestamp: 2024-06-10T19:31:27.994Z
Learning: The user has clarified that the `MemorySnapshotGroupExtensionPoint` has been removed and its implementation has been moved to `VolumeSnapshotCreationExtensionPoint`.

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-25T03:52:37.301Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2504
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:775-819
Timestamp: 2025-08-25T03:52:37.301Z
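Learning: In the ZStack project's VolumeSnapshotTreeBase class, a VolumeVO object obtained through dbf.findByUuid() must be null-checked, because the method may return null when no matching record is found; using the result directly can cause a NullPointerException.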

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
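The null-check learning above can be illustrated with a minimal plain-JDK sketch. The class `NullCheckSketch`, its in-memory `VOLUMES` map, and both methods are hypothetical stand-ins, not ZStack's `dbf` API; they only assume a lookup that returns null on a miss.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a database facade; not ZStack code.
class NullCheckSketch {
    // In-memory "table" mapping volume UUID to volume name.
    static final Map<String, String> VOLUMES = new HashMap<>();

    // Mirrors a lookup that returns null when no record matches.
    static String findByUuid(String uuid) {
        return VOLUMES.get(uuid);
    }

    // Checks the lookup result before use instead of dereferencing it blindly.
    static String volumeNameOrFail(String uuid) {
        String name = findByUuid(uuid);
        if (name == null) {
            throw new IllegalStateException(
                    "volume[uuid:" + uuid + "] not found, it may have been deleted");
        }
        return name;
    }
}
```

Failing with a descriptive exception is one option; returning early or skipping the record may fit better when a missing volume is an expected state.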
📚 Learning: 2025-08-22T06:31:57.406Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2489
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:179-179
Timestamp: 2025-08-22T06:31:57.406Z
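Learning: In the ZStack project's VolumeSnapshotAO class, the team decided not to add an Index annotation to the treeUuid field, even though this may affect query performance. The team prioritizes code stability over performance optimization.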

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-14T06:48:00.549Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2435
File: header/src/main/java/org/zstack/header/storage/snapshot/group/APICheckMemorySnapshotGroupConflictMsg.java:3-7
Timestamp: 2025-08-14T06:48:00.549Z
Learning: In the ZStack project, API message classes may not require explicit imports for VolumeSnapshotGroupVO references in APIParam annotations or for the uuid() helper method used in __example__() methods, even when these classes/methods are defined in other packages. The build system or framework appears to handle these references automatically.

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-06-19T10:34:39.243Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2199
File: plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java:745-745
Timestamp: 2025-06-19T10:34:39.243Z
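Learning: In SecurityGroupApiInterceptor.java, all external string parameters that come from an API message (such as IP ranges, port ranges, security group UUIDs, and descriptions) should be trimmed in the validate methods, so that spaces, line feeds, carriage returns, and other invisible characters that users copy and paste from a browser do not affect the validation logic.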

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-06-19T10:34:39.243Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2199
File: plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java:745-745
Timestamp: 2025-06-19T10:34:39.243Z
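Learning: SecurityGroupApiInterceptor.java needs to trim the following kinds of external string parameters: 1) IP-related fields (getAllowedCidr, getSrcIpRange, getDstIpRange); 2) port fields (getDstPortRange); 3) UUID fields (getRemoteSecurityGroupUuid and others); 4) description fields (getDescription); 5) enum string fields (getProtocol, getAction, getState, getType). These parameters should be trimmed before they reach validateIps, validatePorts, and all other validate methods, so that spaces, line feeds, and other invisible characters do not affect the validation logic.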

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
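The trimming rule in the two learnings above can be sketched as follows. `TrimSketch`, `trimOrNull`, and `isValidPortRange` are hypothetical names for illustration and are not taken from SecurityGroupApiInterceptor; the port-range format `start-end` is also an assumption.

```java
// Hypothetical illustration of "trim before validate"; not ZStack code.
class TrimSketch {
    // Null-safe trim. String.trim() strips leading and trailing characters
    // with code point <= U+0020, which covers space, tab, CR, and LF.
    static String trimOrNull(String value) {
        return value == null ? null : value.trim();
    }

    // Validates "port" or "start-end" after trimming the raw input first.
    static boolean isValidPortRange(String raw) {
        String range = trimOrNull(raw);
        if (range == null || range.isEmpty()) {
            return false;
        }
        // Limit -1 keeps trailing empty parts, so "80-" is rejected below.
        String[] parts = range.split("-", -1);
        if (parts.length > 2) {
            return false;
        }
        try {
            int start = Integer.parseInt(parts[0].trim());
            int end = parts.length == 2 ? Integer.parseInt(parts[1].trim()) : start;
            return 1 <= start && start <= end && end <= 65535;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

Trimming once at the entry point keeps the later checks free of whitespace handling, so a value pasted with a trailing newline validates the same way as the clean value.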
📚 Learning: 2025-04-21T03:20:20.087Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2024
File: compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java:373-388
Timestamp: 2025-04-21T03:20:20.087Z
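Learning: In ZStack, validation logic should throw `ApiMessageInterceptionException` rather than `OperationFailureException` to keep exception handling consistent. This applies in particular to parameter validation in the VM system.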

Applied to files:

  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
🧬 Code graph analysis (2)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1)
core/src/main/java/org/zstack/core/db/Q.java (1)
  • Q (16-193)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (1)
core/src/main/java/org/zstack/core/db/Q.java (1)
  • Q (16-193)
🔇 Additional comments (8)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (2)

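540-540: LGTM! The snapshot inventory collection mechanism is well designed

A synchronized list collects the snapshots created in parallel, which gives the failure path the context it needs for rollback cleanup. This matches the PR goal of "deleting the snapshots that were created successfully when snapshot group creation fails".


587-598: The rollback cleanup logic is correct and follows error-handling best practices

When any snapshot creation fails, the code iterates over the snapshots that were created successfully and sends a deletion message for each one. Sending the cleanup messages in fire-and-forget mode is reasonable because this is best-effort cleanup. The deletion message parameters (Scope=Single, Direction=Commit) match the semantics of cleaning up a single snapshot.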
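The collect-then-roll-back pattern described above can be sketched with plain JDK types. This is a minimal illustration under stated assumptions: `SnapshotRollbackSketch`, `takeAll`, and the `creator`/`deleter` callbacks are hypothetical, and the real factories use ZStack's message bus rather than `CompletableFuture`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical stand-in for a parallel snapshot path; not ZStack code.
class SnapshotRollbackSketch {
    // Creates one snapshot per volume in parallel. Returns the created snapshot
    // IDs when every creation succeeds. If any creation fails, asks the deleter
    // to remove each snapshot that was created and returns an empty list.
    static List<String> takeAll(List<String> volumes,
                                Function<String, String> creator,
                                Consumer<String> deleter) {
        // Both lists are written from worker threads, so they are synchronized.
        List<String> created = Collections.synchronizedList(new ArrayList<>());
        List<Throwable> errors = Collections.synchronizedList(new ArrayList<>());

        List<CompletableFuture<Void>> tasks = new ArrayList<>();
        for (String volume : volumes) {
            tasks.add(CompletableFuture.runAsync(() -> {
                try {
                    created.add(creator.apply(volume));
                } catch (RuntimeException e) {
                    errors.add(e);
                }
            }));
        }
        // Wait until every task has finished, whether it succeeded or failed.
        CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0])).join();

        if (errors.isEmpty()) {
            return new ArrayList<>(created);
        }
        // Best-effort rollback of the partial result.
        for (String snapshot : new ArrayList<>(created)) {
            deleter.accept(snapshot);
        }
        return Collections.emptyList();
    }
}
```

The rollback runs only after all tasks have completed, so a snapshot that finishes late cannot escape the cleanup pass.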

storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (2)

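34-34: Import is correct

The new Set import is used by the new validation logic, as required.


78-79: Message routing is correct

A validation route is added for APIDeleteVolumeSnapshotGroupMsg, consistent with the existing message interception pattern.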

storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (2)

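2087-2109: Excellent design for the asynchronous, queue-based refactoring

The snapshot group ungrouping operation is refactored into a queue-based asynchronous processing model, with the following advantages:

  1. Early-exit optimization (Lines 2092-2095): returns immediately when there are no group UUIDs, avoiding unnecessary work.
  2. Independent task per group (Lines 2098-2109): each snapshot group uses its own queue task (ungroup-volumeSnapshotGroup-<uuid>), which improves concurrent processing.
  3. Consistent with the overall PR design: together with the integrity check in the API interceptor and the rollback cleanup in primary storage, it forms complete lifecycle management for snapshot groups.

This refactoring replaces synchronous direct deletion with asynchronous per-group processing, improving the scalability and robustness of the system.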
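The per-group queueing idea can be sketched with plain JDK types: tasks that share a key run one after another, while tasks with different keys may run concurrently. `PerKeyQueueSketch` and `submit` are hypothetical names, and chaining `CompletableFuture` instances is a swapped-in technique for illustration, not the queue mechanism used in VolumeSnapshotTreeBase.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Hypothetical per-key serial queue; not ZStack code.
class PerKeyQueueSketch {
    // Last scheduled task for each key. Entries are never removed here,
    // which a long-running service would need to address.
    private final ConcurrentHashMap<String, CompletableFuture<Void>> tails =
            new ConcurrentHashMap<>();
    // Assumed to be a real thread pool, not a same-thread executor.
    private final Executor executor;

    PerKeyQueueSketch(Executor executor) {
        this.executor = executor;
    }

    // Appends the task to the chain for this key and returns its future.
    CompletableFuture<Void> submit(String key, Runnable task) {
        return tails.compute(key, (k, tail) -> {
            CompletableFuture<Void> previous = tail == null
                    ? CompletableFuture.<Void>completedFuture(null)
                    // A failed predecessor must not block later tasks.
                    : tail.exceptionally(e -> null);
            return previous.thenRunAsync(task, executor);
        });
    }
}
```

A key such as `"ungroup-volumeSnapshotGroup-" + groupUuid` would serialize cleanup for one group without delaying cleanup for other groups.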


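2111-2122: The per-group cleanup logic is correct and follows best practices

The new ungroupAfterDeleted(VolumeSnapshotGroupVO groupVO, NoErrorCompletion completion) method is implemented sensibly:

  1. Clear condition check (Line 2112): uses allMatch() to determine whether all snapshots in the group have been deleted; the code is concise and easy to read.
  2. Early-return pattern (Lines 2113-2116): follows the Early Return principle from the coding guidelines and returns immediately when the deletion condition is not met.
  3. Complete cleanup flow (Lines 2118-2121): deletes the archive metadata first and then the database record, which is a sensible order.

The method works well with the refactored queue-based invocation and ensures that snapshot groups are cleaned up correctly.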
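The check-then-clean flow described above can be sketched as follows. `UngroupSketch`, `Ref`, and the two `Runnable` callbacks are hypothetical stand-ins; the actual method works on VolumeSnapshotGroupVO and a NoErrorCompletion, which are not reproduced here.

```java
import java.util.List;

// Hypothetical per-group cleanup; not ZStack code.
class UngroupSketch {
    // One group member: a snapshot reference and whether that snapshot is gone.
    static final class Ref {
        final String snapshotUuid;
        final boolean snapshotDeleted;

        Ref(String snapshotUuid, boolean snapshotDeleted) {
            this.snapshotUuid = snapshotUuid;
            this.snapshotDeleted = snapshotDeleted;
        }
    }

    // Removes the group only when every referenced snapshot has been deleted.
    // Returns true if the group was removed.
    static boolean ungroupAfterDeleted(List<Ref> refs,
                                       Runnable deleteArchiveMetadata,
                                       Runnable removeGroupRecord) {
        // Note: allMatch returns true for an empty list, so a group with no
        // references is treated as fully deleted.
        boolean allDeleted = refs.stream().allMatch(ref -> ref.snapshotDeleted);
        if (!allDeleted) {
            return false; // early return: the group still has live snapshots
        }
        deleteArchiveMetadata.run(); // metadata first
        removeGroupRecord.run();     // then the database record
        return true;
    }
}
```

Deleting the metadata before the record means a crash between the two steps leaves a record that a later pass can still find and clean up.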

plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (2)

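745-745: The thread-safe snapshot collection list is initialized correctly

Using Collections.synchronizedList ensures thread safety during parallel snapshot creation.


783-783: The successful-snapshot collection logic is correct

After a snapshot is created successfully, its information is correctly added to the collection list, which the cleanup relies on if a later creation fails.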

Comment on lines +792 to +802

inventories.forEach(snapshot -> {
VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
msg.setSnapshotUuid(snapshot.getUuid());
msg.setTreeUuid(snapshot.getTreeUuid());
msg.setVolumeUuid(snapshot.getVolumeUuid());
msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
bus.send(msg);
});
⚠️ Potential issue | 🟠 Major

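Suggest adding error handling and logging to the snapshot cleanup

The current cleanup logic is fire-and-forget, which carries the following risks:

  1. There is no way to verify that a snapshot deletion succeeded
  2. If a deletion fails, orphaned snapshots may be left behind
  3. Without logging, the cleanup is hard to trace

Suggested improvement: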

                 if (!errList.getCauses().isEmpty()) {
                     completion.fail(errList.getCauses().get(0));
 
+                    logger.warn(String.format("snapshot creation failed, cleaning up %d successfully created snapshots", inventories.size()));
                     inventories.forEach(snapshot -> {
                         VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
                         msg.setSnapshotUuid(snapshot.getUuid());
                         msg.setTreeUuid(snapshot.getTreeUuid());
                         msg.setVolumeUuid(snapshot.getVolumeUuid());
                         msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
                         msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
                         bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
-                        bus.send(msg);
+                        bus.send(msg, new CloudBusCallBack(null) {
+                            @Override
+                            public void run(MessageReply reply) {
+                                if (!reply.isSuccess()) {
+                                    logger.warn(String.format("failed to delete snapshot[uuid:%s] during cleanup: %s", 
+                                        snapshot.getUuid(), reply.getError()));
+                                } else {
+                                    logger.debug(String.format("successfully deleted snapshot[uuid:%s] during cleanup", 
+                                        snapshot.getUuid()));
+                                }
+                            }
+                        });
                     });
                     return;
                 }

This would:

  • Record how the cleanup ran
  • Capture and log deletion failures
  • Improve the observability and maintainability of the system

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
around lines 792 to 802, the snapshot deletion operation uses a fire-and-forget
approach by calling bus.send(msg) without handling responses or adding logging,
which makes it impossible to verify deletion success or diagnose failures.
Modify this code to add error handling and logging: replace the simple
bus.send(msg) call with bus.send(msg, new CloudBusCallBack(null) {...}) to
add a response handler that logs successful deletions and captures any errors
that occur, ensuring you also add a log statement before sending to record that
the deletion operation was initiated.
