<fix>[storage]: when the snapshot group creation fails, delete the successfully created snapshots #2899
base: zsv_4.10.20
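Walkthrough
The parallel snapshot-creation path now collects the snapshots that were created successfully and rolls them back with a cleanup on failure; volume snapshot group deletion is refactored into queue-driven, per-group asynchronous cleanup; and the snapshot API interceptor gains a completeness check for delete-snapshot-group requests.
Changes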
Sequence Diagram(s)
sequenceDiagram
participant Client
participant Factory
participant SnapshotSvc
participant DB
Client->>Factory: takeSnapshots()
Factory->>Factory: init synchronized inventories
par Parallel snapshots
Factory->>SnapshotSvc: createSnapshot(idA)
SnapshotSvc-->>Factory: success (inventory A)
Factory->>Factory: inventories.add(A)
and
Factory->>SnapshotSvc: createSnapshot(idB)
SnapshotSvc-->>Factory: failure
end
alt all success
Factory-->>Client: return inventories
else some failure
Factory->>Factory: for each inventory in inventories
Factory->>SnapshotSvc: VolumeSnapshotDeletionMsg(inventory)
SnapshotSvc->>DB: delete snapshot records
Factory-->>Client: return error
end
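The first diagram's collect-then-roll-back flow can be sketched with plain JDK types. This is a minimal illustration only: SnapshotRollbackSketch, creator and deleter are hypothetical stand-ins for the factory, the take-snapshot call and VolumeSnapshotDeletionMsg, and are not ZStack APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical stand-in for the take-then-roll-back flow; not ZStack code. */
final class SnapshotRollbackSketch {
    /**
     * Creates one snapshot per volume in parallel. If any creation fails,
     * every snapshot that was created is handed to deleter and the first
     * failure is rethrown; otherwise the created snapshots are returned.
     */
    static List<String> takeSnapshots(List<String> volumeUuids,
                                      Function<String, String> creator,
                                      Consumer<String> deleter) {
        // Both lists are written from worker threads, so they are synchronized.
        List<String> created = Collections.synchronizedList(new ArrayList<>());
        List<Throwable> errors = Collections.synchronizedList(new ArrayList<>());

        List<CompletableFuture<Void>> tasks = new ArrayList<>();
        for (String volumeUuid : volumeUuids) {
            tasks.add(CompletableFuture.runAsync(() -> {
                try {
                    created.add(creator.apply(volumeUuid));
                } catch (RuntimeException e) {
                    errors.add(e);
                }
            }));
        }
        // Wait for every task so the rollback sees the complete success list.
        CompletableFuture.allOf(tasks.toArray(new CompletableFuture[0])).join();

        if (!errors.isEmpty()) {
            new ArrayList<>(created).forEach(deleter);
            throw new IllegalStateException("snapshot group creation failed", errors.get(0));
        }
        return new ArrayList<>(created);
    }
}
```

Waiting for all tasks before rolling back matters: a rollback started on the first failure could miss a snapshot whose creation finishes afterwards.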
sequenceDiagram
participant API as VolumeSnapshotApiInterceptor
participant DB
participant Base as VolumeSnapshotTreeBase
participant Queue
participant VIDM
API->>DB: load target GroupVO (APIDeleteVolumeSnapshotGroupMsg)
API->>DB: enumerate other groups for same VM
alt some group incomplete
API-->>Caller: throw ApiMessageInterceptionException (reject)
else all complete
API-->>Caller: continue
Caller->>Base: perform group deletion
Base->>Base: collect non-null groupUuids
Base->>Queue: enqueue ungroupAfterDeleted per group
loop per group
Queue->>Base: ungroupAfterDeleted(groupVO)
alt allSnapshotsDeleted
Base->>VIDM: delete archive metadata
Base->>DB: remove VolumeSnapshotGroupVO
else
Base-->>Queue: log & complete
end
end
end
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
Needs extra attention:
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (1)
540-604: Concurrent error collection and a possible NPE: use While's error aggregation and add a check for a missing snapshot row
- Multiple CloudBus callbacks run concurrently and add elements directly to errList.getCauses(), which is not thread-safe; use whileCompletion.addError(...) and decide failure from the errorCodeList passed to done(...).
- Q.New(...).find() may return null, in which case vo.getStatus() throws an NPE; check for null before use and treat it as a failure path.
- Avoid naming the variable in the cleanup loop msg, which shadows the outer message; renaming it improves readability.
Suggested change:
- List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());
- ErrorCodeList errList = new ErrorCodeList();
+ List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());
  new While<>(storageSnapshots).all((struct, whileCompletion) -> {
-     VolumeSnapshotVO vo = Q.New(VolumeSnapshotVO.class).eq(VolumeSnapshotVO_.uuid, struct.getResourceUuid()).find();
-     if (vo.getStatus().equals(VolumeSnapshotStatus.Ready)) {
+     VolumeSnapshotVO vo = Q.New(VolumeSnapshotVO.class)
+             .eq(VolumeSnapshotVO_.uuid, struct.getResourceUuid()).find();
+     if (vo == null) {
+         whileCompletion.addError(operr("snapshot[%s] not found when creating on storage", struct.getResourceUuid()));
+         whileCompletion.done();
+         return;
+     }
+     if (VolumeSnapshotStatus.Ready.equals(vo.getStatus())) {
          logger.warn(String.format("snapshot %s on volume %s is ready, no need to create again!", vo.getUuid(), vo.getVolumeUuid()));
          whileCompletion.done();
          return;
      }
      ...
      bus.send(tmsg, new CloudBusCallBack(msg) {
          @Override
          public void run(MessageReply reply) {
              if (!reply.isSuccess()) {
-                 errList.getCauses().add(reply.getError());
+                 whileCompletion.addError(reply.getError());
                  whileCompletion.done();
                  return;
              }
              TakeSnapshotReply treply = reply.castReply();
              if (!treply.isSuccess()) {
-                 errList.getCauses().add(reply.getError());
+                 whileCompletion.addError(reply.getError());
                  whileCompletion.done();
                  return;
              }
              ...
              inventories.add(treply.getInventory());
              whileCompletion.done();
          }
      });
- }).run(new WhileDoneCompletion(completion) {
+ }).run(new WhileDoneCompletion(completion) {
      @Override
      public void done(ErrorCodeList errorCodeList) {
-         if (!errList.getCauses().isEmpty()) {
-             completion.fail(errList.getCauses().get(0));
+         if (!errorCodeList.getCauses().isEmpty()) {
+             completion.fail(errorCodeList.getCauses().get(0));
-             inventories.forEach(snapshot -> {
-                 VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
-                 msg.setSnapshotUuid(snapshot.getUuid());
-                 msg.setTreeUuid(snapshot.getTreeUuid());
-                 msg.setVolumeUuid(snapshot.getVolumeUuid());
-                 msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
-                 msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
-                 bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
-                 bus.send(msg);
+             inventories.forEach(snapshot -> {
+                 VolumeSnapshotDeletionMsg delMsg = new VolumeSnapshotDeletionMsg();
+                 delMsg.setSnapshotUuid(snapshot.getUuid());
+                 delMsg.setTreeUuid(snapshot.getTreeUuid());
+                 delMsg.setVolumeUuid(snapshot.getVolumeUuid());
+                 delMsg.setScope(DeleteVolumeSnapshotScope.Single.toString());
+                 delMsg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
+                 bus.makeTargetServiceIdByResourceUuid(delMsg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
+                 bus.send(delMsg);
              });
              return;
          }
          completion.success();
      }
  });
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (1)
745-808: Same problems as the external primary storage: concurrent error collection and a possible NPE; please fix them consistently
- errList is added to directly from concurrent callbacks, which is not thread-safe; use whileCompletion.addError(...) and use errorCodeList in done(...).
- Q.New(...).find() may return null, in which case vo.getStatus() throws an NPE; check for null and take the failure path.
- The local variable msg in the cleanup loop shadows the outer one; suggest renaming it to delMsg.
Reference revision:
- List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());
- ErrorCodeList errList = new ErrorCodeList();
+ List<VolumeSnapshotInventory> inventories = Collections.synchronizedList(new ArrayList<>());
  new While<>(cephStructs).all((struct, whileCompletion) -> {
      VolumeSnapshotVO vo = Q.New(VolumeSnapshotVO.class).eq(VolumeSnapshotVO_.uuid, struct.getResourceUuid()).find();
-     if (vo.getStatus().equals(VolumeSnapshotStatus.Ready)) {
+     if (vo == null) {
+         whileCompletion.addError(operr("snapshot[%s] not found when creating on ceph", struct.getResourceUuid()));
+         whileCompletion.done();
+         return;
+     }
+     if (VolumeSnapshotStatus.Ready.equals(vo.getStatus())) {
          ...
      }
      ...
          public void run(MessageReply reply) {
              if (!reply.isSuccess()) {
-                 errList.getCauses().add(reply.getError());
+                 whileCompletion.addError(reply.getError());
                  whileCompletion.done();
                  return;
              }
              TakeSnapshotReply treply = reply.castReply();
              if (!treply.isSuccess()) {
-                 errList.getCauses().add(reply.getError());
+                 whileCompletion.addError(reply.getError());
                  whileCompletion.done();
                  return;
              }
              ...
              inventories.add(treply.getInventory());
              whileCompletion.done();
          }
      });
- }).run(new WhileDoneCompletion(completion) {
+ }).run(new WhileDoneCompletion(completion) {
      @Override
      public void done(ErrorCodeList errorCodeList) {
-         if (!errList.getCauses().isEmpty()) {
-             completion.fail(errList.getCauses().get(0));
+         if (!errorCodeList.getCauses().isEmpty()) {
+             completion.fail(errorCodeList.getCauses().get(0));
-             inventories.forEach(snapshot -> {
-                 VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
-                 msg.setSnapshotUuid(snapshot.getUuid());
-                 msg.setTreeUuid(snapshot.getTreeUuid());
-                 msg.setVolumeUuid(snapshot.getVolumeUuid());
-                 msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
-                 msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
-                 bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
-                 bus.send(msg);
+             inventories.forEach(snapshot -> {
+                 VolumeSnapshotDeletionMsg delMsg = new VolumeSnapshotDeletionMsg();
+                 delMsg.setSnapshotUuid(snapshot.getUuid());
+                 delMsg.setTreeUuid(snapshot.getTreeUuid());
+                 delMsg.setVolumeUuid(snapshot.getVolumeUuid());
+                 delMsg.setScope(DeleteVolumeSnapshotScope.Single.toString());
+                 delMsg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
+                 bus.makeTargetServiceIdByResourceUuid(delMsg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
+                 bus.send(delMsg);
              });
              return;
          }
          completion.success();
      }
  });
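The two points raised for both factories (thread-safe error aggregation and a null check before reading the status) can be illustrated outside ZStack. The sketch below is an assumption-laden stand-in: a parallel stream plays the role of While, a synchronized list plays the role of whileCompletion.addError(...), and statusByUuid is a hypothetical lookup table standing in for Q.New(...).find().

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the review points; not the ZStack While API. */
final class ErrorAggregationSketch {
    /** Returns one error message per snapshot uuid that has no row in the lookup table. */
    static List<String> collectErrors(List<String> snapshotUuids, Map<String, String> statusByUuid) {
        // Written from several threads at once, so a plain ArrayList would not be safe here.
        List<String> errors = Collections.synchronizedList(new ArrayList<>());
        snapshotUuids.parallelStream().forEach(uuid -> {
            String status = statusByUuid.get(uuid); // may be null, like a lookup that finds no row
            if (status == null) {
                errors.add(String.format("snapshot[%s] not found", uuid));
                return;
            }
            // Constant-first equals never dereferences a null status.
            if ("Ready".equals(status)) {
                return;
            }
            // A real implementation would send the creation request here.
        });
        return new ArrayList<>(errors);
    }
}
```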
🧹 Nitpick comments (3)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (1)
369-371: Boolean operator style: prefer || over |
shareable | vol.isShareable() works, but it does not short-circuit and is not the conventional style; shareable || vol.isShareable() is recommended for readability.
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (2)
2092-2109: Avoid duplicate queue tasks: deduplicate groupUuids
The code collects each groupUuid and enqueues them one by one; if the same group appears more than once it is enqueued repeatedly, which adds needless overhead and may trigger duplicate deletion. Suggest distinct():
- List<String> groupUuids = snapshots.stream().map(VolumeSnapshotInventory::getGroupUuid).filter(Objects::nonNull).collect(Collectors.toList());
+ List<String> groupUuids = snapshots.stream()
+         .map(VolumeSnapshotInventory::getGroupUuid)
+         .filter(Objects::nonNull)
+         .distinct()
+         .collect(Collectors.toList());
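As a self-contained illustration of the suggested pipeline, the sketch below applies the same filter and distinct steps to plain strings. GroupUuidDedupSketch and its method are hypothetical names; the input list stands in for the group UUID read from each snapshot inventory.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical helper showing null filtering plus de-duplication of group UUIDs. */
final class GroupUuidDedupSketch {
    /** Drops nulls and repeated UUIDs while keeping first-seen order. */
    static List<String> distinctGroupUuids(List<String> groupUuidPerSnapshot) {
        return groupUuidPerSnapshot.stream()
                .filter(Objects::nonNull)   // snapshots that belong to no group
                .distinct()                 // one queue task per group, not per snapshot
                .collect(Collectors.toList());
    }
}
```

For a list source, distinct() keeps the first occurrence of each UUID, so the enqueue order still follows the order of the snapshots.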
2111-2122: Null-collection defense: guard against null before accessing groupVO.getVolumeSnapshotRefs()
Some ORM mappings may return a null list; to avoid an NPE, convert it to an empty collection before calling allMatch:
- if (!groupVO.getVolumeSnapshotRefs().stream().allMatch(VolumeSnapshotGroupRefVO::isSnapshotDeleted)) {
+ List<VolumeSnapshotGroupRefVO> refs = Optional.ofNullable(groupVO.getVolumeSnapshotRefs())
+         .orElse(Collections.emptyList());
+ if (!refs.stream().allMatch(VolumeSnapshotGroupRefVO::isSnapshotDeleted)) {
      logger.debug(String.format("skipping ungroup operation for snapshot group[uuid:%s]: " +
              "no group meet deletion criteria (due to remaining volume snapshots).", groupVO.getUuid()));
      completion.done();
      return;
  }
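One consequence of this guard is worth seeing in isolation: allMatch on an empty stream returns true, so a null collection that is replaced by an empty one counts as "all snapshots deleted". The sketch below uses a hypothetical list of Boolean flags in place of the ref objects. Separately, the validate code reviewed later on this page reads getVolumeSnapshotRefs() into a Set, so an empty Set may be the default that matches the real type.

```java
import java.util.Collections;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration of the null guard; flags stand in for ref.isSnapshotDeleted(). */
final class AllDeletedSketch {
    /** Returns true when every flag is true, and also when the list is null or empty. */
    static boolean allSnapshotsDeleted(List<Boolean> deletedFlags) {
        List<Boolean> flags = Optional.ofNullable(deletedFlags).orElse(Collections.emptyList());
        // allMatch is vacuously true for an empty stream.
        return flags.stream().allMatch(Boolean::booleanValue);
    }
}
```

Whether a group with no refs at all should be removed is a design decision for the caller; the guard only prevents the NPE.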
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (3 hunks)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (3 hunks)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.java
⚙️ CodeRabbit configuration file
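**/*.java:
1. API design requirements
- API naming:
- API names must be unique; duplicates are not allowed.
- An API message class must extend APIMessage; its return class must extend APIReply or APIEvent and be marked with @RestResponse.
- An API message must carry the @RestRequest annotation and follow these rules:
path:
- Use the plural form for resources.
- When the path references a message-class variable, use the {variableName} format.
- HTTP method mapping:
- Query → HttpMethod.GET
- Update → HttpMethod.PUT
- Create → HttpMethod.POST
- Delete → HttpMethod.DELETE
- An API class must implement the __example__ method so that API documentation can be generated, and the corresponding Groovy API Template and API Markdown files must be generated.
2. Naming and formatting conventions
Class names:
- Use UpperCamelCase.
- Special cases:
- VO/AO/EO type classes are exempt.
- Abstract classes use an Abstract or Base prefix/suffix.
- Exception classes should end with Exception.
- Test classes must end with Test or Case.
Method names, parameter names, member variables, and local variables:
- Use lowerCamelCase.
Constant naming:
- All uppercase, with words separated by underscores.
- Names must be clear; avoid vague or inaccurate names.
Package names:
- All lowercase and dot-separated; each part should be an English word with a natural meaning (see the structure of the Spring framework).
Naming details:
- Avoid members or local variables with the same name in parent/child classes or in the same code block, to prevent confusion.
- Abbreviations:
- Do not use unnecessary abbreviations such as AbsSchedulerJob, condi, or Fu. Use full words to improve readability.
3. Writing self-explanatory code
Expressing intent:
- Avoid boolean parameters that make the meaning unclear. For example:
- For stopAgent(boolean ignoreError), split it into separate functions (such as stopAgentIgnoreError()) or use an enum to express the operation type.
- Names should combine full words to express intent and reflect the data type or purpose (for example, put the type word at the end of constant and variable names).
Comments:
- Code should be as self-explanatory as possible; explanations shorter than two lines can be written directly in the code.
- Longer comments must be proofread carefully and updated along with the code so that they stay correct.
- Interface methods should not carry redundant modifiers (such as public) and must have valid Javadoc comments.
4. Flow control and structure
Use of if...else:
- Minimize the use of if...else structures:
- Limit nesting to at most two levels, and the inner level should not have an else branch.
- Return early: end the handling logic of a condition early, or extract it into a separate method.
- Use Java Streams or lambda expressions in place of lengthy loops and conditionals.
Conditions:
- if conditions should not be too long or too complex; extract them into a descriptive boolean variable when necessary.
Block length:
- A single if block should not exceed one screen, for readability and maintainability.
5. Exception handling and logging
- Principles for catching exceptions:
- For RuntimeExceptions that can be avoided by pre-checks (such as NullPointerException and IndexOutOfBoundsException), try-catch is not recommended.
- Catch exceptions only for genuinely unexpected situations; do not use exception logic as normal flow control.
- When necessary, rethrow the exception so that upper-layer business handlers can convert it into a user-friendly error message.
- Use try-with-resources to manage resources, make sure resources are closed correctly in finally blocks, and avoid returning values in finally.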
...
Files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
🧠 Learnings (8)
📓 Common learnings
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:1218-1224
Timestamp: 2025-08-24T06:33:10.771Z
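Learning: The ZStack team responds quickly to capacity-management issues: when a mismatch between requested and released capacity was found in the Pull snapshot flow, developers fixed it promptly to keep primary storage capacity accounting accurate.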
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2360
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:449-490
Timestamp: 2025-08-04T04:48:19.103Z
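Learning: In cherry-pick operations, the ZStack project strictly follows a policy of making no extra changes, even when a performance optimization opportunity is found (such as the memory optimization for batch-saving IP addresses), and prioritizes the completeness and consistency of the cherry-pick.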
📚 Learning: 2025-08-25T03:52:37.301Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2504
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:775-819
Timestamp: 2025-08-25T03:52:37.301Z
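Learning: In the ZStack project's VolumeSnapshotTreeBase class, a VolumeVO obtained through dbf.findByUuid() needs a null check, because the method may return null when no matching record is found, and using the result directly can cause a NullPointerException.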
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-22T06:31:57.406Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2489
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:179-179
Timestamp: 2025-08-22T06:31:57.406Z
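Learning: In the ZStack project's VolumeSnapshotAO class, the team decided not to add an Index annotation to the treeUuid field, even though this may affect query performance. The team prioritizes code stability over performance optimization.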
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2024-06-10T19:31:27.994Z
Learnt from: AlanJager
Repo: MatheMatrix/zstack PR: 175
File: storage/src/main/java/org/zstack/storage/volume/VolumeBase.java:31-38
Timestamp: 2024-06-10T19:31:27.994Z
Learning: The user has clarified that the `MemorySnapshotGroupExtensionPoint` has been removed and its implementation has been moved to `VolumeSnapshotCreationExtensionPoint`.
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-14T06:56:19.585Z
Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2435
File: storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java:47-47
Timestamp: 2025-08-14T06:56:19.585Z
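Learning: In VolumeSnapshotGroupBase.java, the VmInstanceResourceMetadataManager injection and the SKIP_RESOURCE_ROLLBACK flag are unused in the current version, but these imports are intentional in a large refactoring PR, kept for code consistency or in preparation for later features.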
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
📚 Learning: 2025-08-24T07:56:17.047Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotCascadeExtension.java:71-73
Timestamp: 2025-08-24T07:56:17.047Z
Learning: In VolumeSnapshotCascadeExtension, the createActionForChildResource method always wraps the context in VolumeSnapshotDeletionStructs when creating child actions, so no backward compatibility with List<VolumeSnapshotInventory> is needed in handleDeletionCleanup.
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
📚 Learning: 2025-08-14T06:48:00.549Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2435
File: header/src/main/java/org/zstack/header/storage/snapshot/group/APICheckMemorySnapshotGroupConflictMsg.java:3-7
Timestamp: 2025-08-14T06:48:00.549Z
Learning: In the ZStack project, API message classes may not require explicit imports for VolumeSnapshotGroupVO references in APIParam annotations or for the uuid() helper method used in __example__() methods, even when these classes/methods are defined in other packages. The build system or framework appears to handle these references automatically.
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-25T03:55:07.988Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2504
File: plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java:2046-2078
Timestamp: 2025-08-25T03:55:07.988Z
Learning: The OFFLINE_SNAPSHOT_COMMIT path for NFS primary storage snapshot commit operations is properly handled in the test infrastructure via NfsPrimaryStorageSpec.groovy, which includes both simulator and VFS hook implementations for testing the offline snapshot commit functionality.
Applied to files:
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
🧬 Code graph analysis (1)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1)
core/src/main/java/org/zstack/core/db/Q.java (1)
Q(16-193)
…ccessfully created snapshots Resolves: ZSV-9792 Change-Id: I6b65736e646e7163777a7872667077756771726d
b031cdb to 632d824
Actionable comments posted: 1
🧹 Nitpick comments (1)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (1)
224-247: Suggest refactoring for readability and conformance to the coding guidelines
The current implementation is logically correct, but the following could be improved:
- Avoid using an array as a mutable holder (Line 235): final String[] ungroupUuid = new String[1] is an anti-pattern in Java. A clearer approach is an early return or an AtomicReference.
- Simplify the nested loops (Lines 236-243): the nested forEach hurts readability. Use the Stream API's anyMatch, or an early return/break, to simplify the logic.
- Follow the flow-control requirements of the coding guidelines: keep nesting to a minimum and prefer the Early Return pattern.
A possible refactoring:
  private void validate(APIDeleteVolumeSnapshotGroupMsg msg) {
      VolumeSnapshotGroupVO groupVO = dbf.findByUuid(msg.getUuid(), VolumeSnapshotGroupVO.class);
-     // get all memory snapshots of the current VM
-     // check whether each memory snapshot is complete
-     // 1 complete: deletion allowed
-     // 2 incomplete: deletion not allowed
+
      List<VolumeSnapshotGroupVO> groups = Q.New(VolumeSnapshotGroupVO.class)
              .eq(VolumeSnapshotGroupVO_.vmInstanceUuid, groupVO.getVmInstanceUuid())
              .orderByAsc(VolumeSnapshotGroupVO_.createDate)
              .list();
-     final String[] ungroupUuid = new String[1];
-     groups.forEach(group -> {
-         Set<VolumeSnapshotGroupRefVO> volumeSnapshotRefs = group.getVolumeSnapshotRefs();
-         volumeSnapshotRefs.forEach(ref -> {
-             if (ref.isSnapshotDeleted()) {
-                 ungroupUuid[0] = group.getUuid();
-             }
-         });
-     });
-     if (ungroupUuid[0] != null) {
-         throw new ApiMessageInterceptionException(argerr("volume snapshot group[uuid:%s] is not complete, cannot delete volume snapshot group", ungroupUuid[0]));
+     for (VolumeSnapshotGroupVO group : groups) {
+         boolean hasDeletedSnapshot = group.getVolumeSnapshotRefs().stream()
+                 .anyMatch(VolumeSnapshotGroupRefVO::isSnapshotDeleted);
+         if (hasDeletedSnapshot) {
+             throw new ApiMessageInterceptionException(argerr(
+                     "volume snapshot group[uuid:%s] is not complete, cannot delete volume snapshot group",
+                     group.getUuid()));
+         }
      }
  }
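The anyMatch-based check can be tried out with plain collections. In this sketch GroupCompletenessSketch and the map of flags are hypothetical: each key stands for a group UUID and each list for the isSnapshotDeleted() values of that group's refs, in creation order.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for the interceptor's completeness check; not ZStack code. */
final class GroupCompletenessSketch {
    /** Returns the uuid of the first group that contains a deleted snapshot, if any. */
    static Optional<String> firstIncompleteGroup(Map<String, List<Boolean>> deletedFlagsByGroup) {
        return deletedFlagsByGroup.entrySet().stream()
                .filter(entry -> entry.getValue().stream().anyMatch(Boolean::booleanValue))
                .map(Map.Entry::getKey)
                .findFirst();
    }
}
```

One behavioral difference between the two versions in the diff is visible here: the nested forEach overwrites ungroupUuid[0] on every match, so it reports the last incomplete group, while the loop with an early throw (and findFirst in this sketch) reports the first. Only the UUID in the error message changes; the request is rejected in both cases.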
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (3 hunks)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (3 hunks)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (3 hunks)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.java
⚙️ CodeRabbit configuration file
Files:
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
🧠 Learnings (11)
📓 Common learnings
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:1218-1224
Timestamp: 2025-08-24T06:33:10.771Z
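Learning: The ZStack team responds quickly to capacity-management issues: when a mismatch between requested and released capacity was found in the Pull snapshot flow, developers fixed it promptly to keep primary storage capacity accounting accurate.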
Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2435
File: storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java:47-47
Timestamp: 2025-08-14T06:56:19.585Z
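Learning: In VolumeSnapshotGroupBase.java, the VmInstanceResourceMetadataManager injection and the SKIP_RESOURCE_ROLLBACK flag are unused in the current version, but these imports are intentional in a large refactoring PR, kept for code consistency or in preparation for later features.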
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2360
File: network/src/main/java/org/zstack/network/l3/L3BasicNetwork.java:449-490
Timestamp: 2025-08-04T04:48:19.103Z
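Learning: In cherry-pick operations, the ZStack project strictly follows a policy of making no extra changes, even when a performance optimization opportunity is found (such as the memory optimization for batch-saving IP addresses), and prioritizes the completeness and consistency of the cherry-pick.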
📚 Learning: 2025-08-25T03:55:07.988Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2504
File: plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java:2046-2078
Timestamp: 2025-08-25T03:55:07.988Z
Learning: The OFFLINE_SNAPSHOT_COMMIT path for NFS primary storage snapshot commit operations is properly handled in the test infrastructure via NfsPrimaryStorageSpec.groovy, which includes both simulator and VFS hook implementations for testing the offline snapshot commit functionality.
Applied to files:
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
📚 Learning: 2025-08-24T07:56:17.047Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2496
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotCascadeExtension.java:71-73
Timestamp: 2025-08-24T07:56:17.047Z
Learning: In VolumeSnapshotCascadeExtension, the createActionForChildResource method always wraps the context in VolumeSnapshotDeletionStructs when creating child actions, so no backward compatibility with List<VolumeSnapshotInventory> is needed in handleDeletionCleanup.
Applied to files:
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-08-14T06:56:19.585Z
Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2435
File: storage/src/main/java/org/zstack/storage/snapshot/group/VolumeSnapshotGroupBase.java:47-47
Timestamp: 2025-08-14T06:56:19.585Z
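Learning: In VolumeSnapshotGroupBase.java, the VmInstanceResourceMetadataManager injection and the SKIP_RESOURCE_ROLLBACK flag are unused in the current version, but these imports are intentional in a large refactoring PR, kept for code consistency or in preparation for later features.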
Applied to files:
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2024-06-10T19:31:27.994Z
Learnt from: AlanJager
Repo: MatheMatrix/zstack PR: 175
File: storage/src/main/java/org/zstack/storage/volume/VolumeBase.java:31-38
Timestamp: 2024-06-10T19:31:27.994Z
Learning: The user has clarified that the `MemorySnapshotGroupExtensionPoint` has been removed and its implementation has been moved to `VolumeSnapshotCreationExtensionPoint`.
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-25T03:52:37.301Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2504
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:775-819
Timestamp: 2025-08-25T03:52:37.301Z
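Learning: In the ZStack project's VolumeSnapshotTreeBase class, a VolumeVO obtained through dbf.findByUuid() needs a null check, because the method may return null when no matching record is found, and using the result directly can cause a NullPointerException.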
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-08-22T06:31:57.406Z
Learnt from: ZStack-Robot
Repo: MatheMatrix/zstack PR: 2489
File: storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java:179-179
Timestamp: 2025-08-22T06:31:57.406Z
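Learning: In the ZStack project's VolumeSnapshotAO class, the team decided not to add an Index annotation to the treeUuid field, even though this may affect query performance. The team prioritizes code stability over performance optimization.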
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
📚 Learning: 2025-08-14T06:48:00.549Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2435
File: header/src/main/java/org/zstack/header/storage/snapshot/group/APICheckMemorySnapshotGroupConflictMsg.java:3-7
Timestamp: 2025-08-14T06:48:00.549Z
Learning: In the ZStack project, API message classes may not require explicit imports for VolumeSnapshotGroupVO references in APIParam annotations or for the uuid() helper method used in __example__() methods, even when these classes/methods are defined in other packages. The build system or framework appears to handle these references automatically.
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-06-19T10:34:39.243Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2199
File: plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java:745-745
Timestamp: 2025-06-19T10:34:39.243Z
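Learning: In SecurityGroupApiInterceptor.java, all external string parameters from API messages (such as IP ranges, port ranges, security group UUIDs, and descriptions) should be trimmed in the validate method, so that invisible characters such as spaces, line feeds, and carriage returns copied and pasted from a browser do not affect the validation logic.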
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-06-19T10:34:39.243Z
Learnt from: MatheMatrix
Repo: MatheMatrix/zstack PR: 2199
File: plugin/securityGroup/src/main/java/org/zstack/network/securitygroup/SecurityGroupApiInterceptor.java:745-745
Timestamp: 2025-06-19T10:34:39.243Z
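Learning: SecurityGroupApiInterceptor.java needs to trim the following kinds of external string parameters: 1) IP-related fields (getAllowedCidr, getSrcIpRange, getDstIpRange); 2) port fields (getDstPortRange); 3) UUID fields (getRemoteSecurityGroupUuid and others); 4) description fields (getDescription); 5) enum string fields (getProtocol, getAction, getState, getType). These parameters should be trimmed before they reach validateIps, validatePorts, and all validate methods, so that spaces, line feeds, and other invisible characters do not affect the validation logic.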
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
📚 Learning: 2025-04-21T03:20:20.087Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2024
File: compute/src/main/java/org/zstack/compute/vm/VmInstanceApiInterceptor.java:373-388
Timestamp: 2025-04-21T03:20:20.087Z
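Learning: In ZStack, validation logic should throw `ApiMessageInterceptionException` rather than `OperationFailureException`, to keep exception handling consistent. This applies in particular to parameter validation in the VM system.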
Applied to files:
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java
🧬 Code graph analysis (2)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (1)
core/src/main/java/org/zstack/core/db/Q.java (1)
Q(16-193)
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (1)
core/src/main/java/org/zstack/core/db/Q.java (1)
Q(16-193)
🔇 Additional comments (8)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorageFactory.java (2)
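540-540: LGTM! The snapshot inventory collection is well designed
A synchronized list collects the snapshots created in parallel, which provides the context needed for rollback cleanup on failure. This matches the PR's goal of deleting the successfully created snapshots when snapshot group creation fails.
587-598: The rollback cleanup logic is correct and follows error-handling best practice
When any snapshot creation fails, the code iterates over the successfully created snapshots and sends a deletion message for each. Sending the cleanup messages fire-and-forget is reasonable because this is best-effort cleanup. The deletion message parameters (Scope=Single, Direction=Commit) match the semantics of cleaning up a single snapshot.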
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotApiInterceptor.java (2)
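34-34: Import is correct
The new Set import is used by the new validation logic, as required.
78-79: Message routing is correct
Adds a validation route for APIDeleteVolumeSnapshotGroupMsg, consistent with the existing message interception pattern.
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java (2)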
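2087-2109: The asynchronous, queue-based refactoring is well designed
Refactoring the snapshot group ungroup operation into queue-based asynchronous processing has these advantages:
- Early exit (Lines 2092-2095): returns immediately when there are no group UUIDs, avoiding unnecessary work.
- One task per group (Lines 2098-2109): each snapshot group uses its own queue task (ungroup-volumeSnapshotGroup-<uuid>), which improves concurrency.
- Consistent with the PR's overall design: together with the completeness check in the API interceptor and the rollback cleanup in primary storage, it forms complete lifecycle management for snapshot groups.
This refactoring replaces synchronous direct deletion with asynchronous per-group processing, which improves scalability and robustness.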
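The per-group queue idea can be approximated with JDK futures: tasks that share a queue name run one after another, while tasks for different groups may run in parallel. This is only a sketch under stated assumptions; PerGroupQueueSketch is a hypothetical class, it does not use ZStack's thread facade, and only the queue-name prefix is taken from the comment above.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-key serial queue; not the ZStack chain-task implementation. */
final class PerGroupQueueSketch {
    // Last submitted task per queue name. Entries are never pruned in this sketch.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    /** Runs task after every task previously submitted for the same group uuid. */
    CompletableFuture<Void> submit(String groupUuid, Runnable task) {
        String queueName = "ungroup-volumeSnapshotGroup-" + groupUuid;
        return tails.compute(queueName, (key, tail) ->
                (tail == null ? CompletableFuture.<Void>completedFuture(null) : tail)
                        // A failed predecessor must not block the rest of the queue.
                        .handle((ignored, error) -> null)
                        .thenRunAsync(task));
    }
}
```

Serializing per group means two cleanup attempts for the same group cannot interleave, which is one way to keep the "all snapshots deleted" check and the group removal from racing with each other.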
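2111-2122: The per-group cleanup logic is correct and follows best practice
The new ungroupAfterDeleted(VolumeSnapshotGroupVO groupVO, NoErrorCompletion completion) method is reasonable:
- Clear condition check (Line 2112): allMatch() determines whether every snapshot in the group has been deleted; the code is concise and easy to follow.
- Early return (Lines 2113-2116): follows the Early Return principle from the coding guidelines and returns immediately when the deletion condition is not met.
- Complete cleanup flow (Lines 2118-2121): deletes the archive metadata first and then the database record, which is a sensible order.
This method works well with the refactored queue-based calls and ensures that snapshot groups are cleaned up correctly.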
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java (2)
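745-745: The thread-safe snapshot collection list is initialized correctly
Collections.synchronizedList ensures thread safety during parallel snapshot creation.
783-783: The successful-snapshot collection logic is correct
After a snapshot is created successfully, its inventory is added to the collection list, which the cleanup on a later failure relies on.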
inventories.forEach(snapshot -> {
    VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
    msg.setSnapshotUuid(snapshot.getUuid());
    msg.setTreeUuid(snapshot.getTreeUuid());
    msg.setVolumeUuid(snapshot.getVolumeUuid());
    msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
    msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
    bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
    bus.send(msg);
});
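Suggest adding error handling and logging to the snapshot cleanup
The current cleanup logic is fire-and-forget, which carries these risks:
- There is no way to verify that a snapshot deletion succeeded
- A failed deletion can leave orphaned snapshots behind
- Without logging, the cleanup is hard to trace
Suggested improvement: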
if (!errList.getCauses().isEmpty()) {
completion.fail(errList.getCauses().get(0));
+ logger.warn(String.format("snapshot creation failed, cleaning up %d successfully created snapshots", inventories.size()));
inventories.forEach(snapshot -> {
VolumeSnapshotDeletionMsg msg = new VolumeSnapshotDeletionMsg();
msg.setSnapshotUuid(snapshot.getUuid());
msg.setTreeUuid(snapshot.getTreeUuid());
msg.setVolumeUuid(snapshot.getVolumeUuid());
msg.setScope(DeleteVolumeSnapshotScope.Single.toString());
msg.setDirection(DeleteVolumeSnapshotDirection.Commit.toString());
bus.makeTargetServiceIdByResourceUuid(msg, VolumeSnapshotConstant.SERVICE_ID, snapshot.getUuid());
- bus.send(msg);
+ bus.send(msg, new CloudBusCallBack(null) {
+ @Override
+ public void run(MessageReply reply) {
+ if (!reply.isSuccess()) {
+ logger.warn(String.format("failed to delete snapshot[uuid:%s] during cleanup: %s",
+ snapshot.getUuid(), reply.getError()));
+ } else {
+ logger.debug(String.format("successfully deleted snapshot[uuid:%s] during cleanup",
+ snapshot.getUuid()));
+ }
+ }
+ });
});
return;
}
This would:
- Record how the cleanup ran
- Capture and log deletion failures
- Improve observability and maintainability
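The same idea, best-effort deletion whose outcome is logged rather than ignored, can be sketched with JDK types only. LoggedCleanupSketch and the deleter function are hypothetical stand-ins for the bus call and its callback; the final join() exists only so the sketch can return a failure count, and real cleanup code would not need to block.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.logging.Logger;

/** Hypothetical illustration of logged, best-effort cleanup; not the CloudBus API. */
final class LoggedCleanupSketch {
    private static final Logger logger = Logger.getLogger(LoggedCleanupSketch.class.getName());

    /** Deletes each snapshot best-effort and returns how many deletions failed. */
    static int cleanUp(List<String> snapshotUuids, Function<String, CompletableFuture<Void>> deleter) {
        logger.warning(String.format("snapshot creation failed, cleaning up %d created snapshots",
                snapshotUuids.size()));
        AtomicInteger failures = new AtomicInteger();
        CompletableFuture.allOf(snapshotUuids.stream()
                .map(uuid -> deleter.apply(uuid).handle((ok, error) -> {
                    if (error != null) {
                        failures.incrementAndGet();
                        logger.warning(String.format("failed to delete snapshot[uuid:%s] during cleanup: %s",
                                uuid, error));
                    } else {
                        logger.fine(String.format("deleted snapshot[uuid:%s] during cleanup", uuid));
                    }
                    return null; // the failure is logged, not propagated
                }))
                .toArray(CompletableFuture[]::new)).join();
        return failures.get();
    }
}
```

A failed deletion is only logged here, matching the best-effort intent; whether to retry or record orphaned snapshots for later garbage collection is left open.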
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In
plugin/ceph/src/main/java/org/zstack/storage/ceph/primary/CephPrimaryStorageFactory.java
around lines 792 to 802, the snapshot deletion operation uses a fire-and-forget
approach by calling bus.send(msg) without handling responses or adding logging,
which makes it impossible to verify deletion success or diagnose failures.
Modify this code to add error handling and logging: replace the simple
bus.send(msg) call with bus.send(msg, new AbstractMessageListener() {...}) to
add a response handler that logs successful deletions and captures any errors
that occur, ensuring you also add a log statement before sending to record that
the deletion operation was initiated.
Resolves: ZSV-9792
Change-Id: I6b65736e646e7163777a7872667077756771726d
sync from gitlab !8697