Idempotent Workflow Bug: TransferMoney Function May Cause Duplicate Transfers #16
Description
The current implementation of the TransferMoney function may result in duplicate transfers when a software, hardware, or network error occurs during its execution. Each time the function is invoked by an Event Hub message, it writes two records to CosmosDB with debitFrom and creditTo information, using newly generated GUIDs as primary keys (TransferId). If there's a failure after a record is persisted to CosmosDB, the function will be retried, creating new debitFrom and creditTo records with distinct primary keys and causing the money in the "debitFrom" account to be deducted twice.
Steps to reproduce:
- Invoke the TransferMoney function using an Event Hub message.
- The function writes the debitFrom record to CosmosDB.
- An error occurs after the record is persisted to CosmosDB (this can be simulated with an IDE breakpoint or by throwing an exception).
- Because the event was never checkpointed, it is delivered again and the TransferMoney function reruns from the beginning.
- The second execution creates new debitFrom and creditTo records with different primary keys and inserts them into CosmosDB.
Expected behavior:
When the TransferMoney function is retried, it should not create duplicate transfers, which cause unintended deductions from or credits to the accounts involved.
Actual behavior:
If a failure after the first AddAsync call causes the function host to crash and restart, the money has already been deducted from the "from" account. When TransferMoney is retried, it deducts the money from the "from" account a second time and credits the "to" account for the first time. If the failure happens after both AddAsync calls, the retry deducts the money twice from the "from" account and credits it twice to the "to" account.
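The two failure cases above can be reproduced with a small in-memory simulation. This is only a sketch, not the repository's code: `FakeContainer`, `HostCrash`, `transfer_money_random_ids`, and `run_with_one_retry` are hypothetical names, and a Python dict stands in for the CosmosDB container.

```python
import uuid


class FakeContainer:
    """Hypothetical in-memory stand-in for a CosmosDB container, keyed by record id."""

    def __init__(self):
        self.records = {}

    def upsert(self, record):
        # Insert the record, or overwrite an existing record with the same id.
        self.records[record["id"]] = record


class HostCrash(Exception):
    """Simulates the function host failing in the middle of an execution."""


def transfer_money_random_ids(container, transfer, crash_after=None):
    """Mimics the reported flow: every invocation generates fresh GUID keys."""
    container.upsert({"id": str(uuid.uuid4()),
                      "account": transfer["from"],
                      "amount": -transfer["amount"]})
    if crash_after == 1:
        raise HostCrash("failed after the debit write")
    container.upsert({"id": str(uuid.uuid4()),
                      "account": transfer["to"],
                      "amount": transfer["amount"]})
    if crash_after == 2:
        raise HostCrash("failed after both writes")


def balance_change(container, account):
    """Net amount applied to an account across all stored records."""
    return sum(r["amount"] for r in container.records.values()
               if r["account"] == account)


def run_with_one_retry(crash_after):
    """Runs one transfer of 100 from A to B that crashes once and is retried."""
    container = FakeContainer()
    transfer = {"from": "A", "to": "B", "amount": 100}
    try:
        transfer_money_random_ids(container, transfer, crash_after=crash_after)
    except HostCrash:
        # The event was not checkpointed, so the same event is processed again.
        transfer_money_random_ids(container, transfer)
    return balance_change(container, "A"), balance_change(container, "B")
```

With a crash after the first write, `run_with_one_retry(1)` leaves account A debited twice and account B credited once; with a crash after both writes, `run_with_one_retry(2)` doubles both sides.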
Suggested fix:
Make the primary keys of the two records deterministic across retries. For example, use transactionId + transaction.AccountFromId as the primary key of the debitFrom record, and use transactionId + transaction.toAccountId as the primary key of the creditTo record. A retry then writes the same two keys again and updates the existing records instead of adding new ones, so no matter how many times AddAsync is called, only two records will exist in CosmosDB and the transfer is applied once.
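The effect of deterministic keys can be sketched with the same kind of in-memory simulation. Everything here is illustrative rather than the project's implementation: `FakeContainer`, `HostCrash`, `transfer_money_deterministic_ids`, and `run_with_one_retry` are hypothetical names, the dict models a store where writing an existing key overwrites the record, and the `:` separator in the key is an assumption added to keep the concatenation unambiguous.

```python
class FakeContainer:
    """Hypothetical in-memory stand-in for a CosmosDB container, keyed by record id."""

    def __init__(self):
        self.records = {}

    def upsert(self, record):
        # Writing an id that already exists replaces the stored record.
        self.records[record["id"]] = record


class HostCrash(Exception):
    """Simulates the function host failing in the middle of an execution."""


def transfer_money_deterministic_ids(container, transfer, crash_after=None):
    """Derives both keys from the transaction id, so a retry reuses them."""
    # Assumes the transaction id arrives with the message and is stable across
    # retries, and that the "from" and "to" accounts differ.
    debit_id = f'{transfer["transaction_id"]}:{transfer["from"]}'
    credit_id = f'{transfer["transaction_id"]}:{transfer["to"]}'
    container.upsert({"id": debit_id,
                      "account": transfer["from"],
                      "amount": -transfer["amount"]})
    if crash_after == 1:
        raise HostCrash("failed after the debit write")
    container.upsert({"id": credit_id,
                      "account": transfer["to"],
                      "amount": transfer["amount"]})
    if crash_after == 2:
        raise HostCrash("failed after both writes")


def run_with_one_retry(crash_after):
    """Runs one transfer of 100 from A to B that crashes once and is retried."""
    container = FakeContainer()
    transfer = {"transaction_id": "tx-1", "from": "A", "to": "B", "amount": 100}
    try:
        transfer_money_deterministic_ids(container, transfer, crash_after=crash_after)
    except HostCrash:
        # The retry processes the same event, so it produces the same two keys.
        transfer_money_deterministic_ids(container, transfer)
    totals = {}
    for record in container.records.values():
        totals[record["account"]] = totals.get(record["account"], 0) + record["amount"]
    return len(container.records), totals
```

In this model the retry overwrites the records written by the failed attempt, so the store ends with exactly two records whichever write the crash follows.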
Note: CosmosDBAsyncCollector.AddAsync performs the add/update as soon as it is called; it does not wait until the user code finishes. Even if the two records were instead written after the user code exits, a crash of the function host in the small window before the checkpoint recording a successful function call is written to Event Hub would still trigger a retry, adding two more records to CosmosDB.