Introduction to Distributed Transactions

When working with microservices, ensuring data consistency across multiple services can be a daunting task. Distributed transactions are a way to manage this complexity, but they come with their own set of challenges. In this article, we’ll delve into the world of distributed transactions, specifically focusing on the two-phase commit (2PC) mechanism in Go.

Why Distributed Transactions?

In a microservice architecture, each service might have its own database or storage system. When a transaction spans multiple services, data stays consistent only if either all of its changes are committed or none are. For example, in an e-commerce application, when a user places an order, the system needs to update both the user’s account and the order database. If one update fails, the other must be rolled back[1][3].

Two-Phase Commit (2PC)

The two-phase commit is a protocol designed to ensure atomicity in distributed transactions. Here’s a high-level overview of how it works:

Phase 1: Prepare

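In the prepare phase, the transaction coordinator asks each participant (service) to prepare: to check that it can apply its part of the transaction and to reserve whatever it needs to do so. Each participant answers with a yes or no vote. If any participant fails to prepare, the transaction is aborted.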

Phase 2: Commit or Rollback

If all participants successfully prepare, the transaction coordinator sends a commit message to each of them. If any participant fails during the prepare phase, the coordinator sends a rollback message instead, so that no participant keeps a partial change.
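
The two phases can be sketched as a small in-memory coordinator. This is only an illustration of the protocol's control flow, not DTM's implementation: the Participant interface, the account type, and RunTwoPhaseCommit are hypothetical names, and a real coordinator would also persist its decision and retry unreachable participants.

```go
package main

import (
	"errors"
	"fmt"
)

// Participant is a hypothetical interface for a service taking part in 2PC.
type Participant interface {
	Prepare() error // phase 1: vote yes (nil) or no (error)
	Commit()        // phase 2: apply the prepared change
	Rollback()      // phase 2: discard the prepared change
}

// account is a toy in-memory participant used only for illustration.
type account struct {
	name        string
	failPrepare bool
	state       string // "idle", "prepared", "committed" or "rolled back"
}

func (a *account) Prepare() error {
	if a.failPrepare {
		return errors.New(a.name + ": cannot prepare")
	}
	a.state = "prepared"
	return nil
}

func (a *account) Commit()   { a.state = "committed" }
func (a *account) Rollback() { a.state = "rolled back" }

// RunTwoPhaseCommit asks every participant to prepare and commits only if all
// of them vote yes. On the first no vote it rolls every participant back.
func RunTwoPhaseCommit(ps []Participant) error {
	for _, p := range ps {
		if err := p.Prepare(); err != nil {
			for _, q := range ps {
				q.Rollback()
			}
			return fmt.Errorf("transaction aborted: %w", err)
		}
	}
	for _, p := range ps {
		p.Commit()
	}
	return nil
}

func main() {
	out := &account{name: "TransOut", state: "idle"}
	in := &account{name: "TransIn", state: "idle", failPrepare: true}
	err := RunTwoPhaseCommit([]Participant{out, in})
	fmt.Println("result:", err)
	fmt.Println("states:", out.state, in.state)
}
```

Here TransIn votes no, so both participants end up rolled back. Clearing failPrepare lets both reach the committed state.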

Implementing 2PC in Go

To implement 2PC in Go, we can use the DTM (Distributed Transaction Manager) framework, which provides a robust way to handle distributed transactions.

Setting Up DTM

First, you need to set up the DTM framework. Here’s how you can do it:

git clone https://github.com/dtm-labs/dtm && cd dtm
go run main.go

Basic Example

Let’s consider a simple example of transferring funds between two accounts. We’ll have two services: TransOut and TransIn.

Database Schema

CREATE TABLE IF NOT EXISTS user_account (
    id INT(11) PRIMARY KEY AUTO_INCREMENT,
    user_id INT(11) UNIQUE,
    balance DECIMAL(10, 2) NOT NULL DEFAULT '0',
    trading_balance DECIMAL(10, 2) NOT NULL DEFAULT '0',
    create_time DATETIME DEFAULT NOW(),
    update_time DATETIME DEFAULT NOW(),
    KEY(create_time),
    KEY(update_time)
);

Try/Confirm/Cancel Handlers

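DTM models this flow with the TCC (Try-Confirm-Cancel) pattern, an application-level form of two-phase commit: Try plays the role of prepare, Confirm the role of commit, and Cancel the role of rollback. Each service therefore needs to define Try, Confirm, and Cancel handlers.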

// Assumes imports of context and errors, plus a package-level db *sql.DB.
func transOutTry(ctx context.Context, req *busi.BusiReq) (interface{}, error) {
    // Freeze the amount in the sender's account, but only if the unfrozen
    // balance (balance - trading_balance) covers it.
    query := "UPDATE user_account SET trading_balance = trading_balance + ? WHERE user_id = ? AND balance - trading_balance >= ?"
    res, err := db.ExecContext(ctx, query, req.Amount, req.UserID, req.Amount)
    if err != nil {
        return nil, err
    }
    // No error with zero rows updated means the account is missing or underfunded.
    if n, err := res.RowsAffected(); err != nil {
        return nil, err
    } else if n == 0 {
        return nil, errors.New("transOutTry: account not found or insufficient funds")
    }
    return nil, nil
}

func transOutConfirm(ctx context.Context, req *busi.BusiReq) (interface{}, error) {
    // Commit: deduct the frozen amount from the sender's balance and release the freeze
    query := "UPDATE user_account SET balance = balance - ?, trading_balance = trading_balance - ? WHERE user_id = ?"
    _, err := db.ExecContext(ctx, query, req.Amount, req.Amount, req.UserID)
    if err != nil {
        return nil, err
    }
    return nil, nil
}

func transOutCancel(ctx context.Context, req *busi.BusiReq) (interface{}, error) {
    // Roll back: release the freeze without touching the balance
    query := "UPDATE user_account SET trading_balance = trading_balance - ? WHERE user_id = ?"
    _, err := db.ExecContext(ctx, query, req.Amount, req.UserID)
    if err != nil {
        return nil, err
    }
    return nil, nil
}

// Similar handlers for TransIn service
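
To make the bookkeeping behind these handlers easier to follow, here is a minimal in-memory sketch of both sides of a transfer. It assumes that trading_balance holds the amount currently frozen, that Try should reject a transfer larger than the unfrozen balance, and that the TransIn side only changes the balance on Confirm. The Account type, its methods, and Transfer are hypothetical, and amounts are in cents to avoid floating-point rounding.

```go
package main

import (
	"errors"
	"fmt"
)

// Account mirrors the balance and trading_balance columns, in cents.
type Account struct {
	Balance        int64
	TradingBalance int64 // amount frozen by in-flight transfers (assumption)
}

var ErrInsufficientFunds = errors.New("insufficient funds")

// TryOut freezes amount if the unfrozen balance covers it.
func (a *Account) TryOut(amount int64) error {
	if a.Balance-a.TradingBalance < amount {
		return ErrInsufficientFunds
	}
	a.TradingBalance += amount
	return nil
}

// ConfirmOut deducts the frozen amount and releases the freeze.
func (a *Account) ConfirmOut(amount int64) {
	a.Balance -= amount
	a.TradingBalance -= amount
}

// CancelOut releases the freeze without touching the balance.
func (a *Account) CancelOut(amount int64) { a.TradingBalance -= amount }

// TryIn reserves nothing in this sketch; a real service might validate the
// recipient here. ConfirmIn credits the amount; CancelIn has nothing to undo.
func (a *Account) TryIn(amount int64) error { return nil }
func (a *Account) ConfirmIn(amount int64)   { a.Balance += amount }
func (a *Account) CancelIn(amount int64)    {}

// Transfer runs Try on both sides, then Confirm on both, or Cancel on failure.
func Transfer(from, to *Account, amount int64) error {
	if err := from.TryOut(amount); err != nil {
		return err // nothing was frozen, so there is nothing to cancel
	}
	if err := to.TryIn(amount); err != nil {
		from.CancelOut(amount)
		to.CancelIn(amount)
		return err
	}
	from.ConfirmOut(amount)
	to.ConfirmIn(amount)
	return nil
}

func main() {
	alice := &Account{Balance: 10000}
	bob := &Account{}

	err := Transfer(alice, bob, 3000)
	fmt.Println("first transfer:", err, alice.Balance, bob.Balance)

	err = Transfer(alice, bob, 8000)
	fmt.Println("second transfer:", err, alice.Balance, bob.Balance)
}
```

The second transfer asks for more than the sender holds, so Try rejects it and neither balance changes.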

Creating the 2PC Transaction

// Uses the TCC API of DTM's Go client (dtmcli) together with resty v2.
// Verify the names and import paths against the client version you use.
func main() {
    // Address of the DTM server's HTTP API; adjust it to your deployment
    const dtmServer = "http://localhost:36789/api/dtmsvr"

    // Example requests: move 30 from user 1 to user 2
    out := &busi.BusiReq{UserID: 1, Amount: 30}
    in := &busi.BusiReq{UserID: 2, Amount: 30}

    // Create a new global transaction ID
    gid := dtmcli.MustGenGid(dtmServer)

    // Run the global transaction; DTM calls Confirm on every branch if the
    // function returns without error, and Cancel otherwise
    err := dtmcli.TccGlobalTransaction(dtmServer, gid, func(tcc *dtmcli.Tcc) (*resty.Response, error) {
        // CallBranch registers the Confirm and Cancel URLs, then calls Try
        resp, err := tcc.CallBranch(out,
            "http://trans-out-service/try",
            "http://trans-out-service/confirm",
            "http://trans-out-service/cancel")
        if err != nil {
            return resp, err
        }
        return tcc.CallBranch(in,
            "http://trans-in-service/try",
            "http://trans-in-service/confirm",
            "http://trans-in-service/cancel")
    })
    if err != nil {
        log.Fatal(err)
    }
}

Handling Network Exceptions

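One of the critical aspects of distributed transactions is handling network exceptions. When a request times out, the coordinator retries it, so a handler may run more than once, and a Cancel may even arrive before the Try it is meant to undo. DTM provides a BranchBarrier utility that makes handlers idempotent and guards against these out-of-order calls.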

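// Simplified view of BranchBarrier.Call, not the library's actual body.
// The real implementation first records the branch in a barrier table inside
// tx and skips busiCall for requests it has already handled, so the business
// logic inside busiCall takes effect at most once.
func (bb *BranchBarrier) Call(tx *sql.Tx, busiCall BarrierBusiFunc) error {
    return busiCall(tx)
}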
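
The idea behind such a barrier can be sketched without a database. The version below is a hypothetical in-memory stand-in, not DTM's code: it remembers each (transaction ID, branch ID, operation) key and runs the handler only the first time the key succeeds. A real barrier would record the key in the same local database transaction as the business update, so that the record and the update commit together.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrTransient simulates a temporary failure inside a handler.
var ErrTransient = errors.New("transient failure")

// Barrier is a hypothetical in-memory stand-in for a barrier table.
type Barrier struct {
	mu   sync.Mutex
	seen map[string]bool
}

func NewBarrier() *Barrier { return &Barrier{seen: make(map[string]bool)} }

// Call runs fn unless the same (gid, branchID, op) has already succeeded.
// A failed fn is not recorded, so a later retry runs it again.
func (b *Barrier) Call(gid, branchID, op string, fn func() error) error {
	key := gid + "/" + branchID + "/" + op
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.seen[key] {
		return nil // duplicate delivery: already handled
	}
	if err := fn(); err != nil {
		return err
	}
	b.seen[key] = true
	return nil
}

func main() {
	b := NewBarrier()
	attempts := 0
	flaky := func() error {
		attempts++
		if attempts == 1 {
			return ErrTransient // the first delivery fails
		}
		return nil
	}
	// Simulate the coordinator delivering the same Confirm three times.
	for i := 0; i < 3; i++ {
		if err := b.Call("gid-1", "01", "confirm", flaky); err != nil {
			fmt.Println("will retry after:", err)
		}
	}
	fmt.Println("handler executions:", attempts)
}
```

The mutex is held while the handler runs, which keeps the sketch short but serializes all calls. With a database, a unique key on the barrier table would do this job instead.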

Rollback Mechanism

If any part of the transaction fails, DTM will automatically invoke the cancel operations to roll back the transaction.

sequenceDiagram
    participant Coordinator as Transaction Coordinator
    participant ServiceA as TransOut Service
    participant ServiceB as TransIn Service
    Note over Coordinator,ServiceB: Start Transaction
    Coordinator->>ServiceA: Prepare (transOutTry)
    ServiceA->>Coordinator: Prepared
    Coordinator->>ServiceB: Prepare (transInTry)
    ServiceB->>Coordinator: Prepared
    Note over Coordinator,ServiceB: All services prepared successfully
    Coordinator->>ServiceA: Commit (transOutConfirm)
    ServiceA->>Coordinator: Committed
    Coordinator->>ServiceB: Commit (transInConfirm)
    ServiceB->>Coordinator: Committed
    Note over Coordinator,ServiceB: Transaction committed successfully
    alt Failure during prepare phase
        Coordinator->>ServiceA: Prepare (transOutTry)
        ServiceA->>Coordinator: Prepared
        Coordinator->>ServiceB: Prepare (transInTry)
        ServiceB->>Coordinator: Failed
        Note over Coordinator,ServiceB: One service failed to prepare
        Coordinator->>ServiceA: Rollback (transOutCancel)
        ServiceA->>Coordinator: Rolled back
        Coordinator->>ServiceB: Rollback (transInCancel)
        ServiceB->>Coordinator: Rolled back
        Note over Coordinator,ServiceB: Transaction rolled back successfully
    end

Best Practices and Alternatives

Avoiding Distributed Transactions When Possible

While distributed transactions are powerful, they can add significant complexity to your system. If possible, it’s better to avoid them by ensuring that your microservice boundaries are correctly defined and that each service manages its own consistency[1].

Embracing Eventual Consistency

In some cases, eventual consistency can be a simpler and more scalable solution. This involves using message queues and event sourcing to ensure that data is eventually consistent across services[1].

Using the Outbox Pattern

The outbox pattern is another approach to ensure consistency. It involves storing events in the same database transaction as the main data and then publishing these events asynchronously. This ensures that the event and the stored data are always consistent[1].
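
A minimal in-memory sketch of the outbox pattern follows. Everything in it is an assumption made for illustration: Store stands in for a database, its mutex stands in for a local transaction, and the publish callback stands in for a message broker client.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrBrokerDown simulates an unreachable message broker.
var ErrBrokerDown = errors.New("broker unavailable")

// OutboxEvent is one row of the hypothetical outbox table.
type OutboxEvent struct {
	ID        int
	Payload   string
	Published bool
}

// Store is a toy database; holding mu stands in for a local transaction.
type Store struct {
	mu     sync.Mutex
	orders []string
	outbox []OutboxEvent
}

// PlaceOrder writes the order and its event in one "transaction", so neither
// can exist without the other.
func (s *Store) PlaceOrder(order string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.orders = append(s.orders, order)
	s.outbox = append(s.outbox, OutboxEvent{
		ID:      len(s.outbox) + 1,
		Payload: "order_placed:" + order,
	})
}

// Relay publishes unpublished events in order and marks each one published
// only after publish succeeds. It returns how many events were sent.
func (s *Store) Relay(publish func(OutboxEvent) error) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	sent := 0
	for i := range s.outbox {
		if s.outbox[i].Published {
			continue
		}
		if err := publish(s.outbox[i]); err != nil {
			break // stop here and retry later, preserving event order
		}
		s.outbox[i].Published = true
		sent++
	}
	return sent
}

func main() {
	s := &Store{}
	s.PlaceOrder("A-1")
	s.PlaceOrder("A-2")

	down := func(OutboxEvent) error { return ErrBrokerDown }
	fmt.Println("sent while broker is down:", s.Relay(down))

	up := func(e OutboxEvent) error {
		fmt.Println("publish", e.ID, e.Payload)
		return nil
	}
	fmt.Println("sent after recovery:", s.Relay(up))
}
```

Because an event is marked published only after the broker accepts it, a crash between publishing and marking leads to a repeated delivery rather than a lost event, so consumers should be idempotent.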

Conclusion

Distributed transactions are a complex but sometimes unavoidable aspect of microservice architecture. By using frameworks like DTM, handling network exceptions carefully, and considering alternatives like eventual consistency and the outbox pattern, you can keep data consistent across multiple services.

Remember, distributed transactions are not a silver bullet, but with the right tools and mindset they can help you build robust and scalable systems. So, the next time you’re faced with the challenge of managing data across multiple services, take a deep breath, grab your favorite coffee, and dive into the world of distributed transactions with confidence.