Building an A/B Testing System in Go: A Practical Guide to Smart Experimentation
So you want to run A/B tests in Go. Good news: it’s not rocket science. Bad news: it’s also not as simple as flipping a switch. But here’s the thing—once you’ve got it working, you’ll have superpowers to validate your ideas with real data instead of gut feelings. And that’s when things get interesting.

In this article, we’ll build a complete A/B testing system from scratch. We’ll cover everything from setting up a basic Go application to deploying production-ready experiments that actually tell you something useful. No fluff, no hand-waving, just solid engineering.
Why A/B Testing Matters (And Why Go is Perfect for It)
Before we dive into code, let’s talk about why A/B testing isn’t just a buzzword thrown around by growth marketers. A/B testing lets you make data-driven decisions instead of gambling with your user experience. You can test which CTA button color actually converts better, whether users prefer longer or shorter form content, or whether that new algorithm actually improves performance.

And Go? Go is a fantastic choice for this. It’s fast, concurrent, easy to deploy, and has excellent libraries for this kind of work. Plus, you can run your experimentation infrastructure on a potato’s worth of hardware compared to what other languages need. That matters when you’re scaling.
Architecture Overview: How It All Fits Together
Before we write a single line of code, let’s walk through how an A/B testing system actually works. A user comes in, we check whether they’re in an experiment, we serve them the appropriate variant, and then we track what happens. Everything feeds into your analytics backend, which eventually tells you whether your new idea actually works. The flow is simple but powerful.
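In Go terms, the skeleton looks roughly like this. It’s a sketch only: identifyUser, getVariant, and trackExposure are hypothetical stand-ins that the steps below replace with real implementations.

package main

import (
	"fmt"
	"net/http"
)

// Placeholder stubs; the steps below back these with PostHog.
func identifyUser(r *http.Request) string  { return "user-123" }
func getVariant(userID string) string      { return "control" }
func trackExposure(userID, variant string) {}

func handleRequest(w http.ResponseWriter, r *http.Request) {
	userID := identifyUser(r)              // 1. who is this?
	variant := getVariant(userID)          // 2. which experiment arm are they in?
	trackExposure(userID, variant)         // 3. record that they saw it
	fmt.Fprintf(w, "variant: %s", variant) // 4. serve them the right experience
}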
Prerequisites: What You’ll Need
Before we start, make sure you have:
- Go 1.19 or later (honestly, 1.20+ is better)
- A code editor (VSCode, GoLand, Vim—whatever keeps you sane)
- Basic understanding of HTTP and REST APIs
- A PostHog account (free tier is fine to start)
- 30 minutes and a decent coffee
Step 1: Create Your Foundation Go App
Let’s start with the simplest possible Go application that serves a web page. This is our launching point. First, create a new directory and initialize your Go module:
mkdir go-ab-tests
cd go-ab-tests
go mod init go-ab-tests
Now, create your main.go file with a basic HTTP server:
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/", handleHome)
	fmt.Println("Server starting on http://localhost:8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func handleHome(w http.ResponseWriter, r *http.Request) {
	html := `
<!DOCTYPE html>
<html>
<head>
    <title>A/B Testing Demo</title>
    <style>
        body { font-family: Arial, sans-serif; margin: 40px; }
        .container { max-width: 600px; margin: 0 auto; }
    </style>
</head>
<body>
    <div class="container">
        <h1>Welcome to A/B Testing</h1>
        <p id="variant-text">Loading your experience...</p>
    </div>
</body>
</html>
`
	w.Header().Set("Content-Type", "text/html")
	fmt.Fprint(w, html)
}
Run it with go run main.go and visit http://localhost:8080. You should see a basic page. Congratulations—you’re already 10% of the way there.
Step 2: Integrating PostHog: Your A/B Testing Brain
PostHog is where the magic happens. It manages your feature flags, tracks events, and does the statistical heavy lifting. Let’s integrate it into our Go app. First, install the PostHog SDK:
go get github.com/posthog/posthog-go
Now, update your main.go to initialize PostHog:
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/posthog/posthog-go"
)

var client posthog.Client

func init() {
	var err error
	client, err = posthog.NewWithConfig(
		"<YOUR_POSTHOG_API_KEY>", // in production, read this from an environment variable
		posthog.Config{
			Endpoint: "https://app.posthog.com", // or your self-hosted URL
		},
	)
	if err != nil {
		log.Fatal("Failed to initialize PostHog:", err)
	}
}

func main() {
	defer client.Close()
	http.HandleFunc("/", handleHome)
	fmt.Println("Server starting on http://localhost:8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func handleHome(w http.ResponseWriter, r *http.Request) {
	// Capture a pageview event
	client.Enqueue(posthog.Capture{
		DistinctId: "placeholder-user-id",
		Event:      "$pageview",
		Properties: posthog.NewProperties(),
	})

	html := `
<!DOCTYPE html>
<html>
<head>
    <title>A/B Testing Demo</title>
    <style>
        body { font-family: Arial, sans-serif; margin: 40px; }
        .container { max-width: 600px; margin: 0 auto; }
        .variant-box {
            border: 2px solid #007bff;
            padding: 20px;
            border-radius: 8px;
            margin-top: 20px;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Welcome to A/B Testing</h1>
        <div class="variant-box">
            <p id="variant-text">Loading your experience...</p>
        </div>
    </div>
</body>
</html>
`
	w.Header().Set("Content-Type", "text/html")
	fmt.Fprint(w, html)
}
Important: Get your project API key from your PostHog project settings, and create a personal API key as well. We’ll need both: the project key for capturing events, and the personal key for local feature flag evaluation (sketched below).
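On that second point: the Go SDK’s posthog.Config accepts a PersonalApiKey, which lets the client evaluate flags locally instead of making a network round trip per check. Swapping it into the init() above would look like this (placeholder keys, of course):

// Sketch: enabling local feature flag evaluation with a personal API key.
client, err = posthog.NewWithConfig(
	"<YOUR_POSTHOG_API_KEY>", // project API key, used for capturing events
	posthog.Config{
		Endpoint:       "https://app.posthog.com",
		PersonalApiKey: "<YOUR_PERSONAL_API_KEY>", // enables local flag evaluation
	},
)
if err != nil {
	log.Fatal("Failed to initialize PostHog:", err)
}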
Step 3: Creating Your First A/B Test in PostHog
Now here’s where things get real. You need to create the experiment in PostHog’s dashboard:
- Log into your PostHog instance
- Navigate to the A/B testing tab
- Click New experiment
- Fill in the details:
  - Name: “Homepage Button Test”
  - Feature flag key: homepage-button-test
  - Leave other fields as defaults
- Click Save as draft
- Set the primary metric to a trend of $pageview
- Click Launch

You’ve just created your first experiment. Yes, it’s that straightforward. PostHog handles the heavy lifting of traffic allocation, statistical significance, and result analysis.
Step 4: Implementing Feature Flags in Your Go Code
This is where the actual A/B testing logic lives. We’re going to:
- Fetch the feature flag for a user
- Determine which variant they see
- Serve the appropriate experience

Here’s the updated code:
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/posthog/posthog-go"
)

var client posthog.Client

func init() {
	var err error
	client, err = posthog.NewWithConfig(
		"<YOUR_POSTHOG_API_KEY>",
		posthog.Config{
			Endpoint: "https://app.posthog.com",
		},
	)
	if err != nil {
		log.Fatal("Failed to initialize PostHog:", err)
	}
}

func main() {
	defer client.Close()
	http.HandleFunc("/", handleHome)
	fmt.Println("Server starting on http://localhost:8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

func handleHome(w http.ResponseWriter, r *http.Request) {
	distinctID := "placeholder-user-id" // In production, use an actual user ID

	// Fetch the feature flag. GetFeatureFlag returns an interface{} value:
	// a bool for simple on/off flags, or the variant key as a string for
	// experiments (PostHog's default variant keys are "control" and "test").
	flagValue, err := client.GetFeatureFlag(posthog.FeatureFlagPayload{
		Key:        "homepage-button-test",
		DistinctId: distinctID,
	})
	if err != nil {
		log.Printf("Error fetching feature flag: %v", err)
		flagValue = false // fall back to control
	}
	isEnabled := flagValue == true || flagValue == "test"

	// Determine variant text
	group := "control"
	variantText := "Control: Click here!"
	if isEnabled {
		group = "test"
		variantText = "Test: Try this new button!"
	}

	// Capture the pageview with the feature flag info attached
	client.Enqueue(posthog.Capture{
		DistinctId: distinctID,
		Event:      "$pageview",
		Properties: posthog.NewProperties().
			Set("$feature/homepage-button-test", flagValue),
	})

	html := fmt.Sprintf(`
<!DOCTYPE html>
<html>
<head>
    <title>A/B Testing Demo</title>
    <style>
        body { font-family: Arial, sans-serif; margin: 40px; }
        .container { max-width: 600px; margin: 0 auto; }
        .variant-box {
            border: 2px solid #007bff;
            padding: 20px;
            border-radius: 8px;
            margin-top: 20px;
        }
        .button {
            display: inline-block;
            padding: 12px 24px;
            background-color: #007bff;
            color: white;
            text-decoration: none;
            border-radius: 4px;
            cursor: pointer;
            font-weight: bold;
        }
        .button:hover { background-color: #0056b3; }
    </style>
</head>
<body>
    <div class="container">
        <h1>Welcome to A/B Testing</h1>
        <div class="variant-box">
            <p>You're in the %s group</p>
            <button class="button">%s</button>
        </div>
    </div>
</body>
</html>
`, group, variantText)

	w.Header().Set("Content-Type", "text/html")
	fmt.Fprint(w, html)
}
Notice that crucial line where we add the feature flag to event properties? That tells PostHog which variant produced which event. This is how it calculates whether your test variant actually wins.
Step 5: Handling User Identity (The Tricky Part)
Here’s something that trips up a lot of people: the distinctID should actually represent your user, not be a placeholder. A stable ID is what keeps a user in the same variant across requests. In production, you’d want something like this (note that the function takes the ResponseWriter too, so it can set a cookie for anonymous users):
func getDistinctID(w http.ResponseWriter, r *http.Request) string {
	// Try to get the user ID from the session/auth context
	if userID, ok := r.Context().Value("user_id").(string); ok {
		return userID
	}
	// For logged-out users, use a cookie-based identifier
	if cookie, err := r.Cookie("user_identifier"); err == nil {
		return cookie.Value
	}
	// Generate a new one if nothing exists
	newID := generateUserID()
	http.SetCookie(w, &http.Cookie{
		Name:   "user_identifier",
		Value:  newID,
		Path:   "/",
		MaxAge: 365 * 24 * 60 * 60, // 1 year
	})
	return newID
}

func generateUserID() string {
	// Use a UUID library or crypto/rand in production; a timestamp is the
	// simplest thing that produces a unique-enough ID for a demo.
	return fmt.Sprintf("user_%d", time.Now().UnixNano())
}
This ensures consistent experiment assignment for each user—critical for valid results.
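Wiring it into the handler is then a one-line change. A minimal sketch, assuming the handleHome from Step 4:

func handleHome(w http.ResponseWriter, r *http.Request) {
	// Cookie-backed and stable per user, so assignment stays consistent
	distinctID := getDistinctID(w, r)
	// ... fetch flags and capture events with distinctID as before
}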
Step 6: Scaling Your Experiments: Best Practices
As your A/B testing program grows, keep these things in mind:
- Test one variable at a time: Don’t change five things and expect to know which one worked. Isolate your changes.
- Use proper sample sizes: Don’t run a test on 10 users and call it significant. Use a sample size calculator (or the sketch after this list) to determine how many users you actually need.
- Run your experiments for complete cycles: If your business has weekly patterns, run tests for at least two weeks. Monthly patterns? You know what to do.
- Validate your implementation: Before trusting results, run an A/A test (same variant for both groups) to ensure your system isn’t introducing bias.
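If you’d rather compute sample size than trust a random web calculator, here’s a minimal sketch using the standard normal-approximation formula for comparing two proportions. The function name is ours, and the z-scores are hardcoded for a two-sided alpha of 0.05 and 80% power:

package main

import (
	"fmt"
	"math"
)

// sampleSizePerVariant returns the users needed in each arm to detect a
// change from baselineRate to expectedRate conversion.
func sampleSizePerVariant(baselineRate, expectedRate float64) int {
	const zAlpha = 1.96 // two-sided 95% confidence
	const zBeta = 0.84  // 80% power
	variance := baselineRate*(1-baselineRate) + expectedRate*(1-expectedRate)
	delta := expectedRate - baselineRate
	n := math.Pow(zAlpha+zBeta, 2) * variance / (delta * delta)
	return int(math.Ceil(n))
}

func main() {
	// Detecting a lift from 5% to 6% conversion:
	fmt.Println(sampleSizePerVariant(0.05, 0.06)) // ≈ 8,146 users per variant
}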
Step 7: Handling Edge Cases and Production Considerations
Real-world A/B testing isn’t always smooth. Here are some gotchas.

Feature flag caching: Fetching flags on every request is fine for development but can be slow at scale. Implement caching:
// A TTL cache for flag values; assumes "sync" and "time" are imported
// alongside the global PostHog client from earlier.
type FlagCache struct {
	flags map[string]cachedFlag
	ttl   time.Duration
	mu    sync.RWMutex
}

type cachedFlag struct {
	value     bool
	fetchedAt time.Time
}

func (fc *FlagCache) GetFlag(flagKey, distinctID string) (bool, error) {
	// Key the cache by (flag, user): flag values differ per user, so caching
	// by flag key alone would serve everyone the first user's variant.
	key := flagKey + ":" + distinctID

	fc.mu.RLock()
	entry, ok := fc.flags[key]
	fc.mu.RUnlock()
	if ok && time.Since(entry.fetchedAt) < fc.ttl {
		return entry.value, nil
	}

	// Fetch and cache
	flagValue, err := client.GetFeatureFlag(posthog.FeatureFlagPayload{
		Key:        flagKey,
		DistinctId: distinctID,
	})
	val, _ := flagValue.(bool)
	if err == nil {
		fc.mu.Lock()
		fc.flags[key] = cachedFlag{value: val, fetchedAt: time.Now()}
		fc.mu.Unlock()
	}
	return val, err
}
Ensuring statistical validity: Not all wins are real wins. A difference of 1% when you only ran 100 users? Not significant. Use confidence intervals and statistical power calculations; a quick sanity-check sketch follows below.

The “flickering” problem: If you’re doing a client-side implementation, users might see the control first, then the variant loads in. Prevent this by deciding the variant server-side (which we’re doing here—good job, us).
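To make “use confidence intervals” concrete, here’s a minimal two-proportion z-test sketch. The helper name is our own, and this is a back-of-the-envelope check, not a replacement for PostHog’s built-in analysis:

package main

import (
	"fmt"
	"math"
)

// zScore returns the z statistic for the difference between two
// conversion rates, using the pooled standard error.
func zScore(convA, totalA, convB, totalB float64) float64 {
	pA := convA / totalA
	pB := convB / totalB
	pooled := (convA + convB) / (totalA + totalB)
	se := math.Sqrt(pooled * (1 - pooled) * (1/totalA + 1/totalB))
	return (pB - pA) / se
}

func main() {
	z := zScore(480, 10000, 550, 10000) // 4.8% vs 5.5% conversion
	fmt.Printf("z = %.2f\n", z)         // |z| > 1.96 means significant at p < 0.05
}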
Real Example: Multi-Variant Testing
Want to get fancy? Test more than two variants:
func handleHome(w http.ResponseWriter, r *http.Request) {
	distinctID := getDistinctID(w, r)

	// For multivariate flags, GetFeatureFlag returns the variant key as a string
	flagValue, err := client.GetFeatureFlag(posthog.FeatureFlagPayload{
		Key:        "checkout-flow-test",
		DistinctId: distinctID,
	})
	variant, _ := flagValue.(string)
	if err != nil {
		variant = "" // fall back to control on error
	}

	var checkoutFlow string
	switch variant {
	case "variant-simple":
		checkoutFlow = renderSimpleCheckout()
	case "variant-detailed":
		checkoutFlow = renderDetailedCheckout()
	default: // control
		checkoutFlow = renderDefaultCheckout()
	}

	client.Enqueue(posthog.Capture{
		DistinctId: distinctID,
		Event:      "$pageview",
		Properties: posthog.NewProperties().
			Set("$feature/checkout-flow-test", variant),
	})

	fmt.Fprint(w, checkoutFlow) // render the chosen flow
}
This pattern lets you test three or four variants simultaneously without the statistical complexity that comes with pure multivariate testing.
Deployment Considerations
When you deploy this to production:
- Use environment variables for your PostHog API key (never commit credentials)
- Add proper logging so you can debug flag delivery
- Implement circuit breakers so a slow PostHog API doesn’t slow down your site
- Monitor flag performance with metrics on how long flag fetches take
- Add graceful degradation—if PostHog is down, you should still serve your site (preferring the control variant)

Here’s a sketch of that last pattern:
// Flag fetching with a hard timeout; assumes "context" and "time" are
// imported alongside the PostHog SDK.
func getFeatureFlagWithFallback(client posthog.Client, flag, distinctID string) bool {
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()

	done := make(chan bool, 1) // buffered so the goroutine can exit after a timeout
	go func() {
		flagValue, err := client.GetFeatureFlag(posthog.FeatureFlagPayload{
			Key:        flag,
			DistinctId: distinctID,
		})
		enabled, ok := flagValue.(bool)
		if err != nil || !ok {
			done <- false // fallback to control
			return
		}
		done <- enabled
	}()

	select {
	case result := <-done:
		return result
	case <-ctx.Done():
		return false // timeout, fallback to control
	}
}
This pattern ensures your site stays fast even if external dependencies get cranky.
Analyzing Results: Making It Count
After your experiment runs for the appropriate time, here’s what you’re looking for:
- Statistical significance: PostHog will tell you, but generally you want p < 0.05
- Practical significance: A 0.1% improvement might be statistically significant but not worth rolling out
- Effect size: Is the difference meaningful to your business?
- Edge effects: Did anything weird happen in specific user segments?

Don’t just trust the dashboard numbers. Ask questions. Did something break during the test period? Did a major traffic source behave differently? Did you ship other features that might have confounded results?
The Workflow: From Hypothesis to Production
Here’s the complete A/B testing workflow in practice:
- Form hypothesis: “Adding more product images increases conversion”
- Implement variant: Code your change
- Create experiment: Set it up in PostHog
- Run test: Let it collect data until you reach your pre-computed sample size
- Analyze results: Check PostHog dashboard
- Decide: Roll out winner, discard loser, or test further
- Monitor: Watch for unexpected side effects post-rollout

This cycle might take weeks, but the insights are gold. You learn what actually works for your users, not what you think might work.
Common Mistakes to Avoid
- Peeking at results before statistical significance: Stop yourself. This introduces bias. Let the test run.
- Testing too many things simultaneously: Each additional variant reduces your statistical power. Stick to 2-3.
- Using biased samples: “Test this feature only on our best customers” is asking the wrong question. Test on representative traffic.
- Ignoring qualitative feedback: If a variant is statistically winning but users hate it, the metrics might not tell the whole story. Talk to users.
- Not documenting decisions: Future you will thank present you for explaining why you launched or killed a test.
Monitoring and Maintenance
Once you’re running experiments regularly, set up monitoring:
import "github.com/prometheus/client_golang/prometheus"
var (
flagFetchDuration = prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Name: "flag_fetch_duration_seconds",
},
[]string{"flag_name"},
)
flagFetchErrors = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "flag_fetch_errors_total",
},
[]string{"flag_name", "error_type"},
)
)
func getFeatureFlagWithMetrics(flagKey, distinctID string) bool {
start := time.Now()
defer func() {
duration := time.Since(start).Seconds()
flagFetchDuration.WithLabelValues(flagKey).Observe(duration)
}()
result, err := client.GetFeatureFlag(flagKey, distinctID, false)
if err != nil {
flagFetchErrors.WithLabelValues(flagKey, "fetch_error").Inc()
return false
}
return result
}
Track these metrics in your observability platform. Slow flag fetches or frequent errors indicate systemic problems.
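For these metrics to be scraped at all, you also need to expose them. A minimal sketch using the standard promhttp handler, assuming a dedicated metrics port of 9090:

package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
	// Expose all registered collectors for Prometheus to scrape
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":9090", nil))
}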
Conclusion: You’re Ready
You now have a complete, production-ready A/B testing system in Go. From serving different variants to capturing the right events to analyzing results, you’ve got the full stack. The power of A/B testing isn’t in the technology—it’s in what you learn. Every test teaches you something about your users. Some experiments will surprise you. Some will confirm your assumptions. Some will completely flip your thinking on its head. That’s when you know you’re doing it right. Now go test something. The data awaits.
