Service Logging Best Practices: A Complete Guide to Production-Ready Logging

Comprehensive guide to implementing effective logging strategies for modern applications, covering structured logging, centralized systems, cloud solutions, and monitoring best practices.

Jongmin Lee
17 min read

Introduction

Logging is the backbone of modern application observability. When done right, it transforms debugging from guesswork into systematic problem-solving. When done wrong, it becomes noise that obscures real issues and wastes resources.

After years of managing production systems processing millions of requests, I’ve seen how proper logging can reduce incident resolution time from hours to minutes. This guide focuses on practical strategies that actually work in production environments.

Why Logging Strategy Matters

The Real Cost of Poor Logging

  • Debugging Time: Poor logs can turn a 10-minute fix into a 3-hour investigation
  • Incident Response: Without proper context, teams waste time gathering information instead of fixing issues
  • Compliance Risk: Missing audit trails can result in regulatory violations
  • Resource Waste: Excessive logging can consume significant infrastructure costs

What Good Logging Delivers

  • Rapid Problem Resolution: Clear context helps identify root causes quickly
  • Proactive Issue Detection: Patterns in logs reveal problems before they impact users
  • Business Intelligence: User behavior insights from application events
  • Operational Confidence: Teams can deploy and operate systems with visibility

Essential Logging Concepts

Log Levels: Your Information Hierarchy

Think of log levels as a filtering system that helps you find the right information at the right time:

ERROR - Something broke and needs immediate attention

  • Payment processing failures
  • Database connection errors
  • Unhandled exceptions

WARN - Something unusual happened but the system continues

  • API rate limits approaching
  • Retry attempts
  • Deprecated feature usage

INFO - Important business events worth tracking

  • User logins and logouts
  • Order completions
  • System startup/shutdown

DEBUG - Technical details for troubleshooting

  • Cache hits/misses
  • SQL query execution
  • Method entry/exit (use sparingly)
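
To make the hierarchy concrete, here is a minimal sketch of how those four levels map onto `ILogger` calls in ASP.NET Core (the service and parameter names are hypothetical):

using Microsoft.Extensions.Logging;

public class CheckoutService
{
    private readonly ILogger<CheckoutService> _logger;
    
    public CheckoutService(ILogger<CheckoutService> logger) => _logger = logger;
    
    public void IllustrateLevels(string orderId, int attempt)
    {
        _logger.LogError("Payment processing failed for {OrderId}", orderId);          // broke, act now
        _logger.LogWarning("Retry attempt {Attempt} for {OrderId}", attempt, orderId); // unusual, still running
        _logger.LogInformation("Order {OrderId} completed", orderId);                  // business event
        _logger.LogDebug("Cache miss while pricing {OrderId}", orderId);               // troubleshooting detail
    }
}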

Structured vs Unstructured Logging

The difference between good and great logging often comes down to structure:

Unstructured (Hard to Query):

"User john@example.com ordered 3 items totaling $99.99"

Structured (Easy to Query):

{
  "message": "Order completed",
  "userId": "john@example.com",
  "itemCount": 3,
  "total": 99.99,
  "timestamp": "2024-12-20T10:30:15Z"
}

With structured logs, you can easily answer questions like:

  • “Show me all orders over $50 from the last hour”
  • “Which users are placing the most orders?”
  • “What’s our average order value by day?”
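
With Serilog (used in the examples below), the structured version is a single call; each named placeholder becomes a queryable field:

using Serilog;

// Emits an event with UserId, ItemCount, and Total as separate structured fields.
Log.Information("Order completed for {UserId}: {ItemCount} items totaling {Total}",
    "john@example.com", 3, 99.99m);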

What Information Should You Log?

Critical Business Events

These are the events that matter to your business and operations team:

User Actions:

  • Login/logout events with user context
  • Feature usage and engagement metrics
  • Purchase completions and failures
  • Account changes and security events

System Operations:

  • Application startup and shutdown
  • Configuration changes
  • Scheduled job executions
  • External service integrations

Performance Indicators:

  • Response times for critical operations
  • Database query performance
  • Cache hit/miss ratios
  • Resource utilization patterns
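
For the performance bucket, a common pattern is to time the operation and log the elapsed milliseconds as a structured field so it can be aggregated into percentiles later (a sketch; the query name is hypothetical):

using System.Diagnostics;
using Serilog;

var stopwatch = Stopwatch.StartNew();
// ... execute the database query or external call here ...
stopwatch.Stop();

// ElapsedMs becomes a numeric field you can filter and aggregate on.
Log.Information("Query {QueryName} completed in {ElapsedMs} ms",
    "GetOrdersByUser", stopwatch.ElapsedMilliseconds);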

Essential Context Information

Every log entry should include enough context to be actionable:

Request Context:

  • Correlation/Request ID for tracing
  • User ID (when available)
  • Session information
  • Client IP and user agent

Operational Context:

  • Service name and version
  • Environment (dev/staging/prod)
  • Server/container identifier
  • Timestamp with timezone

Business Context:

  • Operation being performed
  • Key business entities (order ID, product ID)
  • Transaction amounts or quantities
  • Success/failure indicators
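
One way to attach this context without repeating it in every call is Serilog's `LogContext`, which stamps every event written inside a scope (a sketch; it assumes `.Enrich.FromLogContext()` is enabled, as in the setup below):

using Serilog;
using Serilog.Context;

var correlationId = Guid.NewGuid().ToString();

// Every event written inside these scopes carries both properties automatically.
using (LogContext.PushProperty("CorrelationId", correlationId))
using (LogContext.PushProperty("UserId", "user-456"))
{
    Log.Information("Processing {Operation} for order {OrderId}", "Checkout", "order-789");
}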

Simple Implementation Example

1. Logger Setup (Program.cs)

// Configure Serilog in Program.cs
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Information()
    .Enrich.FromLogContext()
    .Enrich.WithProperty("Service", "OrderService")
    .WriteTo.Console()
    // The "-" before ".log" becomes the date in rolled files: logs/app-20241220.log
    .WriteTo.File("logs/app-.log", rollingInterval: RollingInterval.Day)
    .CreateLogger();

builder.Host.UseSerilog();

2. Logger Helper Class

public class StructuredLogger
{
    private readonly ILogger _logger;
    
    public StructuredLogger(ILogger logger)
    {
        _logger = logger;
    }
    
    public void LogInfo(string message, object context = null)
    {
        _logger.LogInformation("{Message} {@Context}", message, context);
    }
    
    public void LogError(string message, Exception error, object context = null)
    {
        _logger.LogError(error, "{Message} {@Context}", message, context);
    }
}
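
To inject this helper, it needs a DI registration. Note that Microsoft's container registers `ILogger<T>` but not the non-generic `ILogger`, so one option (an assumption, not shown in the original) is to hand it a typed logger explicitly:

// In Program.cs: resolve a typed logger for the helper, since the
// non-generic ILogger is not registered by default.
builder.Services.AddSingleton(sp =>
    new StructuredLogger(sp.GetRequiredService<ILogger<StructuredLogger>>()));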

3. Using the Logger in Services

public class OrderService
{
    private readonly StructuredLogger _logger;
    
    public OrderService(StructuredLogger logger)
    {
        _logger = logger;
    }
    
    public async Task<Order> ProcessOrderAsync(OrderData orderData, string userId)
    {
        var correlationId = Guid.NewGuid().ToString();
        
        _logger.LogInfo("Processing order started", new
        {
            CorrelationId = correlationId,
            UserId = userId,
            ItemCount = orderData.Items.Count
        });
        
        try
        {
            var result = await CreateOrderAsync(orderData);
            
            _logger.LogInfo("Order processed successfully", new
            {
                CorrelationId = correlationId,
                OrderId = result.Id,
                Total = result.Total
            });
            
            return result;
        }
        catch (Exception ex)
        {
            _logger.LogError("Order processing failed", ex, new
            {
                CorrelationId = correlationId,
                UserId = userId
            });
            throw;
        }
    }
}

Centralized Logging: Bringing It All Together

Why Centralize Your Logs?

When you have multiple services, centralized logging becomes essential:

  • Single Source of Truth: All logs in one searchable location
  • Cross-Service Correlation: Follow a user request across multiple services
  • Unified Alerting: Set up alerts across your entire system
  • Simplified Operations: One place to search, one place to monitor

ELK Stack (Elasticsearch, Logstash, Kibana)

  • Pros: Powerful search, flexible, open source
  • Cons: Complex setup, resource intensive
  • Best for: Large teams with dedicated DevOps resources

Cloud Solutions

  • AWS CloudWatch: Integrated with AWS services, simple setup
  • Google Cloud Logging: Great for GCP environments
  • Azure Application Insights: Excellent for .NET applications
  • Datadog/New Relic: Full-featured but paid solutions

Lightweight Options

  • Grafana Loki: Prometheus-inspired, cost-effective
  • Fluentd: Flexible data collection and forwarding
  • Vector: High-performance observability data pipeline

Getting Started with Centralized Logging

Step 1: Choose Your Stack

Start simple. If you’re on AWS, use CloudWatch. If you’re using Docker, try the ELK stack with Docker Compose.

Step 2: Standardize Your Log Format

Ensure all services use the same JSON structure:

{
  "timestamp": "2024-12-20T10:30:15Z",
  "level": "INFO",
  "service": "order-service",
  "message": "Order processed successfully",
  "correlationId": "abc-123-def",
  "userId": "user-456",
  "orderId": "order-789"
}
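
With Serilog, one way to get a consistent JSON shape is a JSON formatter on every sink (a sketch assuming the Serilog.Formatting.Compact package; the formatter uses its own field names, e.g. @t for timestamp, rather than matching the example above exactly):

using Serilog;
using Serilog.Formatting.Compact;

Log.Logger = new LoggerConfiguration()
    .Enrich.FromLogContext()
    .Enrich.WithProperty("service", "order-service")
    .WriteTo.Console(new CompactJsonFormatter()) // one JSON object per event
    .CreateLogger();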

Step 3: Implement Correlation IDs

Middleware Setup

public class CorrelationIdMiddleware
{
    private readonly RequestDelegate _next;
    
    public CorrelationIdMiddleware(RequestDelegate next)
    {
        _next = next;
    }
    
    public async Task InvokeAsync(HttpContext context)
    {
        // Reuse the caller's ID when present; otherwise start a new trace.
        var correlationId = context.Request.Headers["X-Correlation-ID"]
            .FirstOrDefault() ?? Guid.NewGuid().ToString();
        
        // Indexer assignment avoids the exception Headers.Add throws for duplicate keys.
        context.Response.Headers["X-Correlation-ID"] = correlationId;
        
        // LogContext requires: using Serilog.Context;
        using (LogContext.PushProperty("CorrelationId", correlationId))
        {
            await _next(context);
        }
    }
}

Register Middleware

// In Program.cs
app.UseMiddleware<CorrelationIdMiddleware>();
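
For the correlation ID to survive hops between services, outgoing HTTP calls also need to forward the header. A minimal sketch using a DelegatingHandler (the handler name and registration are assumptions, not part of the original):

// Hypothetical handler that copies the inbound correlation ID onto outbound requests.
public class CorrelationIdHandler : DelegatingHandler
{
    private readonly IHttpContextAccessor _accessor;
    
    public CorrelationIdHandler(IHttpContextAccessor accessor) => _accessor = accessor;
    
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var correlationId = _accessor.HttpContext?.Request
            .Headers["X-Correlation-ID"].FirstOrDefault();
        
        if (!string.IsNullOrEmpty(correlationId))
            request.Headers.TryAddWithoutValidation("X-Correlation-ID", correlationId);
        
        return base.SendAsync(request, cancellationToken);
    }
}

Registration would pair builder.Services.AddHttpContextAccessor() and AddTransient<CorrelationIdHandler>() with AddHttpMessageHandler<CorrelationIdHandler>() on the relevant HttpClient.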

Security and Sensitive Data

Never Log These Items

Personal Information:

  • Credit card numbers, CVV codes
  • Social Security Numbers
  • Passwords or password hashes
  • API keys and tokens
  • Personal addresses and phone numbers

Business Sensitive Data:

  • Internal pricing information
  • Proprietary algorithms or business logic
  • Customer financial details
  • Confidential business metrics

Safe Logging Practices

Data Sanitization Helper

public class DataSanitizer
{
    // Case-insensitive so "Password" and "password" are both caught.
    private readonly HashSet<string> _sensitiveFields = new(StringComparer.OrdinalIgnoreCase)
    {
        "password", "creditCard", "ssn", "token", "secret", "key"
    };
    
    // Useful when filtering arbitrary dictionaries or form payloads.
    public bool IsSensitive(string fieldName) => _sensitiveFields.Contains(fieldName);
    
    public object SanitizeUserData(UserData userData)
    {
        return new
        {
            UserId = userData.Id,
            Email = MaskEmail(userData.Email),
            Action = userData.Action
            // Sensitive fields are deliberately excluded
        };
    }
    
    private string MaskEmail(string email)
    {
        if (string.IsNullOrEmpty(email)) return email;
        var parts = email.Split('@');
        if (parts.Length != 2) return email;
        
        // Show at most the first three characters of the local part,
        // so short addresses don't throw on Substring.
        var visible = Math.Min(3, parts[0].Length);
        return $"{parts[0].Substring(0, visible)}***@{parts[1]}";
    }
}

Usage Example

var sanitizer = new DataSanitizer();
_logger.LogInformation("User action completed {@User}", 
    sanitizer.SanitizeUserData(user));

Compliance Considerations:

  • GDPR: Be careful with EU user data
  • PCI DSS: Never log payment card information
  • HIPAA: Healthcare data requires special handling
  • SOX: Financial data has strict requirements

Monitoring and Alerting

Key Metrics to Monitor

Error Rates:

  • Overall error percentage
  • Errors by service/endpoint
  • Error trends over time
  • Critical vs non-critical errors

Performance Indicators:

  • Response time percentiles (p50, p95, p99)
  • Throughput (requests per minute)
  • Database query performance
  • External service response times

Business Metrics:

  • User signup/login rates
  • Transaction completion rates
  • Feature adoption metrics
  • Revenue-impacting events

Setting Up Effective Alerts

Alert on Patterns, Not Individual Events:

❌ Alert on every single error
✅ Alert when error rate > 5% for 5 minutes

Use Meaningful Thresholds:

❌ Alert when response time > 100ms
✅ Alert when p95 response time > 2 seconds for 10 minutes

Include Context in Alerts:

{
  "alert": "High Error Rate",
  "service": "payment-service",
  "current_rate": "8.5%",
  "threshold": "5%",
  "duration": "7 minutes",
  "runbook": "https://wiki.company.com/payment-service-errors"
}

Simple Log-Based Queries

Find Payment Failures:

level:ERROR AND service:payment AND message:*failed*

Track User Journey:

correlationId:"abc-123-def" | sort timestamp

Monitor API Performance:

message:*response_time* AND response_time:>2000

Performance Optimization Tips

Async Logging Configuration

// Requires the Serilog.Sinks.Async package; writes happen on a background worker.
Log.Logger = new LoggerConfiguration()
    .WriteTo.Async(a => a.File("logs/app.log"))
    .WriteTo.Async(a => a.Console())
    .CreateLogger();

Log Sampling for High Volume

public class SamplingLogger
{
    private readonly ILogger _logger;
    
    public SamplingLogger(ILogger logger)
    {
        _logger = logger;
    }
    
    public void LogWithSampling(LogLevel level, string message, object context = null)
    {
        // Always log warnings and errors in full
        if (level >= LogLevel.Warning)
        {
            _logger.Log(level, "{Message} {@Context}", message, context);
            return;
        }
        
        // Sample lower levels: keep 10% of INFO and 1% of DEBUG/TRACE
        var samplingRate = level == LogLevel.Information ? 0.1 : 0.01;
        
        // Random.Shared is thread-safe, unlike a shared Random instance
        if (Random.Shared.NextDouble() < samplingRate)
        {
            _logger.Log(level, "{Message} {@Context}", message, context);
        }
    }
}

Structured Data: Structured fields let your log backend filter and aggregate directly, which is faster and more reliable than free-text searching.

Common Pitfalls and How to Avoid Them

Over-Logging

Problem: Logging every method entry/exit or minor operations
Solution: Focus on business events and error conditions

Under-Logging

Problem: Not enough context when issues occur
Solution: Include correlation IDs, user context, and operation details

Inconsistent Formats

Problem: Different services use different log structures
Solution: Establish team-wide logging standards and templates

Performance Impact

Problem: Synchronous logging slowing down applications
Solution: Use asynchronous logging libraries and consider sampling

Security Violations

Problem: Accidentally logging sensitive data
Solution: Implement data sanitization and regular log audits

Implementation Roadmap

Phase 1: Foundation (Week 1-2)

  1. Standardize Log Levels: Define when to use each level across your team
  2. Implement Structured Logging: Convert string concatenation to structured format
  3. Add Correlation IDs: Implement request tracing across services
  4. Security Review: Audit existing logs for sensitive data exposure

Phase 2: Centralization (Week 3-4)

  1. Choose Your Stack: Select centralized logging solution (ELK, cloud, etc.)
  2. Set Up Collection: Configure log forwarding from all services
  3. Create Dashboards: Build basic monitoring views
  4. Define Retention: Establish log storage and cleanup policies

Phase 3: Monitoring (Week 5-6)

  1. Key Metrics: Identify critical business and technical metrics
  2. Alert Rules: Create actionable alerts with proper thresholds
  3. Runbooks: Document response procedures for common issues
  4. Test Alerts: Verify alert delivery and response procedures

Phase 4: Optimization (Ongoing)

  1. Performance Tuning: Implement async logging and sampling
  2. Cost Management: Monitor and optimize log storage costs
  3. Team Training: Educate developers on logging best practices
  4. Continuous Improvement: Regular review and refinement

Quick Reference Guide

Essential Do’s and Don’ts

✅ Always Do:

  • Use structured logging (JSON format)
  • Include correlation IDs for request tracing
  • Sanitize sensitive data before logging
  • Log business events and errors with context
  • Set up centralized log collection
  • Create actionable alerts, not noise

❌ Never Do:

  • Log passwords, credit cards, or personal data
  • Use string concatenation for log messages
  • Over-log (every method entry/exit)
  • Ignore logging performance impact
  • Create logs without sufficient context
  • Mix different log formats across services

Implementation Checklist

Week 1-2: Foundation

  • Standardize log levels across team
  • Implement structured logging format
  • Add correlation ID to all requests
  • Audit logs for sensitive data exposure

Week 3-4: Centralization

  • Set up centralized logging system
  • Configure log forwarding from all services
  • Create basic monitoring dashboards
  • Define log retention policies

Week 5-6: Monitoring

  • Identify key business and technical metrics
  • Create actionable alert rules
  • Document incident response procedures
  • Test alert delivery and escalation

Ongoing: Optimization

  • Monitor logging performance impact
  • Optimize log storage costs
  • Train team on logging best practices
  • Regular review and improvement

Conclusion

Good logging is invisible when everything works and invaluable when things break. It’s the difference between spending 10 minutes fixing an issue and spending 3 hours trying to understand what went wrong.

The key is to start simple and iterate. You don’t need a perfect logging system from day one. Begin with structured logging and correlation IDs, then gradually add centralized collection, monitoring, and alerting as your needs grow.

Remember: logs are for humans. Make them readable, searchable, and actionable. Your future self (and your teammates) will thank you when you’re troubleshooting a production issue at 2 AM.

The investment in proper logging pays off quickly in reduced debugging time, faster incident resolution, and increased confidence in your systems. Start today, and build the observability foundation your applications deserve.