Security & Privacy

📖 5 min read 📄 Part 9 of 10

Distributed Task Scheduler - Security and Privacy

Authentication and Authorization

API Authentication

Methods:

1. API Keys:
POST /api/v1/tasks
Authorization: Bearer sk_live_abc123def456

Pros: Simple, stateless
Cons: Key theft risk

2. OAuth 2.0:
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...

Pros: Standard, secure, token expiration
Cons: More complex

3. Service-to-Service (mTLS):
Client certificate authentication
Pros: Very secure, no passwords
Cons: Certificate management

Recommendation: OAuth 2.0 for users, mTLS for services

Task-Level Authorization

RBAC Model:

Roles:
- Admin: Full access (create, read, update, delete, execute)
- Scheduler: Create, read, update tasks
- Executor: Execute tasks only
- Viewer: Read-only access

Permissions:
{
  "user_id": "user_123",
  "role": "scheduler",
  "allowed_task_types": ["cron", "one_time"],
  "allowed_priorities": ["medium", "low"],
  "max_tasks": 1000,
  "max_concurrent_executions": 10
}

Implementation:
class AuthorizationService {
    boolean canCreateTask(User user, Task task) {
        if (user.role == Role.ADMIN) return true;
        if (user.role == Role.VIEWER) return false;
        
        // Check task type
        if (!user.allowedTaskTypes.contains(task.type)) {
            return false;
        }
        
        // Check priority
        if (!user.allowedPriorities.contains(task.priority)) {
            return false;
        }
        
        // Check quota
        if (user.taskCount >= user.maxTasks) {
            return false;
        }
        
        return true;
    }
}

Task Payload Security

Sensitive Data Handling

Problem:
Task payload may contain sensitive data:
- API keys
- Database credentials
- User PII
- Business secrets

Solutions:

1. Encryption at Rest:
class EncryptedPayload {
    String encrypt(String payload) {
        SecretKey key = getEncryptionKey();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] encrypted = cipher.doFinal(payload.getBytes());
        return Base64.encode(encrypted);
    }
    
    String decrypt(String encrypted) {
        SecretKey key = getEncryptionKey();
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key);
        byte[] decrypted = cipher.doFinal(Base64.decode(encrypted));
        return new String(decrypted);
    }
}

2. Secrets Management:
// Store secrets separately
{
  "task_payload": {
    "action": "backup_database",
    "database": "production",
    "credentials_ref": "secret://db-credentials"
  }
}

// Retrieve at execution time
String credentials = secretsManager.getSecret("db-credentials");

3. Least Privilege:
- Workers only access needed secrets
- Temporary credentials
- Automatic rotation
- Audit all access

Code Injection Prevention

Risks:
- Arbitrary code execution
- Command injection
- SQL injection
- Script injection

Mitigations:

1. Input Validation:
class PayloadValidator {
    boolean isValid(String payload) {
        // Check size
        if (payload.length() > 1048576) {  // 1MB max
            return false;
        }
        
        // Check for injection patterns
        String[] dangerousPatterns = {
            "eval(", "exec(", "system(", 
            "'; DROP TABLE", "<script>", 
            "$(", "`"
        };
        
        for (String pattern : dangerousPatterns) {
            if (payload.contains(pattern)) {
                return false;
            }
        }
        
        return true;
    }
}

2. Sandboxing:
- Run tasks in containers
- Limited network access
- Read-only filesystem
- Resource limits (CPU, memory)
- Timeout enforcement

3. Allowlist Approach:
- Define allowed task types
- Whitelist allowed operations
- Reject unknown task types
- Explicit approval for new types

Worker Security

Worker Isolation

Container-Based Isolation:
- Each task runs in separate container
- Network isolation
- Filesystem isolation
- Resource limits

Docker Configuration:
docker run \
  --network=none \
  --read-only \
  --memory=2g \
  --cpus=2 \
  --pids-limit=100 \
  task-executor:latest

Benefits:
- Complete isolation
- Prevent lateral movement
- Resource protection
- Easy cleanup

Worker Authentication

Worker Registration:
1. Worker generates key pair
2. Sends public key to scheduler
3. Scheduler signs certificate
4. Worker uses certificate for authentication

Mutual TLS:
- Worker authenticates to scheduler
- Scheduler authenticates to worker
- Encrypted communication
- No password needed

Heartbeat Security:
- Signed heartbeat messages
- Prevent spoofing
- Detect compromised workers
- Automatic deregistration

Audit Logging

Comprehensive Audit Trail

Events to Log:
- Task creation/deletion
- Task execution start/end
- Task cancellation
- Configuration changes
- Worker registration/deregistration
- Authentication attempts
- Authorization failures

Log Format:
{
  "timestamp": "2024-01-08T10:00:00Z",
  "event_type": "task_execution",
  "task_id": "...",
  "worker_id": "worker-1",
  "user_id": "user_123",
  "action": "execute",
  "result": "success",
  "duration_ms": 5000,
  "ip_address": "203.0.113.42"
}

Storage:
- Immutable append-only log
- Long retention (1 year+)
- Indexed for fast queries
- Exported to SIEM

Compliance:
- SOC 2 Type II
- ISO 27001
- GDPR
- HIPAA (if applicable)

Data Privacy

PII in Task Payloads

Problem:
Task payload may contain PII:
- User emails
- Phone numbers
- Addresses
- Personal data

Solutions:

1. Tokenization:
// Replace PII with tokens
{
  "task_payload": {
    "send_email": {
      "recipient": "token:email:abc123",  // Not actual email
      "message": "Your order is ready"
    }
  }
}

// Detokenize at execution
String email = tokenService.detokenize("token:email:abc123");

2. Encryption:
// Encrypt sensitive fields
{
  "task_payload": {
    "user_email": "encrypted:AES256:...",
    "user_phone": "encrypted:AES256:..."
  }
}

3. Minimal Data:
// Store only references
{
  "task_payload": {
    "user_id": 123,  // Reference, not PII
    "action": "send_notification"
  }
}

// Fetch PII at execution time
User user = userService.getUser(123);

GDPR Compliance

Requirements:

1. Right to Access:
   - Export user's scheduled tasks
   - Export execution history
   
2. Right to Erasure:
   - Delete user's tasks
   - Anonymize execution logs
   
3. Data Minimization:
   - Store only necessary data
   - Minimize retention period

Implementation:
class GDPRService {
    void exportUserData(String userId) {
        // Export tasks
        List<Task> tasks = getTasks(userId);
        
        // Export executions
        List<Execution> executions = getExecutions(userId);
        
        // Export as JSON
        return JSON.stringify({
            tasks: tasks,
            executions: executions
        });
    }
    
    void deleteUserData(String userId) {
        // Delete tasks
        deleteTasks(userId);
        
        // Anonymize executions
        anonymizeExecutions(userId);
        
        // Audit log
        auditLog("user_data_deleted", userId);
    }
}

Incident Response

Security Incident Procedures

1. Detection:
   - Monitor audit logs
   - Alert on suspicious activity
   - Automated threat detection

2. Containment:
   - Disable compromised accounts
   - Block malicious workers
   - Pause affected tasks

3. Investigation:
   - Analyze audit logs
   - Identify attack vector
   - Assess damage

4. Recovery:
   - Restore from backups
   - Rotate credentials
   - Apply security patches

5. Post-Mortem:
   - Document incident
   - Update procedures
   - Implement preventive measures

Runbook:
# Disable compromised user
UPDATE users SET enabled = FALSE WHERE user_id = 'user_123';

# Block malicious worker
UPDATE workers SET status = 'blocked' WHERE worker_id = 'worker-1';

# Pause all tasks from user
UPDATE tasks SET enabled = FALSE WHERE created_by = 'user_123';

This comprehensive security approach ensures the task scheduler is protected against threats while maintaining compliance with privacy regulations.