Common issues and their solutions for the TendSocial Campaign Architecture v2.0.
Table of Contents
- AI Generation Issues
- Configuration Problems
- A/B Testing Issues
- Performance Problems
- Database Issues
- Job Execution Issues
AI Generation Issues
Issue: "No AI configuration found for task"
Symptom: Error when attempting to generate content
Cause: Missing AIModelConfig entry for the task
Solution:
- Check the database for the task config:
SELECT * FROM "AIModelConfig" WHERE task = 'your_task';
- If missing, seed the default config:
cd apps/backend
pnpm run db:seed
- Or create manually via admin UI or API:
POST /api/admin/ai-config
{
"task": "social_posts",
"displayName": "Social Posts",
"provider": "anthropic",
"model": "claude-3-5-sonnet-20241022",
"maxTokens": 4096,
"temperature": 0.7,
"inputCostPer1M": 3.0,
"outputCostPer1M": 15.0
}
Issue: Generated content quality is poor
Possible Causes:
- Insufficient context (missing brand profile, examples)
- Wrong model for the task
- Prompt needs improvement
Solutions:
Check context:
-- Ensure brand profile exists
SELECT * FROM "BrandProfile" WHERE "companyId" = '...';
-- Ensure recent posts exist (for examples)
SELECT COUNT(*) FROM "Post"
WHERE "companyId" = '...' AND "published" = true;
Try a better model (higher quality, at higher cost):
PUT /api/admin/ai-config/social_posts
{
"model": "claude-3-opus-20240229",
"inputCostPer1M": 15.0,
"outputCostPer1M": 75.0
}
Improve prompts:
- Add more examples to User.exampleContent
- Provide a more detailed campaign brief
- Increase context in Campaign.context
Issue: API rate limit errors
Symptom: 429 Too Many Requests or RateLimitError
Solutions:
Check provider quotas:
- Anthropic: Check your tier at console.anthropic.com
- Google: Check quotas in Cloud Console
- OpenAI: Check usage at platform.openai.com
Implement exponential backoff: Already included in gateway. Check logs for retry attempts.
Distribute load:
- Use multiple API keys and rotate
- Implement BYOK for high-volume companies
Use cheaper models:
- Haiku instead of Sonnet for simple tasks
- Reduce maxTokens to stay under limits
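The gateway's own retry code is not shown in this guide, so the following is only a minimal sketch of what exponential backoff with jitter typically looks like, useful for reading retry entries in the logs. The names `isRateLimited`, `backoffDelayMs`, and `withBackoff`, the `status` field on the error, and the default delays are all assumptions made for illustration.

```typescript
// Hypothetical check: treats any error object carrying status 429 as a rate-limit error.
function isRateLimited(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { status?: number }).status === 429;
}

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs,
// then scaled by a random factor in [0, 1) ("full jitter") to spread out retries.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  random: () => number = Math.random
): number {
  const exponential = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * exponential);
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Calls `fn`, retrying only on rate-limit errors, up to `maxRetries` extra attempts.
async function withBackoff<T>(fn: () => Promise<T>, maxRetries: number = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRateLimited(err) || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

With these assumed defaults the upper bound on the wait doubles on each retry (500 ms, 1 s, 2 s, ...) until it reaches the 30 s cap, so long gaps between retry log entries are expected under sustained rate limiting.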
Configuration Problems
Issue: Config changes not taking effect
Symptom: Model changes don't apply to new requests
Cause: Configuration is cached
Solution:
POST /api/admin/gateway/clear-cache
The cache TTL is 60 seconds by default. Changes take effect automatically within a minute, but clearing the cache forces an immediate update.
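To illustrate why a change can take up to a minute to appear, here is a minimal sketch of a time-to-live cache. It is not the gateway's actual cache; the `TtlCache` class and its method names are hypothetical, and the clock is injectable only so the behavior can be demonstrated deterministically.

```typescript
// Minimal in-memory cache where each entry expires ttlMs after it was stored.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: the caller must reload from the database
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Equivalent in spirit to the clear-cache endpoint: drop everything immediately.
  clear(): void {
    this.entries.clear();
  }
}
```

Until an entry expires or `clear()` is called, reads keep returning the old value, which matches the symptom described above.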
Issue: Company override not working
Symptom: Company-specific config is ignored
Debugging:
- Verify the override exists:
SELECT * FROM "CompanyAIConfig"
WHERE "companyId" = '...' AND "task" = '...';
- Check that isEnabled is true, and enable the override if it is not:
UPDATE "CompanyAIConfig"
SET "isEnabled" = true
WHERE "companyId" = '...' AND "task" = '...';
Check partial overrides: Company configs can be partial. Missing fields fall back to global defaults. This is expected behavior.
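The fallback behavior for partial overrides can be sketched as follows. This is an illustration under assumptions, not the gateway's code: the `AIConfig` and `CompanyOverride` shapes and the `resolveConfig` function are hypothetical, with field names borrowed from the config example earlier in this guide.

```typescript
// Hypothetical shape of a fully resolved config.
interface AIConfig {
  provider: string;
  model: string;
  maxTokens: number;
  temperature: number;
}

// A company override may set any subset of fields; unset fields are null or undefined.
type CompanyOverride = { [K in keyof AIConfig]?: AIConfig[K] | null } & { isEnabled: boolean };

// Field-by-field merge: an enabled override wins where it has a value,
// otherwise the global default is used.
function resolveConfig(globalConfig: AIConfig, override?: CompanyOverride): AIConfig {
  if (override === undefined || !override.isEnabled) return globalConfig;
  return {
    provider: override.provider ?? globalConfig.provider,
    model: override.model ?? globalConfig.model,
    maxTokens: override.maxTokens ?? globalConfig.maxTokens,
    // `??` keeps an explicit temperature of 0, which `||` would discard.
    temperature: override.temperature ?? globalConfig.temperature,
  };
}
```

Under this model, an override that sets only `model` will still report the global `maxTokens` and `temperature`, which can look like the override is being ignored when it is not.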
Clear cache:
POST /api/admin/gateway/clear-cache
A/B Testing Issues
Issue: Users not being assigned to test
Debugging Checklist:
- Is the test active?
SELECT "isActive" FROM "AIABTest" WHERE id = '...';
- Is it within date range?
SELECT "startsAt", "endsAt" FROM "AIABTest" WHERE id = '...';
- Does the user match targeting?
SELECT "targetCompanyIds", "targetUserIds" FROM "AIABTest" WHERE id = '...';
- Clear cache:
POST /api/admin/gateway/clear-cache
Issue: Uneven variant distribution
Symptom: 100 users but 90/10 split instead of 50/50
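Explanation: Some deviation from the target weights is normal with small sample sizes. Random distribution approaches the target weights as sample size increases. A split as extreme as 90/10 with 100 users on a 50/50 test is far outside the expected range, so also verify the variant weights and targeting.
Solutions:
- Run the test longer (more users)
- Expected deviation with 100 users: roughly ±10 percentage points
- Need 1000+ users for a deviation under 5 percentage points
- If the split stays far outside these ranges, check the configured variant weights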
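As a rough way to judge whether an observed split is plausible, the sketch below uses the normal approximation to the binomial distribution. The function names are hypothetical, and it assumes each user is assigned independently at random with the configured target share.

```typescript
// Approximate 95% margin (in percentage points) around the target share
// for a variant, given the number of assigned users.
function shareMarginPct(users: number, targetShare: number = 0.5): number {
  const standardError = Math.sqrt((targetShare * (1 - targetShare)) / users);
  return 1.96 * standardError * 100;
}

// How many standard errors the observed share is from the target share.
// Values above about 3 are very unlikely to be random noise.
function splitZScore(observedCount: number, users: number, targetShare: number = 0.5): number {
  const standardError = Math.sqrt((targetShare * (1 - targetShare)) / users);
  return Math.abs(observedCount / users - targetShare) / standardError;
}
```

For a 50/50 test, `shareMarginPct(100)` is about 9.8 and `shareMarginPct(1000)` is about 3.1. A 58/42 split over 100 users gives a z-score of about 1.6, which is ordinary noise, while 90/10 gives about 8, which points to a configuration problem and not to chance.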
Issue: Can't change variant after assignment
Symptom: User stuck with same variant
Explanation: This is by design. Consistent assignment ensures valid A/B test results.
If you need to reset:
DELETE FROM "AIABAssignment"
WHERE "testId" = '...' AND "userId" = '...';
Then clear the cache. The user will get a new assignment on the next request.
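The sticky behavior described above can be pictured with the following sketch. It is an assumption-laden illustration, not the gateway's implementation: an in-memory `Map` stands in for the "AIABAssignment" table, and `pickVariant`, `getAssignment`, and `resetAssignment` are hypothetical names.

```typescript
interface Variant {
  id: string;
  weight: number;
}

// Picks a variant with probability proportional to its weight.
function pickVariant(variants: Variant[], random: () => number = Math.random): Variant {
  const total = variants.reduce((sum, v) => sum + v.weight, 0);
  let threshold = random() * total;
  let last: Variant | undefined;
  for (const variant of variants) {
    last = variant;
    threshold -= variant.weight;
    if (threshold < 0) return variant;
  }
  if (last === undefined) throw new Error("At least one variant is required");
  return last; // guards against floating-point drift
}

// In-memory stand-in for the "AIABAssignment" table, keyed by test and user.
const assignments = new Map<string, string>();

// Returns the stored variant if one exists; otherwise picks one and stores it.
function getAssignment(
  testId: string,
  userId: string,
  variants: Variant[],
  random: () => number = Math.random
): string {
  const key = `${testId}:${userId}`;
  const existing = assignments.get(key);
  if (existing !== undefined) return existing; // sticky: never re-rolled
  const chosen = pickVariant(variants, random).id;
  assignments.set(key, chosen);
  return chosen;
}

// Mirrors the DELETE statement above: the next request gets a fresh pick.
function resetAssignment(testId: string, userId: string): void {
  assignments.delete(`${testId}:${userId}`);
}
```

Note that a reset re-rolls the variant at random, so the user may land on the same variant again.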
Performance Problems
Issue: Slow generation requests
Diagnostic Steps:
- Check latency logs:
SELECT
AVG("latencyMs") as avg_latency,
MAX("latencyMs") as max_latency,
model
FROM "AIUsageLog"
WHERE "createdAt" > NOW() - INTERVAL '1 hour'
GROUP BY model;
- Check token counts:
SELECT
AVG("inputTokens") as avg_input,
MAX("inputTokens") as max_input
FROM "AIUsageLog"
WHERE "createdAt" > NOW() - INTERVAL '1 hour';
Solutions:
- High input tokens: Reduce context, trim examples
- Long latency: Use faster model (Haiku vs Opus)
- Provider issues: Check status pages
- Network issues: Test with curl against the provider API directly
Issue: High database query times
Diagnostic:
-- Enable slow query logging
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1s
SELECT pg_reload_conf();
-- Inspect column statistics to spot candidates for missing indexes
SELECT schemaname, tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'public'
ORDER BY n_distinct DESC;
Solutions:
- Add indexes on frequently queried columns
- Use EXPLAIN ANALYZE to identify bottlenecks
- Consider materialized views for complex aggregations
Database Issues
Issue: Migration fails
Common Causes:
Existing data conflicts:
- Adding NOT NULL column to table with data
- Unique constraint on existing duplicates
Incomplete rollback:
- Previous migration partially applied
Solutions:
- Check migration status:
pnpm prisma migrate status
- Resolve manually:
pnpm prisma migrate resolve --applied 20250101000000_migration_name
- Reset (development only!):
pnpm prisma migrate reset
Issue: RLS (Row-Level Security) violations
Symptom: "No Prisma Client model called X" or missing data
Cause: Trying to query tenant-scoped data without using getTenantPrisma()
Solution:
// ❌ WRONG
const posts = await prisma.post.findMany({ where: { companyId } });
// ✅ CORRECT
const tenantPrisma = getTenantPrisma(companyId);
const posts = await tenantPrisma.post.findMany();
Job Execution Issues
Issue: Cron jobs not running
Debugging:
- Check if jobs are enabled:
echo $JOBS_ENABLED
- Check cron schedules:
echo $JOB_PROFILE_ANALYSIS # Should be "0 3 * * *"
- Check for errors:
SELECT * FROM "ProfileAnalysisJob"
WHERE status = 'failed'
ORDER BY "createdAt" DESC
LIMIT 10;
- Check logs:
# For running jobs
tail -f logs/cron.log
# For completed jobs
grep "ProfileAnalysisJob" logs/app.log
Issue: Job stuck in "running" status
Cause: Job crashed without updating status
Solution:
- Check for zombie jobs:
SELECT * FROM "ProfileAnalysisJob"
WHERE status = 'running' AND "startedAt" < NOW() - INTERVAL '2 hours';
- Reset stuck jobs:
UPDATE "ProfileAnalysisJob"
SET status = 'failed',
"errorMessage" = 'Job timed out'
WHERE status = 'running'
AND "startedAt" < NOW() - INTERVAL '2 hours';
- Implement job locking: Already implemented in Phase 8. Check for duplicate job execution.
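To show how a job can end up stuck and how the two-hour sweep above identifies it, here is a minimal sketch. The `JobRecord` shape, `runTracked`, and `isStuck` are hypothetical and use an in-memory record where the real system uses the "ProfileAnalysisJob" table.

```typescript
type JobStatus = "running" | "completed" | "failed";

interface JobRecord {
  status: JobStatus;
  startedAt: Date;
  errorMessage?: string;
}

// Runs `work` and records a terminal status whether it resolves or throws.
// A hard process crash bypasses the catch block, which is why the
// timeout-based sweep is still needed as a safety net.
async function runTracked(job: JobRecord, work: () => Promise<void>): Promise<void> {
  job.status = "running";
  job.startedAt = new Date();
  try {
    await work();
    job.status = "completed";
  } catch (err) {
    job.status = "failed";
    job.errorMessage = err instanceof Error ? err.message : String(err);
  }
}

// Same condition as the SQL above: still "running" and started more than
// timeoutMs ago (two hours by default).
function isStuck(job: JobRecord, now: Date, timeoutMs: number = 2 * 60 * 60 * 1000): boolean {
  return job.status === "running" && now.getTime() - job.startedAt.getTime() > timeoutMs;
}
```

If stuck jobs keep appearing even though errors are caught like this, the likely cause is the worker process being killed (for example by an out-of-memory condition or a deploy) while a job is in flight.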
Cache Issues
Issue: Stale data from cache
Symptoms:
- Old config appearing in responses
- A/B test changes not applying
- Updated company settings not reflected
Solutions:
Manual cache clear:
POST /api/admin/gateway/clear-cache
Check cache TTL:
echo $AI_CONFIG_CACHE_TTL # Should be 60 (seconds)
echo $AI_GATEWAY_CACHE_TTL # Should be 300 (seconds)
Decrease TTL (not recommended):
AI_CONFIG_CACHE_TTL=30 # 30 seconds (more DB load)
Logging and Debugging
Enable verbose logging
LOG_LEVEL=debug pnpm dev
Query logs efficiently
-- Recent errors
SELECT * FROM "AIUsageLog"
WHERE success = false
AND "createdAt" > NOW() - INTERVAL '1 hour'
ORDER BY "createdAt" DESC;
-- Expensive requests
SELECT
"contentType",
"contentId",
"totalCostCents",
"totalTokens"
FROM "AIUsageLog"
WHERE "totalCostCents" > 50 -- More than 50 cents
ORDER BY "totalCostCents" DESC
LIMIT 20;
-- Slow requests
SELECT
model,
AVG("latencyMs") as avg_latency
FROM "AIUsageLog"
WHERE "createdAt" > NOW() - INTERVAL '24 hours'
GROUP BY model
HAVING AVG("latencyMs") > 5000; -- Slower than 5s
Getting Help
If issues persist:
- Check logs: Look for stack traces and error messages
- Search docs: Review architecture plan and implementation checklist
- Check provider status: anthropic.com/status, cloud.google.com/status
- Create issue: Include logs, config, and steps to reproduce
- Contact support: support@tendsocial.com
Useful Commands
# API
pnpm dev # Start dev server
pnpm build # Build production
pnpm test # Run tests
pnpm lint # Type checking
# Database
pnpm prisma studio # GUI for database
pnpm prisma migrate dev # Apply migrations
pnpm prisma generate # Regenerate client
pnpm run db:seed # Seed default data
# Jobs
pnpm run worker # Start job worker
# Cache
curl -X POST http://localhost:4000/api/admin/gateway/clear-cache \
-H "Authorization: Bearer $ADMIN_TOKEN"