Statistics API¶

Retrieve system statistics and analytics about posts, streams, and review progress.

Overview¶

The Statistics API provides aggregated metrics about FenLiu's content curation pipeline. Use these endpoints to monitor system health, track curation progress, and analyze content patterns.

Key Metrics¶

Post Counts: Total, approved, rejected, pending export
Spam Scoring: Distribution of spam scores
Stream Performance: Posts per stream, activity levels
Review Progress: Reviewed vs. unreviewed, approval rates
Export Pipeline: Queue status breakdown

Authentication¶

All Statistics API endpoints require API key authentication via the X-API-Key header:

curl -H "X-API-Key: your-api-key" \
  http://localhost:8000/api/v1/stats

See Authentication Guide for details.
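The same header can be attached from Python. The following is a minimal sketch using only the standard library; the `build_stats_request` helper name is hypothetical, the key is a placeholder, and the request is only built here, not sent.

```python
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # same local address as the curl example


def build_stats_request(api_key: str, path: str = "/stats") -> urllib.request.Request:
    """Build a GET request that carries the X-API-Key header (hypothetical helper)."""
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


request = build_stats_request("your-api-key")
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```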

Endpoints¶

GET /api/v1/stats¶

Get overall system statistics and metrics.

Request

curl -H "X-API-Key: your-api-key" \
  http://localhost:8000/api/v1/stats

Response (200 OK)

{
  "total_posts": 1245,
  "approved_posts": 892,
  "rejected_posts": 245,
  "unreviewed_posts": 108,
  "total_streams": 12,
  "active_streams": 10,
  "queue_status": {
    "pending": 234,
    "reserved": 5,
    "delivered": 650,
    "error": 3
  },
  "spam_score_distribution": {
    "very_low": 456,
    "low": 289,
    "medium": 178,
    "high": 156,
    "very_high": 166
  },
  "approval_rate": 71.5,
  "last_fetch": "2026-03-03T15:45:00Z",
  "created_at": "2026-02-15T10:00:00Z"
}

Response Fields

Field	Type	Description
`total_posts`	integer	Total posts in database
`approved_posts`	integer	Posts marked as approved
`rejected_posts`	integer	Posts marked as rejected
`unreviewed_posts`	integer	Posts awaiting review
`total_streams`	integer	Total hashtag streams
`active_streams`	integer	Active streams (enabled)
`queue_status`	object	Export queue breakdown by status
`spam_score_distribution`	object	Count by spam level
`approval_rate`	number	Percentage of approved posts
`last_fetch`	string	ISO 8601 timestamp of last fetch
`created_at`	string	ISO 8601 database creation time
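In the example response, the three status counts and the spam score buckets each add up to `total_posts`. The sketch below checks a response for that property. It assumes the relationship always holds, which the example suggests but this page does not state as a guarantee; `check_stats_totals` is a hypothetical helper name.

```python
# Values copied from the example response above
sample_stats = {
    "total_posts": 1245,
    "approved_posts": 892,
    "rejected_posts": 245,
    "unreviewed_posts": 108,
    "spam_score_distribution": {
        "very_low": 456, "low": 289, "medium": 178, "high": 156, "very_high": 166,
    },
}


def check_stats_totals(stats: dict) -> list[str]:
    """Return a list of human-readable inconsistencies (empty if none found)."""
    problems = []
    status_sum = (
        stats["approved_posts"] + stats["rejected_posts"] + stats["unreviewed_posts"]
    )
    if status_sum != stats["total_posts"]:
        problems.append(f"status counts sum to {status_sum}, expected {stats['total_posts']}")
    spam_sum = sum(stats["spam_score_distribution"].values())
    if spam_sum != stats["total_posts"]:
        problems.append(f"spam buckets sum to {spam_sum}, expected {stats['total_posts']}")
    return problems


print(check_stats_totals(sample_stats))  # prints [] for the sample response
```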

GET /api/v1/stats/posts¶

Get detailed post statistics.

Request

curl -H "X-API-Key: your-api-key" \
  http://localhost:8000/api/v1/stats/posts

Response (200 OK)

{
  "total_count": 1245,
  "status": {
    "approved": 892,
    "rejected": 245,
    "unreviewed": 108
  },
  "by_spam_score": {
    "very_low_0_to_25": 456,
    "low_25_to_50": 289,
    "medium_50_to_75": 178,
    "high_75_to_100": 322
  },
  "by_queue_status": {
    "pending": 234,
    "reserved": 5,
    "delivered": 650,
    "error": 3
  },
  "with_media": 678,
  "without_media": 567,
  "by_source": {
    "mastodon.social": 456,
    "fosstodon.org": 234,
    "pixelfed.social": 178,
    "other": 377
  },
  "average_spam_score": 38.5,
  "median_spam_score": 35,
  "posts_fetched_today": 127,
  "posts_reviewed_today": 89,
  "posts_approved_today": 62
}

Response Fields

Field	Type	Description
`total_count`	integer	Total posts in database
`status`	object	Breakdown by approval status
`by_spam_score`	object	Distribution across score ranges
`by_queue_status`	object	Export queue status breakdown
`with_media`	integer	Posts containing media attachments
`without_media`	integer	Posts without attachments
`by_source`	object	Posts per source instance
`average_spam_score`	number	Mean spam score
`median_spam_score`	number	Median spam score
`posts_fetched_today`	integer	Posts fetched in last 24h
`posts_reviewed_today`	integer	Posts reviewed in last 24h
`posts_approved_today`	integer	Posts approved in last 24h
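The breakdown objects (`status`, `by_spam_score`, `by_queue_status`, `by_source`) hold raw counts. A small sketch of turning one of them into percentages, using the `by_source` values from the example response; `percentage_breakdown` is a hypothetical helper, not part of the API.

```python
def percentage_breakdown(counts: dict[str, int]) -> dict[str, float]:
    """Convert a mapping of counts into percentages rounded to one decimal."""
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero on an empty database
        return {key: 0.0 for key in counts}
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


# Values copied from the example response above
by_source = {
    "mastodon.social": 456,
    "fosstodon.org": 234,
    "pixelfed.social": 178,
    "other": 377,
}

for source, share in percentage_breakdown(by_source).items():
    print(f"{source}: {share}%")
```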

GET /api/v1/stats/streams¶

Get stream-level statistics.

Request

curl -H "X-API-Key: your-api-key" \
  http://localhost:8000/api/v1/stats/streams

Response (200 OK)

{
  "total_streams": 12,
  "active_streams": 10,
  "inactive_streams": 2,
  "streams": [
    {
      "id": 1,
      "hashtag": "python",
      "instance": "mastodon.social",
      "active": true,
      "post_count": 245,
      "approved_count": 189,
      "rejected_count": 34,
      "unreviewed_count": 22,
      "approval_rate": 77.1,
      "last_fetch": "2026-03-03T15:30:00Z"
    },
    {
      "id": 2,
      "hashtag": "django",
      "instance": "fosstodon.org",
      "active": true,
      "post_count": 178,
      "approved_count": 134,
      "rejected_count": 28,
      "unreviewed_count": 16,
      "approval_rate": 75.3,
      "last_fetch": "2026-03-03T14:45:00Z"
    }
  ],
  "total_posts_all_streams": 1245,
  "average_posts_per_stream": 103.75,
  "most_active_stream": {
    "id": 1,
    "hashtag": "python",
    "post_count": 245
  },
  "least_active_stream": {
    "id": 12,
    "hashtag": "rust",
    "post_count": 8
  }
}

Response Fields

Field	Type	Description
`total_streams`	integer	Total streams created
`active_streams`	integer	Currently active streams
`inactive_streams`	integer	Disabled streams
`streams`	array	Per-stream statistics
`total_posts_all_streams`	integer	Total posts across all streams
`average_posts_per_stream`	number	Mean posts per stream
`most_active_stream`	object	Stream with most posts
`least_active_stream`	object	Stream with fewest posts

Per-Stream Fields

Field	Type	Description
`id`	integer	Stream identifier
`hashtag`	string	Hashtag being monitored
`instance`	string	Source instance
`active`	boolean	Is stream active
`post_count`	integer	Total posts from stream
`approved_count`	integer	Approved posts
`rejected_count`	integer	Rejected posts
`unreviewed_count`	integer	Awaiting review
`approval_rate`	number	Percentage approved
`last_fetch`	string	ISO 8601 last fetch time
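Because `last_fetch` is an ISO 8601 string, it needs parsing before it can be compared with the current time. The sketch below computes how long ago each stream was fetched. It assumes timestamps always end in `Z` (UTC) as in the example response; the helper names and the fixed `now` value are illustrative.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp; 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def hours_since_fetch(stream: dict, now: datetime) -> float:
    """Hours elapsed between the stream's last fetch and `now`."""
    return (now - parse_timestamp(stream["last_fetch"])).total_seconds() / 3600


# Trimmed entries from the example response above
streams = [
    {"hashtag": "python", "last_fetch": "2026-03-03T15:30:00Z"},
    {"hashtag": "django", "last_fetch": "2026-03-03T14:45:00Z"},
]

# A fixed reference time keeps the example reproducible;
# in practice use datetime.now(timezone.utc)
now = datetime(2026, 3, 3, 15, 45, tzinfo=timezone.utc)

for stream in streams:
    print(f"#{stream['hashtag']}: fetched {hours_since_fetch(stream, now):.2f}h ago")
```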

Examples¶

Python¶

import asyncio

import httpx

api_key = "your-api-key"
base_url = "http://localhost:8000/api/v1"
headers = {"X-API-Key": api_key}


async def fetch_json(client, path):
    """GET a statistics endpoint and return the decoded JSON body."""
    response = await client.get(f"{base_url}{path}", headers=headers)
    response.raise_for_status()  # fail loudly on 4xx/5xx, e.g. a bad API key
    return response.json()


async def main():
    # Reuse one client (and its connection pool) for all three requests
    async with httpx.AsyncClient() as client:
        # Get overall statistics
        stats = await fetch_json(client, "/stats")
        print(f"Total posts: {stats['total_posts']}")
        print(f"Approved: {stats['approved_posts']}")
        print(f"Approval rate: {stats['approval_rate']:.1f}%")
        print(f"Pending export: {stats['queue_status']['pending']}")

        # Get post statistics
        post_stats = await fetch_json(client, "/stats/posts")
        print(f"Average spam score: {post_stats['average_spam_score']:.1f}")
        print(f"Posts fetched today: {post_stats['posts_fetched_today']}")
        print(f"Posts approved today: {post_stats['posts_approved_today']}")

        # Get stream statistics
        stream_stats = await fetch_json(client, "/stats/streams")
        print(f"Active streams: {stream_stats['active_streams']}")
        for stream in stream_stats['streams']:
            print(f"  #{stream['hashtag']}: {stream['post_count']} posts ({stream['approval_rate']:.1f}% approved)")


# `await` is only valid inside a coroutine, so run everything through asyncio
asyncio.run(main())

JavaScript¶

const apiKey = "your-api-key";
const baseUrl = "http://localhost:8000/api/v1";
const headers = { "X-API-Key": apiKey };

// Overall statistics
const stats = await fetch(`${baseUrl}/stats`, { headers })
  .then(r => r.json());

console.log(`Total posts: ${stats.total_posts}`);
console.log(`Approval rate: ${stats.approval_rate.toFixed(1)}%`);
console.log(`Pending export: ${stats.queue_status.pending}`);

// Post statistics
const postStats = await fetch(`${baseUrl}/stats/posts`, { headers })
  .then(r => r.json());

console.log(`Average spam score: ${postStats.average_spam_score.toFixed(1)}`);
console.log(`Reviewed today: ${postStats.posts_reviewed_today}`);

// Stream statistics
const streamStats = await fetch(`${baseUrl}/stats/streams`, { headers })
  .then(r => r.json());

console.log(`Active streams: ${streamStats.active_streams}`);
streamStats.streams.forEach(stream => {
  console.log(`  #${stream.hashtag}: ${stream.post_count} posts`);
});

cURL¶

export API_KEY="your-api-key"
export BASE_URL="http://localhost:8000/api/v1"

# Overall statistics
curl -H "X-API-Key: $API_KEY" "$BASE_URL/stats" | jq '.'

# Post statistics
curl -H "X-API-Key: $API_KEY" "$BASE_URL/stats/posts" | jq '.'

# Stream statistics
curl -H "X-API-Key: $API_KEY" "$BASE_URL/stats/streams" | jq '.'

# Just approval rate
curl -H "X-API-Key: $API_KEY" "$BASE_URL/stats" | jq '.approval_rate'

# Stream with most posts
curl -H "X-API-Key: $API_KEY" "$BASE_URL/stats/streams" | jq '.most_active_stream'

Use Cases¶

Dashboard Widgets¶

Use statistics endpoints to populate dashboard displays:

// Update dashboard cards every 30 seconds
setInterval(async () => {
  const stats = await fetch(`${baseUrl}/stats`, { headers })
    .then(r => r.json());

  updateCard('total-posts', stats.total_posts);
  updateCard('approval-rate', `${stats.approval_rate.toFixed(1)}%`);
  updateCard('pending-export', stats.queue_status.pending);
}, 30000);

Health Monitoring¶

Check system health and alert on anomalies:

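# Reuses base_url and headers from the Python example above.
# `client` is an open httpx.AsyncClient; `send_alert` is your own notification function.
async def check_health(client, send_alert):
    response = await client.get(f"{base_url}/stats", headers=headers)
    stats = response.json()

    # Alert if approval rate drops below threshold
    if stats['approval_rate'] < 50:
        send_alert("Warning: Approval rate dropped below 50%")

    # Alert if queue is backing up
    if stats['queue_status']['pending'] > 1000:
        send_alert("Warning: Export queue has more than 1000 pending posts")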

Performance Metrics¶

Track curation pipeline performance:

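# Reuses base_url and headers from the Python example above;
# `client` is an open httpx.AsyncClient.
async def report_performance(client):
    response = await client.get(f"{base_url}/stats/posts", headers=headers)
    post_stats = response.json()

    # Calculate posts per hour
    fetched = post_stats['posts_fetched_today']
    pph = fetched / 24
    print(f"Average posts per hour: {pph:.1f}")

    # Calculate review rate (guard against dividing by zero when nothing was fetched)
    review_rate = post_stats['posts_reviewed_today'] / fetched if fetched else 0.0
    print(f"Review rate: {review_rate:.1%}")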

Stream Analysis¶

Identify underperforming or overactive streams:

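from datetime import datetime, timezone


def days_since_fetch(stream):
    """Whole days since the stream's last_fetch timestamp (ISO 8601, UTC)."""
    last_fetch = datetime.fromisoformat(stream['last_fetch'].replace("Z", "+00:00"))
    return (datetime.now(timezone.utc) - last_fetch).days


# Reuses base_url and headers from the Python example above
async def analyze_streams(client):
    response = await client.get(f"{base_url}/stats/streams", headers=headers)
    stream_stats = response.json()

    # Find streams with low approval rate
    for stream in stream_stats['streams']:
        if stream['approval_rate'] < 50:
            print(f"⚠️  #{stream['hashtag']}: Low approval ({stream['approval_rate']:.0f}%)")

    # Find inactive streams
    for stream in stream_stats['streams']:
        if not stream['active']:
            print(f"💤 #{stream['hashtag']}: Inactive, last fetched {days_since_fetch(stream)} days ago")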

Interpretation Guide¶

Approval Rate¶

The percentage of reviewed posts that were approved:

70%+: High-quality stream, good filtering
50-70%: Moderate quality, some spam getting through
<50%: Stream needs attention, possible spam flooding
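The bands above can be turned into a small classifier, sketched below. The function name and labels are hypothetical, and treating exactly 70% as the top band is an assumption, since the listed ranges overlap at that value.

```python
def classify_approval_rate(rate: float) -> str:
    """Map an approval rate (0-100) onto the three bands described above."""
    if rate >= 70:
        return "high quality"
    if rate >= 50:
        return "moderate quality"
    return "needs attention"


# Approval rates taken from the example stream statistics
for hashtag, rate in [("python", 77.1), ("django", 75.3)]:
    print(f"#{hashtag}: {classify_approval_rate(rate)}")
```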

Spam Score Distribution¶

Posts spread across score ranges (0-25, 25-50, 50-75, 75-100):

Skewed toward 0-25: High-quality content, good moderation
Skewed toward 75-100: Lots of spam, needs more filtering
Balanced 25-75: Mixed content, manual review needed
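One possible way to apply this reading to the `by_spam_score` object from `/api/v1/stats/posts` is sketched below. The rule used here (compare the 0-25 bucket, the combined 25-75 buckets, and the 75-100 bucket, and report whichever is largest) is an assumption made for illustration, not a documented threshold.

```python
def describe_spam_distribution(by_spam_score: dict) -> str:
    """Label a distribution as skewed low, skewed high, or mixed (assumed rule)."""
    low_end = by_spam_score["very_low_0_to_25"]
    middle = by_spam_score["low_25_to_50"] + by_spam_score["medium_50_to_75"]
    high_end = by_spam_score["high_75_to_100"]

    if low_end > middle and low_end > high_end:
        return "skewed low"
    if high_end > middle and high_end > low_end:
        return "skewed high"
    return "mixed"


# Values copied from the example /stats/posts response
sample = {
    "very_low_0_to_25": 456,
    "low_25_to_50": 289,
    "medium_50_to_75": 178,
    "high_75_to_100": 322,
}
print(describe_spam_distribution(sample))  # prints "mixed" (middle buckets total 467)
```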

Queue Status¶

Posts progress through the export pipeline in these states:

Pending: Ready for export, waiting for consumer
Reserved: Currently being processed
Delivered: Successfully exported
Error: Failed export, needs investigation
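A short sketch of summarizing the `queue_status` object: it reports how many posts are still in flight (pending plus reserved) and what share of queue entries ended in error. The `summarize_queue` name and the returned keys are hypothetical.

```python
def summarize_queue(queue_status: dict) -> dict:
    """Summarize queue counts: total entries, in-flight posts, and error percentage."""
    total = sum(queue_status.values())
    in_flight = queue_status["pending"] + queue_status["reserved"]
    # Guard against an empty queue before dividing
    error_pct = round(100 * queue_status["error"] / total, 2) if total else 0.0
    return {"total": total, "in_flight": in_flight, "error_pct": error_pct}


# Values copied from the example response
queue_status = {"pending": 234, "reserved": 5, "delivered": 650, "error": 3}
print(summarize_queue(queue_status))
```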

Performance Tips¶

Cache Results: Statistics change slowly; cache for 5-10 minutes
Pagination: With many streams, paginate stream statistics if the API supports it
Batch Queries: Fetch summary statistics in a single request (e.g., /stats) rather than making multiple calls
Time Ranges: Consider adding date filters for historical trends (future enhancement)
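The caching tip can be sketched as a small time-to-live wrapper around whatever function fetches the statistics. `CachedStats` is a hypothetical helper; the five-minute default follows the range suggested above, and the fake clock and fetch function exist only to make the example reproducible.

```python
import time
from typing import Any, Callable, Optional


class CachedStats:
    """Cache the result of `fetch` for `ttl_seconds` (hypothetical helper)."""

    def __init__(
        self,
        fetch: Callable[[], Any],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Any = None
        self._expires_at: Optional[float] = None  # None until the first fetch

    def get(self) -> Any:
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # Cache is empty or stale: call the real fetch function again
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# Demonstration with a fake clock and a fetch function that counts its calls
calls = []
fake_now = [0.0]


def fake_fetch() -> dict:
    calls.append(1)
    return {"total_posts": 1245}


cache = CachedStats(fake_fetch, ttl_seconds=300, clock=lambda: fake_now[0])
cache.get()
cache.get()          # served from cache, no second fetch
fake_now[0] = 301.0  # move past the TTL
cache.get()          # cache expired, fetches again
print(len(calls))    # prints 2
```

In real use, `fetch` would be a function that calls `/api/v1/stats` and returns the decoded JSON.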

Related APIs¶

Posts API - Query posts with filtering
Streams API - Manage streams
Dashboard - Visual statistics interface
Queue Preview - Export queue monitoring

Best Practices¶

Regular Monitoring: Check statistics daily for trends
Alert on Changes: Monitor approval rate and queue size
Stream Health: Track stream-level metrics for inactive streams
Feedback Loop: Use statistics to adjust filters and detection rules
Archive History: Store periodic snapshots for trend analysis
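For the snapshot practice, one lightweight option is to append each statistics response to a JSON Lines file. The sketch below assumes that storage format and uses hypothetical helper names; the second snapshot is made-up sample data so that there are two entries to read back.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_snapshot(path: Path, stats: dict, taken_at: datetime) -> None:
    """Append one statistics snapshot as a single JSON line."""
    record = {"taken_at": taken_at.isoformat(), "stats": stats}
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def load_snapshots(path: Path) -> list[dict]:
    """Read all snapshots back, oldest first."""
    with path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


# Write to a temporary directory so the example leaves no files behind in the project
archive = Path(tempfile.mkdtemp()) / "stats_snapshots.jsonl"
append_snapshot(
    archive,
    {"total_posts": 1245, "approval_rate": 71.5},
    datetime(2026, 3, 3, 15, 45, tzinfo=timezone.utc),
)
append_snapshot(
    archive,
    {"total_posts": 1300, "approval_rate": 72.0},  # invented follow-up values
    datetime(2026, 3, 4, 15, 45, tzinfo=timezone.utc),
)

history = load_snapshots(archive)
growth = history[-1]["stats"]["total_posts"] - history[0]["stats"]["total_posts"]
print(f"{len(history)} snapshots, {growth} new posts between first and last")
```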
