Platform Architecture

Gloria AI processes real-time news from curated social media sources through a multi-stage AI pipeline, delivering filtered and enriched content via APIs, bots, and web interfaces.


News Processing Pipeline

Each incoming item passes through the following stages in order. Items can be rejected at multiple points, ensuring only high-quality, relevant content reaches end users.

1. Ingestion

Tweets are scraped from curated Twitter lists via Apify and inserted into the database. Basic filters remove items that are too short or contain low-signal keywords before entering the pipeline.
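The basic filters can be sketched as a simple predicate. The length threshold and the low-signal keyword list below are illustrative assumptions, not Gloria's actual values:

```python
# Sketch of the pre-pipeline filters. MIN_LENGTH and LOW_SIGNAL_KEYWORDS
# are hypothetical; the real system has its own thresholds and lists.
MIN_LENGTH = 40
LOW_SIGNAL_KEYWORDS = {"giveaway", "airdrop", "follow me"}

def passes_basic_filters(text: str) -> bool:
    """Return True if a scraped tweet should enter the pipeline."""
    if len(text.strip()) < MIN_LENGTH:
        return False
    lowered = text.lower()
    return not any(kw in lowered for kw in LOW_SIGNAL_KEYWORDS)
```

Items rejected here never consume LLM calls, which keeps the expensive downstream stages focused on plausible news.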

2. Summarization

An LLM generates a headline and context summary, and extracts key information from the raw tweet and any linked content.


3. Librarian (Deduplication)

An LLM compares each item against recent similar items found via embedding similarity and keyword search. Duplicates are identified and removed to prevent redundant coverage. Only genuine story evolution (e.g., rumor to confirmation) passes through.

4. Entity Detection

An LLM extracts structured entities from the item, including tokens/coins, people, organizations, and protocols.
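The structured output might take a shape like the one below. The field names are assumptions for illustration, not Gloria's actual schema:

```python
from dataclasses import dataclass, field

# Illustrative container for the entities the LLM extracts; the field
# names mirror the categories listed above but are not the real schema.
@dataclass
class ExtractedEntities:
    tokens: list[str] = field(default_factory=list)        # e.g. coins/tickers
    people: list[str] = field(default_factory=list)
    organizations: list[str] = field(default_factory=list)
    protocols: list[str] = field(default_factory=list)
```

Structured entities make items queryable downstream (e.g. "all news mentioning a given token") without re-parsing free text.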

5. Feed Category Classification

A hybrid classifier (keyword scoring + LLM classification) assigns items to one or more of the 19 feed categories. Each category has its own keywords, examples, and scoring thresholds.
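The keyword-scoring half of the hybrid can be sketched like this. The categories, keyword weights, and threshold are invented for the example; borderline items would be escalated to the LLM classifier, which is out of scope here:

```python
# Illustrative keyword scoring for two of the feed categories. The real
# system has 19 categories, each with its own keywords and thresholds.
CATEGORY_KEYWORDS = {
    "crypto": {"bitcoin": 2, "ethereum": 2, "defi": 1},
    "macro": {"fed": 2, "inflation": 2, "gdp": 1},
}

def keyword_scores(text: str) -> dict[str, int]:
    """Sum the weights of matched keywords per category."""
    lowered = text.lower()
    return {
        category: sum(w for kw, w in kws.items() if kw in lowered)
        for category, kws in CATEGORY_KEYWORDS.items()
    }

def assign_categories(text: str, threshold: int = 2) -> list[str]:
    """Categories whose score meets the threshold; an item may match several."""
    return [c for c, s in keyword_scores(text).items() if s >= threshold]
```

Because an item can clear the threshold in more than one category, a single story (e.g. a Fed decision moving Bitcoin) can appear in multiple feeds.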

6. Curator

An LLM evaluates newsworthiness on a per-category basis. Each category has its own curation rules that define what qualifies as newsworthy for that specific topic. Items that do not meet the threshold are rejected.
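The per-category threshold check might look like the sketch below, where the LLM's output is reduced to a numeric rating. The rule text, score scale, and thresholds are all assumptions for illustration:

```python
# Hypothetical per-category curation rules; each real category defines
# its own rules and threshold. llm_score stands in for the LLM's rating.
CURATION_RULES = {
    "crypto": {"min_score": 7, "rules": "price moves, protocol launches, exploits"},
    "macro": {"min_score": 6, "rules": "central bank decisions, CPI/GDP prints"},
}

def is_newsworthy(category: str, llm_score: int) -> bool:
    """Accept the item only if the LLM's 0-10 rating (assumed scale)
    meets the category's threshold."""
    rule = CURATION_RULES.get(category)
    return rule is not None and llm_score >= rule["min_score"]
```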

7. Multi-Topic Check

An LLM classifies whether the item covers a single coherent topic or multiple unrelated topics. Multi-topic items (e.g., market roundups covering several unrelated stories) are rejected to maintain feed quality.

8. Fact-Check & Enrichment

The item's factual claims are verified, and the context is rewritten if corrections are needed. Temporal context is added to guard against LLM knowledge-cutoff errors.

9. Sentiment Analysis

An LLM-based classifier assigns a sentiment label (bullish, bearish, neutral) and confidence score to each item.
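The classifier's contract is a label plus a confidence score. The toy keyword version below is only a stand-in for the LLM call, to show the output shape; the word lists and scoring are invented:

```python
# Stand-in for the LLM sentiment classifier: returns (label, confidence)
# in the same shape the real stage would. Word lists are illustrative.
BULLISH = {"surge", "rally", "approval", "partnership"}
BEARISH = {"hack", "exploit", "crash", "lawsuit"}

def classify_sentiment(text: str) -> tuple[str, float]:
    words = set(text.lower().split())
    bull = len(words & BULLISH)
    bear = len(words & BEARISH)
    if bull == bear:
        return ("neutral", 0.5)
    label = "bullish" if bull > bear else "bearish"
    confidence = max(bull, bear) / (bull + bear)
    return (label, confidence)
```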

10. Image Detection

Relevant images are identified from the source content and associated with the item for display.


Additional Systems

Recaps

Automated news summaries are generated per category on a configurable schedule. High-frequency categories (e.g., crypto, macro) produce hourly recaps with a 12-hour lookback window. Other categories produce recaps every 8 or 24 hours. Recaps use a two-step process: batch scoring and ranking of all items in the window, followed by summary generation from the top-ranked items.
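The two-step flow can be sketched with the scoring and summarization steps stubbed out as callables. The `top_k` cutoff is an assumption; the real system presumably tunes this per category:

```python
# Sketch of the two-step recap flow: score everything in the lookback
# window, then summarize the top-ranked items. score_fn and summarize_fn
# stand in for the batch LLM scoring and summary-generation steps.
def generate_recap(items, score_fn, summarize_fn, top_k=5):
    """items: list of (item_id, text) pairs from the lookback window."""
    ranked = sorted(items, key=lambda it: score_fn(it[1]), reverse=True)
    top_texts = [text for _, text in ranked[:top_k]]
    return summarize_fn(top_texts)
```

Separating scoring from summarization means the expensive summary prompt only ever sees the handful of items that survived the ranking pass.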

Podcast Pipeline

Podcast episodes from curated channels are ingested, transcribed via Deepgram, and processed into AI-generated article summaries.

Narrative Clustering

Related news items are periodically clustered into narratives, grouping coverage of the same evolving story across multiple sources and time periods.
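A minimal clustering sketch, assuming embedding vectors per item: each item joins the first narrative whose centroid it is similar enough to, else it starts a new one. Both the greedy algorithm and the 0.8 threshold are illustrative, not Gloria's actual method:

```python
import math

def _cos(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def cluster_into_narratives(items, threshold=0.8):
    """Greedy single-pass grouping of (item_id, embedding) pairs into
    narratives. Threshold and strategy are assumptions for the sketch."""
    narratives = []  # each: {"centroid": first member's vector, "members": ids}
    for item_id, vec in items:
        for n in narratives:
            if _cos(vec, n["centroid"]) >= threshold:
                n["members"].append(item_id)
                break
        else:
            narratives.append({"centroid": vec, "members": [item_id]})
    return narratives
```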


Delivery Channels

  • REST API — Historical news data, recaps, and category management with API key authentication. See API Integration.
  • WebSocket API — Real-time push notifications for new items as they complete the pipeline.
  • x402 Micropayment API — Pay-per-request access using the Coinbase x402 protocol (USDC on Base, no API key required).
  • Telegram Bot — Automated news delivery with category subscriptions and inline commands.
  • Discord Bot — Category-based news feeds and bot commands.
  • Farcaster Miniapp — Native Farcaster integration via the Gloria Web Terminal.