Engineering Case Study

Signal Monitor: Commercial Intelligence for the Events Sector

An automated monitoring pipeline that replaces hours of manual industry tracking with a structured five-minute weekly brief. Built for the UK exhibitions and events sector, it monitors over thirty companies across website, RSS, LinkedIn, and industry news sources, extracting commercially relevant signals using AI classification and delivering a formatted intelligence report to Discord each Monday morning.

Type Engineering Case Study

Client Momentum Works

Domain AI · Commercial Intelligence · Automation

Status Live · Weekly

30+ companies monitored

4 source types per company

5 min weekly read time

Weekly Monday delivery

The Problem

In the UK exhibitions sector, the moves that matter tend to happen quietly. A competitor picks up a niche organiser. A new commercial director arrives from a rival. A venue operator announces a show format that overlaps with yours. None of this lands in a single place, and by the time it surfaces in a general news feed, the window for responding has often passed.

Fragmented sources

Relevant signals appear across company newsrooms, industry publications such as Exhibition News, RSS feeds from trade bodies, LinkedIn posts by senior executives, and events calendars. There is no single feed or database that aggregates this at a useful level of specificity.

High noise, low signal

Even when a team subscribes to all the relevant feeds, the volume of content is significant. Most of it is irrelevant: event recaps, generic marketing copy, sponsor announcements, procedural updates. Extracting the material from the noise is itself a time-consuming task.

Manual tracking does not scale

A researcher tracking thirty companies across four or five source types each has 120 to 150 distinct sources to check every week (30 × 4 = 120, 30 × 5 = 150). Sustaining that every week, without gaps, is not a realistic expectation of any individual. Coverage drifts. Sources get missed. Signals arrive late or not at all.

Time sensitivity

Signals like a new hire, a confirmed acquisition, or a market entry announcement have a short window of relevance. Intelligence that arrives late, or is buried in an unsorted inbox, has diminished value.

Aggregation alone does not solve the problem. The requirement was a system that filtered for commercial relevance, ran without manual intervention on a fixed schedule, and delivered output that needed no further processing before it could be used.

The Solution

The goal was to take humans out of the loop between source and report. That meant building ingestion, classification, and delivery as a single automated pipeline with no manual steps between them.

Automated weekly monitoring

The pipeline runs on a fixed schedule, Monday morning London time, covering all target companies in a single pass, with no manual intervention required.

Multi-source aggregation

For each company, the system pulls from multiple source types: RSS feeds, company websites, LinkedIn posts via search APIs, and events-specific content. Industry news sources are treated as global feeds and scanned for company mentions.

AI-powered signal classification

A two-stage language model pipeline handles extraction and reporting. Stage 1 identifies and categorises signals using strict criteria. Stage 2 synthesises the output into a structured weekly brief.

Structured output and archival

The final report is delivered to Discord in a readable format, with a PDF archive generated alongside it. Operational summaries and source health alerts are posted to dedicated channels.

Architecture

The system is structured as a sequential pipeline with clearly separated concerns: fetch, classify, deduplicate, report.

Company Configuration

30+ Target Companies

Organisers · Venue operators · Media brands · Industry associations · YAML configuration, new companies added without code changes

Source Types

4 Fetcher Types

RSS (feedparser) · Websites (Trafilatura + BeautifulSoup fallback) · LinkedIn (Serper API + scrape fallback) · Events feeds · Plus industry news global sources

Checksum Filtering

Content is checksummed on ingestion; unchanged content is skipped, avoiding redundant processing and unnecessary token usage

Stage 1: Signal Extraction

Structured prompt extracts signals with category, confidence, summary, commercial significance, evidence, and named entities · Low-confidence signals discarded early

Entity-Level Deduplication

Signals matched on company, category, primary entity, and time bucket; multi-source announcements collapse to a single signal · 12-week rolling persistence

Stage 2: Report Generation

Structured signals transformed into a formatted, category-based weekly brief · Synthesis and formatting only, not classification

Output Layer

Discord delivery with chunking and rate-limit handling · PDF generation via ReportLab · Operational logging and source health alerts to dedicated channels

Discord: weekly intelligence brief

Discord: operational summaries + source health

PDF: formatted archive
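
The output layer above mentions chunking for Discord delivery. Discord caps message content at 2,000 characters, so a weekly brief has to be split before posting. The following is a minimal sketch of one way to do that, assuming chunks should break on line boundaries where possible. The function name `chunk_message` is hypothetical, and rate-limit handling (waiting after an HTTP 429 response) is left out.

```python
DISCORD_LIMIT = 2000  # Discord's maximum message content length, in characters


def chunk_message(text: str, limit: int = DISCORD_LIMIT) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring line breaks."""
    chunks: list[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit has to be hard-split.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when the next line would overflow the current one.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Joining the chunks reproduces the original text exactly, so nothing is lost between the report and what is posted.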

Signal categories

Stage 1 classifies against predefined categories to ensure consistency across runs: executive appointments, mergers and acquisitions, product launches, and expansion activity. Signals that do not meet confidence thresholds are discarded before persistence.
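
As an illustration of the category and threshold check described above, the sketch below validates model output against a fixed category set and drops anything below a confidence cut-off. It rests on several assumptions: the category identifiers, the field names, a numeric confidence between 0 and 1, and the 0.7 threshold are all hypothetical, since the source does not state them.

```python
from dataclasses import dataclass

# Assumed identifiers for the four categories named in the text.
CATEGORIES = {
    "executive_appointment",
    "merger_acquisition",
    "product_launch",
    "expansion",
}
MIN_CONFIDENCE = 0.7  # assumed threshold; the source does not give a value


@dataclass(frozen=True)
class Signal:
    company: str
    category: str
    confidence: float
    summary: str
    significance: str
    evidence: str
    entities: tuple[str, ...] = ()


def keep_signals(raw: list[dict], min_confidence: float = MIN_CONFIDENCE) -> list[Signal]:
    """Keep only items with a known category and sufficient numeric confidence."""
    kept = []
    for item in raw:
        if item.get("category") not in CATEGORIES:
            continue  # unknown category: discard rather than guess
        try:
            confidence = float(item.get("confidence", 0.0))
        except (TypeError, ValueError):
            continue  # malformed confidence: discard
        if confidence < min_confidence:
            continue
        kept.append(Signal(
            company=item.get("company", ""),
            category=item["category"],
            confidence=confidence,
            summary=item.get("summary", ""),
            significance=item.get("significance", ""),
            evidence=item.get("evidence", ""),
            entities=tuple(item.get("entities", ())),
        ))
    return kept
```

Rejecting unknown categories outright, instead of mapping them to the nearest match, is one way to keep the taxonomy stable from run to run.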

Scheduling and operation

The pipeline runs on a weekly cadence using Python’s schedule library, managed in production via system services. A dry-run mode supports safe validation before deploying configuration changes. New companies are added via YAML without any code changes.
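
The production system uses the schedule library for this. The sketch below does not; it shows only the underlying calculation with the standard library, to make the "Monday morning, London time" rule concrete. The 07:00 run time and the function name `next_monday_run` are assumptions.

```python
from datetime import datetime, time, timedelta, tzinfo

RUN_AT = time(7, 0)  # assumed run time; the source says only "Monday morning"


def next_monday_run(now: datetime, tz: tzinfo, run_at: time = RUN_AT) -> datetime:
    """Return the next Monday at `run_at` in `tz` that is strictly after `now`.

    `now` must be timezone-aware.
    """
    local_now = now.astimezone(tz)
    days_ahead = (0 - local_now.weekday()) % 7  # Monday is weekday 0
    run_date = local_now.date() + timedelta(days=days_ahead)
    candidate = datetime.combine(run_date, run_at, tzinfo=tz)
    if candidate <= local_now:
        # Already at or past this Monday's slot: move to the following week.
        candidate = datetime.combine(run_date + timedelta(days=7), run_at, tzinfo=tz)
    return candidate
```

Passing `zoneinfo.ZoneInfo("Europe/London")` as `tz` keeps the run at the same wall-clock time across the GMT/BST change.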

AI Integration

Language models are used selectively, only where deterministic approaches break down. The signals targeted do not follow predictable linguistic patterns and require contextual judgement; a rules-based classifier would be brittle and expensive to maintain.

Stage 1: Signal extraction

The extraction layer is designed for high selectivity. The prompt enforces strict criteria: signals must be recent, commercially relevant, and actionable. Low-confidence outputs are discarded before persistence. Global industry news sources are processed with company attribution logic, allowing a single feed to contribute signals across multiple monitored organisations.
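
The company attribution step for global feeds could work along the following lines. This is a sketch of simple alias matching, not the system's actual logic, which the source does not describe. The company names and the alias table are placeholders; in practice the aliases would presumably come from the YAML configuration.

```python
import re

# Placeholder alias table; names are illustrative, not real monitored companies.
COMPANY_ALIASES = {
    "Example Events Group": ["Example Events Group", "Example Events"],
    "Sample Venues": ["Sample Venues", "Sample Venues Ltd"],
}


def attribute_companies(text: str, aliases: dict[str, list[str]] = COMPANY_ALIASES) -> list[str]:
    """Return the monitored companies mentioned in `text`.

    Matching is case-insensitive and anchored on word boundaries, so an alias
    embedded inside a longer word does not count as a mention.
    """
    found = []
    for company, names in aliases.items():
        pattern = r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(company)
    return found
```

A single article can therefore be routed to several companies, each of which then goes through Stage 1 extraction with its own context.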

Stage 2: Report generation

Stage 2 operates on structured data, not raw content. It performs formatting and synthesis only, presenting signals coherently within the weekly brief format, not making classification decisions. This separation keeps each stage’s responsibility narrow and its output predictable.

A multi-provider fallback chain ensures resilience. Providers are selected based on cost, performance, and output reliability; the system degrades gracefully rather than failing silently.
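
A fallback chain of this kind can be sketched as an ordered list of callables tried in turn. The provider signature (prompt in, text out), the names, and the exception class are hypothetical; the source does not name the providers or the selection logic.

```python
from typing import Callable, Sequence

Provider = Callable[[str], str]  # hypothetical signature: prompt in, completion out


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the chain has failed."""


def complete_with_fallback(
    prompt: str, providers: Sequence[tuple[str, Provider]]
) -> tuple[str, str]:
    """Try each provider in order; return (provider_name, output) from the first success."""
    errors = []
    for name, call in providers:
        try:
            output = call(prompt)
        except Exception as exc:  # broad catch is deliberate in this sketch
            errors.append(f"{name}: {exc}")
            continue
        if output.strip():
            return name, output
        errors.append(f"{name}: empty response")
    # Nothing worked: fail loudly with every error, never silently.
    raise AllProvidersFailed("; ".join(errors))
```

Returning the provider name alongside the output lets the operational summary record which provider served each run.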

The Journey

The current architecture did not arrive fully formed. Each phase fixed a failure mode in what came before.

Phase 1

Manual monitoring

Manual review of company websites and trade publications. Output depended on who had time that week, which meant inconsistent cadence and gaps in coverage.

Phase 2

Feed aggregation

RSS feeds and newsletters cut the navigation overhead. But the volume of content grew faster than the ability to filter it: more to read, not more to act on.

Phase 3

Web scraping

Scraping extended coverage beyond what feeds provided. Page structure varies, access constraints are real, and getting stable extraction working took iteration. Ingestion eventually settled.

Phase 4

AI classification

AI classification moved the bottleneck. Human review became the exception, only needed when the model flagged low-confidence signals or a source went dark. Coverage across all thirty companies became consistent for the first time.

Phase 5

Accuracy and efficiency refinements

Entity-level deduplication collapsed multi-source announcements into single signals. Checksum-based filtering meant unchanged content skipped processing entirely, cutting token usage and run time.
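
The checksum-based filtering mentioned above can be illustrated with a short sketch. The choice of SHA-256, the whitespace normalisation, and the function names are assumptions; the source says only that content is checksummed and unchanged content is skipped.

```python
import hashlib


def content_checksum(text: str) -> str:
    """SHA-256 of the text with whitespace collapsed, so trivial reflows are not treated as changes."""
    normalised = " ".join(text.split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def has_changed(source_id: str, text: str, seen: dict[str, str]) -> bool:
    """Return True, and record the new checksum, if the content differs from the last run."""
    digest = content_checksum(text)
    if seen.get(source_id) == digest:
        return False  # unchanged: skip before any tokens are spent
    seen[source_id] = digest
    return True
```

Here `seen` stands in for whatever persistent store holds the previous run's checksums.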

Challenges and Trade-offs

Challenge	Approach
Signal precision versus coverage. Aggressive extraction increases recall but reduces reliability.	The system prioritises high-confidence signals, accepting that some lower-confidence content is excluded. Precision over recall.
Source reliability. Feeds and websites change unpredictably, creating silent coverage gaps.	Failure tracking surfaces gaps early via operational summaries posted to a dedicated Discord channel after each run.
LinkedIn constraints. Direct API access is heavily restricted.	Indirect retrieval via Serper search API with a scrape fallback, introducing dependency on third-party indexing latency.
LLM cost versus quality. Higher-capability models improve extraction but increase operating cost.	Multi-provider fallback chain balances cost and output quality, with providers selected based on performance and reliability per task type.
Batch scheduling. Weekly cadence limits recovery from a failed run to the next scheduled execution.	Weekly execution simplifies operation and monitoring. The trade-off is accepted: the value window for most signals spans days, not hours.
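
The source reliability row above relies on failures being reported after every run. A minimal sketch of such a summary follows, assuming each fetch produces a small result record; the `FetchResult` fields and the message wording are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FetchResult:
    source_id: str
    ok: bool
    items: int = 0
    error: str = ""


def health_summary(results: list[FetchResult]) -> str:
    """Format a run summary that lists failed and empty sources explicitly."""
    failed = [r for r in results if not r.ok]
    empty = [r for r in results if r.ok and r.items == 0]
    lines = [f"Run summary: {len(results) - len(failed)}/{len(results)} sources fetched"]
    for r in failed:
        lines.append(f"FAILED {r.source_id}: {r.error or 'unknown error'}")
    for r in empty:
        # A source that responds but yields nothing is often an early sign of a layout change.
        lines.append(f"EMPTY {r.source_id}: fetched but returned no items")
    return "\n".join(lines)
```

Listing empty sources separately from failed ones matters because a feed that responds but returns nothing would otherwise count as healthy.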

Impact

Signal Monitor runs in production for Momentum Works. Each Monday morning, the pipeline processes over thirty companies across four source types and posts a structured intelligence brief to Discord with no human involved between source pull and report delivery.

What previously took several hours of manual checking takes the pipeline minutes. The brief itself takes under five minutes to read. Teams doing this work by hand were not keeping up with thirty companies at any consistent frequency; now coverage runs every week without gaps.

The rolling deduplication store means the same announcement does not appear twice across consecutive weeks. Archived reports are available for audit. When a source fails, the failure surfaces in a dedicated Discord channel the same morning, rather than being discovered later when someone notices that a company has gone quiet.
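
The deduplication store can be sketched as a keyed lookup with a retention window. The key fields (company, category, primary entity, time bucket) and the 12-week retention come from the architecture description; the 28-day bucket width, the entity normalisation, and the in-memory dictionary are assumptions made for illustration.

```python
from datetime import date, timedelta

RETENTION = timedelta(weeks=12)  # the 12-week rolling persistence described in the source
BUCKET_DAYS = 28                 # assumed bucket width; the source does not state it


def dedup_key(company: str, category: str, primary_entity: str, seen_on: date) -> tuple:
    """Key on company, category, normalised entity, and a fixed-width time bucket."""
    entity = " ".join(primary_entity.lower().split())
    bucket = seen_on.toordinal() // BUCKET_DAYS
    return (company.lower(), category, entity, bucket)


class SignalStore:
    """In-memory stand-in for the persistent store: maps dedup key to first-seen date."""

    def __init__(self) -> None:
        self._seen: dict[tuple, date] = {}

    def is_new(self, company: str, category: str, primary_entity: str, seen_on: date) -> bool:
        """Record the signal and return True only the first time its key is seen."""
        self._prune(seen_on)
        key = dedup_key(company, category, primary_entity, seen_on)
        if key in self._seen:
            return False
        self._seen[key] = seen_on
        return True

    def _prune(self, today: date) -> None:
        """Drop entries that have aged out of the retention window."""
        cutoff = today - RETENTION
        self._seen = {k: d for k, d in self._seen.items() if d >= cutoff}
```

Fixed-width buckets are simple but can split one announcement across a bucket boundary; a sliding window keyed on first-seen date would avoid that at the cost of a slightly more involved lookup.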

Future Enhancements

Event-to-CRM integration. Extending the events capture pipeline to push structured event data into a CRM via API, enabling agencies to automatically track upcoming shows, align outreach with event timelines, and embed market activity directly into commercial workflows.
Historical trend analysis. Extending the signal store to support time-series insights across companies and categories.
Entity resolution. Normalising entity references across signals to improve cross-source aggregation accuracy.
Alert prioritisation. Introducing urgency tiers to surface high-impact signals (such as M&A activity) above the standard weekly cadence.
Feedback loops. Capturing user feedback to refine extraction quality and signal taxonomy over time.
Dashboard. Providing a navigable interface for exploring historical signals beyond weekly reports.
Additional sources. Expanding ingestion to include podcasts, press syndication feeds, and investor relations content.

Tech Stack

Signal Monitor: Intelligence Pipeline

Python · feedparser · Trafilatura · BeautifulSoup · Serper API · LLM provider cascade · ReportLab · Discord webhooks · schedule


Get in touch

Need a system like this?

The underlying approach, pulling from fragmented sources, classifying for commercial relevance, and delivering on a fixed schedule, applies well beyond the events sector. Any market with enough public signal and not enough time to watch it is a candidate.

If you are tracking competitors manually, or have a market you cannot follow closely enough, get in touch.
