Media & News Extraction System

Media Data AI - 2020

Riset.ai's solution for extracting news and articles from epapers and live television. The product is divided into two categories:

  • Epaper Extraction
  • Live TV News Extraction

For Epaper Extraction, our engines detect the location of each article on every page of a given paper edition and extract the articles into text and image attachments.

As for Live TV News Extraction, we extract the running text at the bottom of the video footage and also capture spoken news.

Our solution suits those who want to gather as much information as possible about a particular news topic, person, or entity. Because our engine outputs machine-encoded text, the results can be further processed or ingested by other pipelines and use cases, such as data analytics, price monitoring, sentiment analysis, and market analysis.
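As a rough illustration of how machine-encoded output could feed a downstream pipeline, the sketch below filters extracted text by topic and counts matches per source. The record layout, the source labels, and the function names `filter_by_topic` and `count_by_source` are assumptions made for this example, not the engine's actual output format or API.

```python
from collections import Counter

# Hypothetical records: (source, extracted_text) pairs.
# The engine's real output format is not specified in this document.
records = [
    ("epaper", "Rice prices rise in Jakarta markets"),
    ("tv_running_text", "Governor visits flood area"),
    ("tv_speech", "Analysts expect rice prices to stabilise next month"),
]


def filter_by_topic(records, keyword):
    """Keep records whose text mentions the keyword, ignoring case."""
    needle = keyword.casefold()
    return [(src, text) for src, text in records if needle in text.casefold()]


def count_by_source(records):
    """Count how many records came from each source."""
    return Counter(src for src, _ in records)


matches = filter_by_topic(records, "rice prices")
print(count_by_source(matches))
```

A plain substring match is used here only to keep the sketch short; a real pipeline would more likely use entity recognition or a search index.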

Key Features:

  1. Article Detection: detects and segments articles while retaining as much information about each article as possible, such as Title, Continued-to-Page, Article Writer, etc.
  2. Optical Character Recognition (OCR): converts text in image form (bitmap) into machine-encoded text
  3. Speech-to-Text (STT) or Automatic Speech Recognition (ASR): transcribes spoken news into machine-encoded text
  4. Running Text Detection: automatically detects the running text at the bottom of Live TV channels and converts it into machine-encoded text
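To make the Article Detection metadata more concrete, here is a minimal sketch of how one extracted article might be represented and serialized for other systems. The class name `ExtractedArticle`, its field names, and the JSON layout are hypothetical; only Title, Continued-to-Page, and Article Writer are taken from the feature list above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ExtractedArticle:
    """Hypothetical record for one article detected on an epaper page."""
    title: str
    text: str                                 # machine-encoded text from OCR
    page: int                                 # page where the article starts
    writer: Optional[str] = None              # Article Writer, if detected
    continued_to_page: Optional[int] = None   # Continued-to-Page, if any
    image_attachments: List[str] = field(default_factory=list)


def to_json(article: ExtractedArticle) -> str:
    """Serialize an article so other pipelines can ingest it."""
    return json.dumps(asdict(article), ensure_ascii=False)


article = ExtractedArticle(
    title="Example headline",
    text="Body text recovered by OCR.",
    page=1,
    writer="A. Writer",
    continued_to_page=7,
)
encoded = to_json(article)
decoded = json.loads(encoded)
print(encoded)
```

Optional fields default to empty values so that an article with no detected writer or continuation page can still be emitted.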