Project

Liquid News

The landscape in which society interacts with news has evolved due to the advent of the internet and modern communication platforms. Although this evolution has led to greater diversity and accessibility of news media, it has also created challenges regarding selective news coverage, bias, and fake news. This work proposes a novel news platform called Liquid News that aims to enhance people’s understanding of news by leveraging machine-learning-based analysis and semantic navigational aids. 

Background

Over the past decades, news and the way we interact with information have evolved significantly. With the rise of the internet, this evolution has brought greater accessibility and connectivity, but it has also introduced or amplified negative factors such as bias and fake news. The result is a world in which the ability to access and share information is readily available, yet truly understanding the news and global events has become far more difficult amid the rampant rise of bias and fake news.

The intended end product of Liquid News is an interface that allows users to parse news via a semantic-relational model that leverages the latent connections between news segments to build a better understanding of the news at hand. For example, Queen Elizabeth II's death was heavily covered in the news media. Coverage focused on her death and related topics such as royal succession, British history, and the monarchy's wealth. These topics relate to the Queen's death on a latent semantic-relational level and are essential to understanding its significance. However, they were covered across multiple news outlets and at varying depths, making these latent connections hard to identify and understand. Liquid News aims to build an interface that uses machine learning to identify the key topics and to parse, group, and relate news segments from many news sources, with the goal of uncovering these latent relationships and promoting a greater understanding of the news and media around us.

System Overview

Importer Module

The first module in the Liquid News backend pipeline is the importer. This module retrieves metadata for the channels defined in the MongoDB database. For each channel in the channel collection, the module makes a REST API call to the YouTube Data API, requesting metadata for the max_results most recent videos, where max_results is a user-defined argument.
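
As a concrete illustration, a minimal importer sketch might look like the following, assuming the official google-api-python-client and pymongo libraries; the database and collection names are illustrative, not confirmed by the design above.

    from googleapiclient.discovery import build
    from pymongo import MongoClient

    def import_recent_metadata(api_key, mongo_uri, max_results):
        # Connect to the YouTube Data API and the MongoDB database.
        youtube = build("youtube", "v3", developerKey=api_key)
        db = MongoClient(mongo_uri)["liquid_news"]  # database name is illustrative

        for channel in db["channels"].find():
            # Request metadata for the channel's max_results most recent videos.
            response = youtube.search().list(
                part="snippet",
                channelId=channel["channel_id"],
                order="date",
                type="video",
                maxResults=max_results,
            ).execute()
            for item in response.get("items", []):
                # Upsert each video's metadata into the metadata collection.
                db["metadata"].update_one(
                    {"video_id": item["id"]["videoId"]},
                    {"$set": {"snippet": item["snippet"]}},
                    upsert=True,
                )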

Transcription Module

The downstream modules in the pipeline rely on accurate transcription and captioning of every video. Initially, the transcription module relied on the captions provided by the YouTube Data API to generate transcriptions and sentence-level captioning; however, this approach failed because the captions were unreliable and error-prone. Instead, the transcription module uses the state-of-the-art speech-to-text model provided by AssemblyAI, which builds on the well-known Conformer architecture.
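
A minimal sketch of this step using the AssemblyAI Python SDK is shown below; the model configuration is not specified above, so the SDK defaults are assumed.

    import assemblyai as aai

    aai.settings.api_key = "YOUR_ASSEMBLYAI_API_KEY"  # placeholder

    def transcribe(audio_url):
        # Submit the video's audio to AssemblyAI and wait for the transcript.
        transcript = aai.Transcriber().transcribe(audio_url)
        if transcript.status == aai.TranscriptStatus.error:
            raise RuntimeError(transcript.error)
        # Return sentence-level captions for the downstream modules.
        return [sentence.text for sentence in transcript.get_sentences()]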

Topic Extraction & Assignment

The topic module is responsible for assigning a topic to each video in the metadata collection. The set of assignable topics is the smallest clustering that expresses the latent grouping of all the video titles. We leverage the GPT-4 large language model to identify this set of topics. Rather than building a new model or fine-tuning an existing language model, the topic module uses prompting as a form of few-shot learning to solve the clustering task; recent work signals that few-shot learning is more robust than fine-tuning. For the videos in the collection, we formulate the following problem in natural language.

Given a collection of videos V = [v_1, ..., v_n], assign each video v_i to a cluster from an unknown collection C = [c_1, ..., c_k], where |C| is not known a priori. The solution to this task is k = |C| and a |V| × |C| binary matrix A, such that A_ij indicates whether video v_i belongs to cluster c_j.
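
A hedged sketch of how such a prompt could be issued with the OpenAI Python SDK follows; the prompt wording and output format are assumptions for illustration, not the exact prompts used by the topic module.

    from openai import OpenAI

    client = OpenAI()

    def cluster_titles(titles):
        # Formulate the clustering task as a single natural-language prompt.
        numbered = "\n".join(f"{i}: {t}" for i, t in enumerate(titles))
        prompt = (
            "Group the following video titles into the smallest set of topics "
            "that covers all of them. Answer with one line per title in the "
            "form '<index>: <topic>'.\n\n" + numbered
        )
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,
        )
        # The reply encodes the binary assignment matrix A implicitly: row i
        # has a 1 in the column of the topic named for title i.
        return response.choices[0].message.content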

Segmentation Module 

The segmentation module solves the following segmentation task:

Given a video V = ⟨T, S = ∅⟩ and a collection of consecutive sentences T = [T_1, ..., T_N], split V into a collection of disjoint segments S = [S_1, ..., S_M]. Each segment S_i = ⟨T^(i), l_i⟩ contains a contiguous subsequence of sentences T^(i) ⊆ T and a subtopic label l_i that describes the topic of those sentences.

Following the intuition from the topic module and leveraging the task-agnostic ability of GPT-4, we define the segmentation task as a chain of natural language prompts. 
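
For illustration, the chain could be decomposed into a splitting prompt followed by a labeling prompt; this two-step structure and the prompt text are assumptions about how the chain might be organized.

    def build_segmentation_prompts(sentences):
        # Step 1: ask the model for segment boundaries over numbered sentences.
        numbered = "\n".join(f"{i}: {s}" for i, s in enumerate(sentences))
        split_prompt = (
            "Split the numbered sentences below into contiguous, disjoint "
            "segments. Answer with the start index of each segment.\n\n"
            + numbered
        )
        # Step 2: ask the model to label each segment with a short subtopic.
        label_prompt = (
            "For each segment from the previous step, give a short subtopic "
            "label that describes its sentences."
        )
        return [split_prompt, label_prompt]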

Subtopic Extraction Module 

The subtopic module solves the clustering task at the segment level. This requires solving the topic module's clustering task once per topic: for each topic T_i, identify the subtopics S^(i) = [S^(i)_1, ..., S^(i)_k] among the segments assigned to that topic.
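
Structurally, this amounts to reusing the prompt-based clustering once per topic, as in the small sketch below; the helper name and the assumption that the topic module's clustering function can be reused unchanged are illustrative.

    def extract_subtopics(segments_by_topic, cluster_fn):
        # segments_by_topic maps each topic to the texts of its segments;
        # cluster_fn is a prompt-based clustering helper (an assumption),
        # e.g. the cluster_titles sketch from the topic module.
        return {topic: cluster_fn(texts)
                for topic, texts in segments_by_topic.items()}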

Summarize and Sort Module

The summarization module generates a bulleted summary for each video clip. The sorting module then orders video clips semantically, placing clips with similar embeddings near each other.
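
One way to realize the sorting step is a greedy nearest-neighbor ordering over clip embeddings, sketched below. The greedy strategy is an assumption; the module only requires that clips with similar embeddings end up near each other.

    import numpy as np

    def semantic_sort(embeddings):
        # Normalize rows so dot products equal cosine similarities.
        unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        order = [0]
        remaining = set(range(1, len(unit)))
        while remaining:
            last = unit[order[-1]]
            # Greedily append the most similar clip not yet placed.
            nearest = max(remaining, key=lambda i: float(unit[i] @ last))
            order.append(nearest)
            remaining.remove(nearest)
        return order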