Francisco Romero - Teaching

Teaching

CS 8803: Large Scale & Real-Time Visual Analysis

Monday & Wednesday, 12:30 pm–1:45 pm

Howey Physics | Room S107

Course Description

The rapid growth of visual data—from video streaming, augmented reality, autonomous systems, surveillance, and wearable devices—has created unprecedented challenges in large-scale and real-time visual analysis. Effectively processing this data requires a combination of cloud-based infrastructure and edge computing, balancing scalability with low-latency decision-making. You will explore industry and academic advancements in cloud and edge-based visual processing, covering topics such as real-time inference, distributed computing, compression techniques, and AI-driven analytics. You will read and lead discussions on key papers and collaborate on a semester-long project in groups of 2-3, tackling real-world challenges in scalable and real-time visual analysis.

Prior Knowledge

Familiarity with computer systems and databases at an undergraduate level
Knowledge of AI at the application level and of systems for AI is recommended but not required

Course Goals

Understand recent developments at the intersection of AI & visual analysis—both in the cloud and on the edge
Read, analyze, and discuss recent developments from academic papers in visual analysis
Work on a term-long research project: from problem formulation to final presentation

Tentative Course Structure

Before each class, students will submit responses to 1-2 quiz questions about the assigned readings. In each class, 1-2 students will give an overview of the assigned paper, then lead a discussion of open questions, future work, and related topics. One student will be assigned as the note-taker on a rotating basis.

Tentative Assessment Components

60%: Term project
- 5%: Proposal
- 10%: Mid-term presentation
- 10%: Mid-term report
- 15%: Final presentation
- 20%: Final report

20%: Paper presentations
15%: Quizzes (1 quiz will be dropped)
5%: Participation (note-taking, class discussion)
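
As a rough illustration of how the weights above combine, the following Python sketch computes a final percentage from component scores. It is not an official grading tool. The component names, the 0-100 score scale, and the choice to drop the lowest quiz are assumptions made for this example; the syllabus only states that one quiz will be dropped.

```python
# Weights copied from the assessment list above (percent of final grade).
# The dictionary keys are hypothetical names chosen for this sketch.
WEIGHTS = {
    "proposal": 5,
    "midterm_presentation": 10,
    "midterm_report": 10,
    "final_presentation": 15,
    "final_report": 20,
    "paper_presentations": 20,
    "quizzes": 15,
    "participation": 5,
}


def quiz_average(quiz_scores):
    """Average quiz scores (0-100) after dropping one quiz.

    Assumption: the dropped quiz is the lowest-scoring one.
    """
    if len(quiz_scores) < 2:
        raise ValueError("need at least two quizzes to drop one")
    kept = sorted(quiz_scores)[1:]  # discard the single lowest score
    return sum(kept) / len(kept)


def final_grade(scores, quiz_scores):
    """Weighted final grade in percent.

    `scores` maps every non-quiz component in WEIGHTS to a 0-100 score.
    """
    total = sum(
        WEIGHTS[name] * scores[name] for name in WEIGHTS if name != "quizzes"
    )
    total += WEIGHTS["quizzes"] * quiz_average(quiz_scores)
    return total / 100
```

For example, `final_grade(scores, [0, 100, 100])` ignores the zero under the drop-lowest assumption, so the quiz component still contributes its full 15%.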

Course Policies

Academic Integrity and Honesty

Georgia Tech aims to cultivate a community based on trust, academic integrity, and honor. Students are expected to act according to the highest ethical standards. Any student suspected of plagiarizing an assignment or presentation, or of cheating on a quiz, will be reported to the Office of Student Integrity.

To avoid plagiarism:

Attribute any words or ideas from a public source
Attribute any AI system you used and how you used it (e.g. brainstorming or rephrasing)
Don’t directly copy-paste sentences from a classmate or AI system

Accommodations for Students with Disabilities

Contact the Office of Disability Services as soon as possible to discuss your learning needs and to obtain an accommodations letter. Please also e-mail the instructor to set up a time to discuss special needs.

Inclement Weather and Digital Learning Days

For a Digital Learning Day, all class activities will be held remotely via video conferencing.

Tentative Schedule

Week 1: Introduction

Monday: Course Overview

Wednesday: How AI is Transforming Video Surveillance Analytics

Week 2: Video Query Optimization

Monday: BlazeIt: Optimizing Declarative Aggregation and Limit Queries for Neural Network-Based Video Analytics

Wednesday: FiGO: Fine-Grained Query Optimization in Video Analytics

Week 3: Video Query Optimization (continued)

Monday: Spatialyze: A Geospatial Video Analytics System with Spatial-Aware Optimizations

Wednesday: Optimizing Video Analytics with Declarative Model Relationships

Week 4: AI Inference Systems

Monday: ServerlessLLM: Low-Latency Serverless Inference for Large Language Models

Wednesday: Nexus: A GPU Cluster Engine for Accelerating DNN-Based Video Analysis

Week 5: AI Inference Systems (continued)

Monday: Orion: Interference-aware, Fine-grained GPU Sharing for ML Applications

Wednesday: Towards Efficient Large Multimodal Model Serving

Week 6: Project Proposal Presentations

Monday: Project Proposal Presentations

Wednesday: Project Proposal Presentations

Friday: Project Proposal Report Due

Week 7: Video Acceleration

Monday: Warehouse-scale video acceleration: co-design and deployment in the wild

Wednesday: vbench: Benchmarking Video Transcoding in the Cloud

Week 8: Dataset curation and labeling systems

Monday: Mixtera: A Data Plane for Foundation Model Training

Wednesday: Guest lecture

Week 9: Project Midterm Presentations

Monday: Project Midterm Presentations

Wednesday: Project Midterm Presentations

Week 10: Processing on the Edge

Monday: Moonshine: Speech Recognition for Live Transcription and Voice Commands

Wednesday: TensorFlow Lite Micro: Embedded Machine Learning for TinyML Systems

Week 11: Processing on the Edge (continued)

Monday: CLONE: Customizing LLMs for Efficient Latency-Aware Inference at the Edge

Wednesday: Pocket: ML Serving from the Edge

Week 12: VLMs and Multi-modal models

Monday: CLIP: Learning Transferable Visual Models From Natural Language Supervision

Wednesday: NVILA: Efficient Frontier Visual Language Models

Week 13: Compound AI Systems

Monday: The Shift from Models to Compound AI Systems; Towards Resource-Efficient Compound AI Systems

Wednesday: Guest lecture

Week 14: Project Final Presentations

Monday: Final Presentations

Wednesday: Final Presentations

Week 15: Project Final Presentations (continued)

Monday: Final Presentations

Wednesday: no class

Week 16: Wrap-up

Monday: Course Summary and Feedback

Wednesday: no class

Friday: Final Reports Due
