There is no required textbook for this class. However, you may find the texts and resources on this page useful during the semester.
Coding and Sample Colab Notebooks
Basic image operations
Colab Tutorial 1
Colab Tutorial 2
Colab and PyTorch
PyTorch
Basics of PyTorch
PyTorch Tutorial
Deep Learning 60 Minute Blitz with PyTorch
Textbooks
Szeliski, Computer Vision: Algorithms and Applications, 2022 (online draft)
Hartley and Zisserman, Multiple View Geometry in Computer Vision, Cambridge University Press, 2004
Forsyth and Ponce, Computer Vision: A Modern Approach, Prentice Hall, 2002
Palmer, Vision Science, MIT Press, 1999
Goodfellow, Bengio, Courville, Deep Learning, MIT Press, 2016
Mitchell, Machine Learning, McGraw-Hill, 1997
Duda, Hart and Stork, Pattern Classification (2nd Edition), Wiley-Interscience, 2000
Popular Image Datasets
Labelme: an online annotation tool to build image databases for computer vision research
OpenSurfaces: a large database of annotated surfaces created from real-world consumer photographs.
ImageNet: a large-scale image dataset for visual recognition organized by WordNet hierarchy
ADE20K Dataset: a benchmark for scene and instance segmentation, with pixelwise semantic annotations
Places Database: a scene-centric database with 205 scene categories and 2.5 million labeled images
NYU Depth Dataset v2: an RGB-D dataset of segmented indoor scenes
Microsoft COCO: a new benchmark for image recognition, segmentation and captioning
Flickr100M: 100 million Creative Commons Flickr images
Labeled Faces in the Wild: a dataset of 13,000 labeled face photographs
Human Pose Dataset: a benchmark for articulated human pose estimation
YouTube Faces DB: a face video dataset for unconstrained face recognition in videos
UCF101: an action recognition dataset of realistic action videos with 101 action categories
HMDB-51: a large human motion dataset of 51 action classes
CelebA: more than 200,000 celebrity face images with labeled attributes
FFHQ: 70,000 high-res (1024 x 1024) face images sourced from Flickr