PhD Defense: Jaechul Kim
Date: July 22, 2013
Time: 10:00 am
Place: GDC 4.816
Research Supervisor: Kristen Grauman
Title of dissertation: Region Detection and Matching for Object Recognition
Abstract:
In this thesis, I explore region detection and consider its impact on
image matching for exemplar-based object recognition. Detecting
regions is important to provide semantically meaningful spatial cues
in images. My thesis starts by detecting such meaningful regions at
both the local and object level. Then, I leverage geometric cues from the
detected regions to improve image matching for the ultimate goal of
object recognition. More specifically, my thesis considers four key
questions: 1) how can we extract distinctively-shaped local regions
that also ensure repeatability for robust matching? 2) how can
object-level shape inform bottom-up image segmentation? 3) how should
the spatial layout imposed by segmented regions influence image
matching for exemplar-based recognition? and 4) how can we exploit a
hierarchical structure spanning local- and object-level regions to
improve the accuracy and speed of image matching? I
propose novel algorithms to tackle these issues, addressing
region-based visual perception from low-level local region extraction,
to mid-level object segmentation, to high-level region-based matching
and recognition.
First, I propose a Boundary Preserving Local Region (BPLR) detector to
extract local shapes. My approach defines a novel spanning-tree based
image representation whose structure reflects shape cues combined from
multiple segmentations, which in turn provide multiple initial
hypotheses of the object boundaries. Unlike traditional local region
detectors that rely on local cues like color and texture, BPLRs
explicitly exploit the segmentation that encodes global object shape.
Thus, they respect object boundaries more robustly and reduce noisy
regions that straddle object boundaries. The resulting detector yields
a dense set of local regions that are both distinctive in shape and
repeatable for robust matching.
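To make the idea concrete, below is a minimal, hypothetical Python sketch of spanning-tree grouping in this spirit: sampled points are linked in a grid graph, edges that cross a segmentation boundary are penalized, a minimum spanning tree is built, and each seed is grouped with the points within a small tree distance. The function name bplr_like_groups, the grid step, the boundary penalty, and the radius are illustrative assumptions, not the actual BPLR implementation.

```python
# Hypothetical sketch of spanning-tree grouping in the spirit of BPLR.
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, dijkstra

def bplr_like_groups(seg_labels, step=8, boundary_penalty=50.0, radius=30.0):
    """seg_labels: 2D array of segment ids from any bottom-up segmentation."""
    h, w = seg_labels.shape
    ys, xs = np.mgrid[0:h:step, 0:w:step]
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1)   # sampled elements
    n = len(pts)
    idx = -np.ones((h, w), dtype=int)
    idx[pts[:, 0], pts[:, 1]] = np.arange(n)

    graph = lil_matrix((n, n))
    for i, (y, x) in enumerate(pts):
        for dy, dx in [(0, step), (step, 0)]:          # 4-connected grid graph
            ny, nx = y + dy, x + dx
            if ny < h and nx < w and idx[ny, nx] >= 0:
                cost = float(np.hypot(dy, dx))
                if seg_labels[y, x] != seg_labels[ny, nx]:
                    cost += boundary_penalty           # discourage crossing boundaries
                graph[i, idx[ny, nx]] = cost

    tree = minimum_spanning_tree(graph.tocsr())        # spanning-tree image representation
    dist = dijkstra(tree, directed=False, indices=np.arange(n))
    # each sampled element acts as a seed; its group = elements within a tree radius
    return [pts[dist[i] <= radius] for i in range(n)]
```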
Second, building on the strength of the BPLR regions, I develop an
approach for object-level segmentation. The key insight of the
approach is that object shapes are (at least partially) shared among
different object categories---for example, among different animals,
among different vehicles, or even among seemingly different objects.
This shape sharing phenomenon allows us to use partial shape matching
via BPLR-detected regions to predict global object shape of possibly
unfamiliar objects in new images. Unlike existing top-down methods, my
approach requires no category-specific knowledge of the object to be
segmented. In addition, because it relies on exemplar-based matching
to generate shape hypotheses, my approach overcomes the viewpoint
sensitivity of existing methods by allowing shape exemplars to span
arbitrary poses and classes.
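As a rough illustration of the shape-sharing idea, the hypothetical sketch below matches a detected test region to the nearest exemplar region from any category and projects the exemplar's full object mask into the test image using the scale and translation that align the two matched regions. The descriptor distance, single nearest-neighbor match, and axis-aligned warp are simplifying assumptions, not the actual algorithm.

```python
# Hypothetical sketch: transfer an exemplar's object mask via a matched local region.
import numpy as np
from scipy.ndimage import zoom

def project_shape_hypothesis(test_region, exemplars, test_shape):
    """test_region: dict with 'descriptor' (1D array) and 'bbox' (y0, x0, y1, x1).
    exemplars: dicts with 'descriptor', 'bbox', 'object_mask' (2D bool).
    Returns a global shape hypothesis (2D bool mask) for the test image."""
    # nearest exemplar region by descriptor distance (category-independent)
    dists = [np.linalg.norm(test_region['descriptor'] - e['descriptor']) for e in exemplars]
    ex = exemplars[int(np.argmin(dists))]

    ty0, tx0, ty1, tx1 = test_region['bbox']
    ey0, ex0, ey1, ex1 = ex['bbox']
    sy = (ty1 - ty0) / max(ey1 - ey0, 1)               # scale aligning matched regions
    sx = (tx1 - tx0) / max(ex1 - ex0, 1)

    warped = zoom(ex['object_mask'].astype(float), (sy, sx), order=0) > 0.5
    hyp = np.zeros(test_shape, dtype=bool)
    # place the warped exemplar mask so its region aligns with the test region
    oy, ox = int(ty0 - sy * ey0), int(tx0 - sx * ex0)
    y0, x0 = max(oy, 0), max(ox, 0)
    y1 = min(oy + warped.shape[0], test_shape[0])
    x1 = min(ox + warped.shape[1], test_shape[1])
    if y1 > y0 and x1 > x0:
        hyp[y0:y1, x0:x1] = warped[y0 - oy:y1 - oy, x0 - ox:x1 - ox]
    return hyp
```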
For the ultimate goal of region-based recognition, not only is it
important to detect good regions, but we must also be able to match
them reliably. Matching establishes similarity between visual
entities (images, objects or scenes), which is fundamental for visual
recognition. Thus, in the third major component of this thesis, I
explore how to leverage geometric cues of the segmented regions for
accurate image matching. To this end, I propose a segmentation-guided
local feature matching strategy, in which segmentation suggests
spatial layout among the matched local features within each region. To
encode such spatial structures, I devise a string representation whose
1D nature enables efficient computation to enforce geometric
constraints. The method is applied to exemplar-based object
classification to demonstrate the impact of my segmentation-driven
matching approach.
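The sketch below illustrates, under assumed simplifications, how a region's features can be encoded as a string and aligned with dynamic programming: features are ordered by angle around the region centroid to form a 1D sequence, and two sequences are aligned so that only order-preserving matches contribute to the score. The ordering rule, gap penalty, and similarity function are illustrative, not the exact choices in the thesis.

```python
# Hypothetical sketch: a region's features as a string, aligned by dynamic programming.
import numpy as np

def to_string(positions, descriptors):
    """Order a region's features into a 1D sequence by angle about the centroid."""
    d = positions - positions.mean(axis=0)
    order = np.argsort(np.arctan2(d[:, 1], d[:, 0]))
    return descriptors[order]

def align_strings(seq_a, seq_b, gap=0.2):
    """Needleman-Wunsch-style alignment; returns the total similarity score."""
    na, nb = len(seq_a), len(seq_b)
    dp = np.zeros((na + 1, nb + 1))
    dp[1:, 0] = -gap * np.arange(1, na + 1)
    dp[0, 1:] = -gap * np.arange(1, nb + 1)
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            sim = -np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])   # feature similarity
            dp[i, j] = max(dp[i - 1, j - 1] + sim,               # match (order-preserving)
                           dp[i - 1, j] - gap,                   # skip a feature in A
                           dp[i, j - 1] - gap)                   # skip a feature in B
    return dp[na, nb]
```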
While each individual region, by itself, provides a strong spatial
cue that constrains the geometric layout of matched features, these
regions, taken as a whole, offer another useful cue: a region
hierarchy that spans different spatial extents, from local- to
object-level regions. The last part of my thesis studies how to
exploit such a hierarchical structure to improve image matching.
To this end, I propose a deformable spatial pyramid graphical model
for image matching, which matches multiple spatial extents
simultaneously, from the entire image to grid cells to individual
pixels. The proposed pyramid model
strikes a balance between robust regularization by larger spatial
supports on the one hand and accurate localization by finer regions on
the other.
Further, the pyramid model is suitable for fast coarse-to-fine
hierarchical optimization. The method is applied to pixel label
transfer tasks for semantic image segmentation, improving upon the
state-of-the-art in both accuracy and speed.
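To convey the coarse-to-fine idea, here is a hypothetical, greatly simplified sketch: a single offset is estimated for the whole image, and each grid cell then refines its offset within a window around the parent's estimate, paying a penalty for deviating from it (pixel-level refinement would follow the same pattern). This greedy top-down pass stands in for the pyramid model's actual inference, which it does not reproduce.

```python
# Hypothetical coarse-to-fine matching over a two-level spatial pyramid.
import numpy as np

def best_offset(src, dst, ys, xs, center, search=4, reg=0.1):
    """Pick the offset near `center` minimizing descriptor cost plus deformation penalty.
    src, dst: (H, W, D) dense descriptor maps; ys, xs: pixel coords of this node."""
    h, w = dst.shape[:2]
    best, best_cost = center, np.inf
    for dy in range(center[0] - search, center[0] + search + 1):
        for dx in range(center[1] - search, center[1] + search + 1):
            ty = np.clip(ys + dy, 0, h - 1)
            tx = np.clip(xs + dx, 0, w - 1)
            cost = np.abs(src[ys, xs] - dst[ty, tx]).mean()            # matching cost
            cost += reg * (abs(dy - center[0]) + abs(dx - center[1]))  # deformation
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def coarse_to_fine_flow(src, dst, cells=4):
    h, w = src.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    root = best_offset(src, dst, ys.ravel(), xs.ravel(), (0, 0), search=8)
    flow = np.zeros((h, w, 2), dtype=int)
    for cy in range(cells):                                            # grid-cell level
        for cx in range(cells):
            sl = (slice(cy * h // cells, (cy + 1) * h // cells),
                  slice(cx * w // cells, (cx + 1) * w // cells))
            cell = best_offset(src, dst, ys[sl].ravel(), xs[sl].ravel(), root)
            flow[sl] = cell                        # pixel-level refinement omitted for brevity
    return flow
```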
Throughout, I provide extensive evaluations on challenging benchmark
datasets, validating the effectiveness of my approach. The results
demonstrate the promise of region-based visual perception. In
addition, all my code for the local shape detector, object
segmentation, and image matching is publicly available, which I hope
will serve as a useful addition to vision researchers’ toolbox.