Learning a Tree of Metrics
with Disjoint Visual Features
Sung Ju Hwang, Kristen Grauman and Fei Sha
The University of Texas at Austin
University of Southern California
Motivation
A semantic taxonomy is a generalization-specialization relation graph among categories, that can related each visual feature with different semantic granularity.Idea
Leveraging parent-child relationships in a given semantic taxonomy, we learn a tree of metrics (ToM) that captures compact, discriminative visual features for each level.Approach
Tree of Metrics (ToM)
Given a tree T, we want to learn a metric M_t for each internal (superclass) node to discriminate between its subclasses. -> minimize a distance metric learning objective with large-margin constraints [1].Regularization terms to learn compact discriminative metrics
Local level (sparsity regularization) | Global level (disjoint regularization) |
Apply trace-norm based regularization at each node. Promotes competition between features in a single metric. | Apply disjoint regularization between nodes. Promotes competition for the features between metrics. |
Optimization problem
The above optimization problem is convex, but nonsmooth due to the large margin constraints. We optimize it using a subgradient solver similar to the one in [2].
Classification with Tree of Metrics
Per-node classification: apply k-nearest neighbor classification on the learned metrics. Hierarchical classification: perform series of per-node classification to determine the next node, recursively from the root to the leaf.Proof of Concept
Synthetic dataset & category hierarchy | ToM | ToM + sparsity | Tom + disjoint |
Visual recognition experiments
Datasets
Animals with Attributes (AWA) | Imagenet Vehicle | |
~30K images | ~26K images | |
50 classes | 20 classes |
Hierarchical multi-class classification accuracy
AWA-ATTR (Predicted Attributes) | Vehicle-20 (PCA projected) | |||
---|---|---|---|---|
Method | Correct label | Semantic similarity | Correct label | Semantic similarity |
Euclidean | 32.4 | 53.6 | 28.5 | 56.1 |
Global LMNN [1] | 32.5 | 53.9 | 29.7 | 53.6 |
Multi-metric LMNN [1] | 32.3 | 53.7 | 30.0 | 57.9 |
ToM w/o disjoint sparsity | 36.8 | 58.4 | 31.2 | 60.7 |
ToM + sparsity | 37.6 | 59.3 | 32.1 | 62.7 |
ToM + disjoint | 38.3 | 59.7 | 32.8 | 63.0 |
Attributes selected from the AWA-ATTR dataset
Related
Comparison with orthogonal transfer [3].Orthogonal Transfer | Tree of Metrics |
---|---|
Orthogonality does not necessarily imply disjoint features | Learns true disjoint features |
The convexity of the regularizer depends critically on tuning the weight matrix K | Regularizers are convex |
References
[1] K. Q. Weinberger, J. Blitzer and L. K. Saul, Distance Metric Learning for Large Margin Nearest Neighbor Classification, NIPS, 2006[2] Y. Ying, K. Huang and C. Campbell, Sparse Metric Learning via Smooth Optimization, NIPS, 2009
[3] D. Zhou, L. Xiao and M. Wu, Hierarchical Classification via Orthogonal Transfer, ICML, 2011
Source codes and data
[tom.tar.gz] (183Mb) (v0.9) Contains matlab codes for ToM, the data, and other utilities for taxonomiesC++ implementation with matlab interface (v1.0) will be released soon.
Publication
Learning a Tree of Metrics with Disjoint Visual FeaturesSung Ju Hwang, Fei Sha and Kristen Grauman
Advances in Neural Information Processing System (NIPS),
Granada, Spain, December 2011