Hao Wooi Lim's blog: 2009

Sunday, October 25, 2009

Table of results for Caltech 256 dataset

This table documents some of the best results that papers have reported on the Caltech-256 dataset.

All results shown here were obtained by training on 30 samples from each category.
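To make the "30 samples from each category" protocol concrete, here is a minimal sketch of how such a split could be drawn. The function name `split_per_category`, the seed, and the toy labels are all my own assumptions for illustration; the individual papers may build their splits differently (for example, by capping the number of test images per category).

```python
import random
from collections import defaultdict


def split_per_category(labels, n_train=30, seed=0):
    """Return (train_idx, test_idx), drawing n_train random samples per category.

    Hypothetical helper: the remaining samples of each category go to the test set.
    """
    rng = random.Random(seed)

    # Group sample indices by their category label.
    by_category = defaultdict(list)
    for index, label in enumerate(labels):
        by_category[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_category):
        indices = by_category[label][:]
        rng.shuffle(indices)
        train_idx.extend(indices[:n_train])
        test_idx.extend(indices[n_train:])
    return train_idx, test_idx


# Toy usage: 3 made-up categories with 80 images each.
toy_labels = [c for c in ("cat", "dog", "owl") for _ in range(80)]
train_idx, test_idx = split_per_category(toy_labels)
```

With the toy labels above, every category contributes exactly 30 training indices and the other 50 go to the test set.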

Visualizing and Understanding Convolutional Networks (ARXIV 2013)
Cited 14 times. 70.6% ± 0.2%
Multipath Sparse Coding Using Hierarchical Matching Pursuit (CVPR 2013)
Cited 7 times. 50.7%
Additional info: Multipath Hierarchical Matching Pursuit
Link to paper's project page
Learning Subcategory Relevances for Category Recognition (CVPR 2008)
Cited 48 times. 49.5%
Spatially Local Coding for Object Recognition (ACCV 2010)
Cited 1 time. 46.6% ± 0.2%
Additional info: Multi-scale SIFT features extracted every 4 pixels.
Link to paper's project page
Link to paper's source code
On Feature Combination for Multiclass Object Classification (ICCV 2009)
Cited 376 times. 45.8%
Additional info: LP-β
Link to paper's project page (Contains results, source code and pre-computed features)
Image Classification using Random Forests and Ferns (2007)
Cited 412 times. 45.3%
Local Pyramidal Descriptors for Image Recognition (PAMI 2013)
Cited 1 time. 44.86%
Additional info: P-SIFT + Fisher encoding + SPM + Linear SVM
Link to paper's project page (Contains source code and demo)
A Binary Classification Framework for Two-Stage Multiple Kernel Learning (2012)
Cited 5 times. 44.8%
Efficient Learning of Sparse, Distributed, Convolutional Feature Representations for Object Recognition (ICCV 2011)
Cited 21 times. 42.05%
Additional info: CRBM K=4096
In Defense of Nearest-Neighbor Based Image Classification (CVPR 2008)
Cited 478 times. 42%
Additional info: NBNN (5 descriptors)
Locality-constrained Linear Coding for Image Classification (CVPR 2010)
Cited 547 times. 41.19%
Local Naive Bayes Nearest Neighbor for Image Classification (2011)
Cited 20 times. 40.1%
Sparse Spatial Coding: A Novel Approach for Efficient and Accurate Object Recognition (ICRA 2012)
Cited 9 times. 37.08% ± 0.36%
Caltech-256 Object Category Dataset (2007)
Cited 596 times. 34.1%
Linear spatial pyramid matching using sparse coding for image classification (CVPR 2009)
Cited 713 times. 34.02%
Kernel codebooks for scene categorization (ECCV 2008)
Cited 242 times. 27.17%

Friday, August 21, 2009

Table of results for Caltech 101 dataset

This table documents some of the best results that papers have reported on the Caltech-101 dataset.

All results shown here were obtained by training on 30 samples from each category.
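Many entries below are written as "accuracy ± spread". Assuming the spread is the standard deviation over several random train/test splits (the papers may define it differently), a figure in that style could be produced roughly as follows. The function name `summarize_runs` and the accuracy values are hypothetical and are not taken from any paper.

```python
import statistics


def summarize_runs(accuracies):
    """Format per-run accuracies (in percent) as 'mean% ± std%'.

    Assumption: the spread is the sample standard deviation across runs.
    """
    mean = statistics.mean(accuracies)
    # stdev needs at least two runs; report 0 for a single run.
    std = statistics.stdev(accuracies) if len(accuracies) > 1 else 0.0
    return f"{mean:.2f}% ± {std:.2f}%"


# Made-up accuracies from three hypothetical random splits.
summary = summarize_runs([80.0, 81.0, 82.0])
print(summary)  # prints "81.00% ± 1.00%"
```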

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition (ARXIV 2014)
Cited 191 times. 91.44% ± 0.7%
Link to paper's project page
Visualizing and Understanding Convolutional Networks (ARXIV 2013)
Cited 561 times. 86.5% ± 0.5%
Additional info: Pre-trained with ImageNet
Group-Sensitive Multiple Kernel Learning for Object Categorization (ICCV 2009)
Cited 127 times. 84.3%
Additional Info: GS-MKL
Reference-Based Scheme Combined With K-SVD for Scene Image Categorization (2013)
Cited 2 times. 83%
Multipath Sparse Coding Using Hierarchical Matching Pursuit (CVPR 2013)
Cited 2 times. 82.5% ± 0.5%
Additional Info: Multipath Hierarchical Matching Pursuit
Link to paper's project page
LP-Beta + Geometric blur + PHOW gray/color + Self-Similarity
82.1% ± 0.3%
Learning Subcategory Relevances for Category Recognition (CVPR 2008)
Cited 40 times. 81.9%
Object Recognition as Ranking Holistic Figure-Ground Hypotheses (CVPR 2010)
Cited 68 times. 81.9%
Additional Info: Regression with Post-Processing.
Image Classification using Random Forests and Ferns (ICCV 2007)
Cited 378 times. 81.3%
Additional Info: Bosch Multi-way SVM
Spatially Local Coding for Object Recognition (ACCV 2010)
Cited 0 times. 81% ± 0.2%
Additional Info: Multi-scale SIFT features extracted every 4 pixels.
Link to paper's project page
Link to paper's source code
Link to paper's poster
Local Pyramidal Descriptors for Image Recognition (PAMI 2013)
Cited 1 time. 80.13%
Additional Info: P-SIFT + Fisher encoding + SPM + Linear SVM
Link to paper's project page (Contains source code, demo available)
Sparse Spatial Coding: A Novel Approach for Efficient and Accurate Object Recognition (ICRA 2012)
Cited 7 times. 80.02% ± 0.36%
Additional Info: dictionary size is 4096.
Distance-based Mixture Modeling for Classification via Hypothetical Local Mapping (2013)
Cited 0 times. 80% ± 0.75%
In Defense of Nearest-Neighbor Based Image Classification (CVPR 2008)
Cited 442 times. 79.23%
Additional Info: NBNN (5 descriptors)
Smooth Sparse Coding via Marginal Regression for Learning Sparse Representations (2012)
Cited 0 times. 79.11% ± 0.87%
Additional info: dictionary size = 4096
Unsupervised and Supervised Visual Codes with Restricted Boltzmann Machines (ECCV 2012)
Cited 4 times. 78.9% ± 1.1%
Additional info: 1024 codewords trained on macrofeatures with supervised fine-tuning.
Link to paper's poster
Robust Classification of Objects, Faces, and Flowers Using Natural Image Statistics (CVPR 2010)
Cited 49 times. 78.5% ± 0.5%
Link to paper's supplemental material
Link to paper's project page
Link to paper's source code (MATLAB)
Visual Geometry Group (VGG)'s implementation of Multiple Kernel Image Classifier trained on dense SIFT, self-similarity, and geometric blur features
78.20% ± 0.4%
Additional Info: Result of 77.8% is obtained by combining dense SIFT, self-similarity, and geometric blur features with the multiple kernel learning
Efficient Learning of Sparse, Distributed, Convolutional Feature Representations for Object Recognition (ICCV 2011)
Cited 19 times. 77.8%
Additional info: CRBM K=4096
On Feature Combination for Multiclass Object Classification (ICCV 2009)
Cited 312 times. 77.8% ± 0.4%
Additional info: LP-β
Link to paper's project page (Contains results, source code and pre-computed features)
Representing shape with a spatial pyramid kernel (CIVR 2007)
Cited 547 times. 77.8%
Additional info: Result of 77.8% is obtained by combining all 4 cues (shape 180, shape 360, gray appearance and color appearance).
The devil is in the details - an evaluation of recent feature encoding methods (BMVC 2011)
Cited 91 times. 77.78% ± 0.56%
Additional info: Fisher (FK)
Link to paper's project page (Contains dataset and source code)
Object Recognition with Hierarchical Kernel Descriptors (CVPR 2011)
Cited 29 times. 77.5%
Link to paper's project page (Contains dataset, demos and source code)
Ask the locals: multi-way local pooling for image recognition (ICCV 2011)
Cited 44 times. 77.3% ± 0.6%
Link to paper's supplemental material
A Binary Classification Framework for Two-Stage Multiple Kernel Learning (ICML 2012)
Cited 5 times. 77.2%
Kernel Descriptors for Visual Recognition (NIPS 2010)
Cited 48 times. 76.4% ± 0.7%
Additional info: KDES-A(M)
Local Naive Bayes Nearest Neighbor for Image Classification (CVPR 2012)
Cited 17 times. 76% ± 0.9%
Link to paper's associated technical report
Link to paper's source code
Link to paper's project page
Fast approximations to structured sparse coding and applications to object classification (2012)
Cited 2 times. 75.7% ± 1%
Learning mid-level features for recognition (CVPR 2010)
Cited 221 times. 75.7% ± 1.1%
Additional Info: Sparse Codes, Intersection Kernel.
Object and Action Classification with Latent Window Parameters (IJCV 2013)
Cited 0 times. 75.31% ± 0.68%
Beyond Spatial Pyramids: Receptive Field Learning for Pooled Image Features (2012)
Cited 27 times. 75.3% ± 0.7%
Image classification with multiple feature (2011)
Cited 3 times. 75% ± 0.8%
In Defense of Soft-assignment Coding (ICCV 2011)
Cited 46 times. 74.2% ± 0.8%
Link to author's web site
Link to paper's source code
Locality-constrained Linear Coding for Image Classification (CVPR 2010)
Cited 446 times. 73.44%
Project web site: Link to Project web site
Source code: Link to MATLAB code (rar)
Linear Spatial Pyramid Matching Using Sparse Coding for Image Classification (CVPR 2009)
Cited 621 times. 73.2% ± 0.54%
Additional Info: Sparse coding, max pooling, linear SVM
Project web site: Link to Project web site
Source code: Link to MATLAB code (rar)
High Dimensional Nonlinear Learning using Local Coordinate Coding (Technical Report 2009)
Cited 3 times. 73.14%
Additional Info: Local coordinate coding, max pooling, linear SVM
Recognition using Regions (CVPR 2009)
Cited 156 times. 73.1%
The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization (ICML 2011)
Cited 101 times. 72.6%
Learning Coupled Conditional Random Field for Image Decomposition with Application on Object Categorization (CVPR 2008)
Cited 10 times. 70.38%
Fast Image Search for Learned Metrics (CVPR 2008)
Cited 126 times. 69.6%
Additional info: ML+CORR
A Multi-Scale Learning Framework for Visual Categorization (ACCV 2010)
Cited 2 times. 68.5%
Additional info: sparse coding (K = 900)
Caltech-256 Object Category Dataset (2007)
Cited 544 times. 67.6%
Additional Info: Griffin's SPM
Improved Spatial Pyramid Matching for Image Classification (ACCV 2010)
Cited 3 times. 67.36% ± 0.17%
Variable Sparsity Kernel Learning (JMLR 2011)
Cited 23 times. 67.07%
Deep Learning of Invariant Features via Simulated Fixations in Video (2012)
Cited 2 times. 66%
Additional info: When also trained with video (unrelated to Caltech-101), the method obtained 74.6%
Similarity-based cross-layered hierarchical representations for object categorization (CVPR 2008)
Cited 47 times. 66.5%
Additional Info: Shapinals
SVM-KNN - Discriminative Nearest Neighbor Classification for Visual Category Recognition (CVPR 2006)
Cited 568 times. 66.2% ± 0.5%
The Hierarchical Beta Process for Convolutional Factor Analysis and Deep Learning (ICML 2011)
Cited 7 times. 65.8% ± 0.6%
Combined Descriptors in Spatial Pyramid Domain for Image Classification (2012)
Cited 0 times. 65.5% ± 0.49%
Bag-of-Features Kernel Eigen Spaces for Classification (ICPR 2008)
Cited 2 times. 65.5% ± 0.7%
Image Retrieval and Classification using Local Distance Functions (NIPS 2006)
Cited 136 times. 65.2%
Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations (ICML 2009)
Cited 315 times. 65.4% ± 0.5%
Additional Info: CDBN (first+second layers)
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories (CVPR 2006)
Cited 2582 times. 64.6% ± 0.8%
Additional Info: L=2, M=200, Pyramid
Source code: Link to MATLAB code (zip)
Slides: Link
Kernel Codebooks for Scene Categorization (ECCV 2008)
Cited 224 times. 64.12%
Visual Word Ambiguity (PAMI 2010)
Cited 257 times. 64.1%
Using dependent regions for object categorization in a generative framework (CVPR 2006)
Cited 121 times. 63%
SIFTing the Relevant from the Irrelevant - Automatically Detecting Objects in Training Images (2009)
Cited 0 times. 61.45%
Pyramid Match Kernels: Discriminative Classification with Sets of Image Features (ICCV 2005)
Cited 849 times. 58.2%
Efficient Classification for Additive Kernel SVMs (PAMI 2012)
Cited 11 times. 56.59% ± 0.77%
Max-Margin Additive Classifiers for Detection (ICCV 2009)
Cited 79 times. 56.49%
Multiclass Object Recognition with Sparse, Localized Features (CVPR 2006)
Cited 345 times. 56%
Efficiently Matching Sets of Features with Random Histograms (2008)
Cited 46 times. 54.1%
Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition (2007)
Cited 235 times. 54%
Fast Inference in Sparse Coding Algorithms with Applications to Object Recognition (2008)
Cited 64 times. 53%
Classification using Intersection Kernel Support Vector Machines is Efficient (2008)
Cited 396 times. 52%
Project web site: Link
Source code: Link to MATLAB/C code (tar.gz)
Object Recognition with Features Inspired by Visual Cortex (2006)
Cited 536 times. 42%

Sunday, April 05, 2009

Re-focused

In an attempt to refocus my energy on the IT-related work I will be doing, I have done some thinking and come up with a list (the list is not final):

Research
(I did not say machine learning, nor pattern recognition, as those are just means to an end)
- Computer Vision
- Natural Language Processing

Development
(I did not say functional programming, as that is just a means to an end)
- Writing parallel & concurrent programs (Parallel processing, Concurrent programming)
- Writing maintainable, beautiful code

Design
- Designing simple, practical & nice UI

Friday, March 06, 2009

Return to research

This post foreshadows my comeback to the research world. I've been thinking a lot about vision. How do we humans perceive things? How do we recognize things? How do we get a sense of deja vu after seeing things we have seen before?

Even though I do not have the answers to these perplexing questions, I do think that I *might* have a solution that addresses them to a certain degree. That's a pretty modest statement, but I'm not going to claim anything more exotic before I come up with a working prototype.

The ideas I have in mind have a lot to do with time-based learning, that is, learning not just things but also the relationships between them. In particular, meaningful things appear in a certain order. For example, you do not see a cat walking by that suddenly becomes a dog for half a second and then magically becomes a cat again. Things are not random; they appear in a logical order. The idea is nothing new: it has been addressed by Jeff Hawkins in his work on Hierarchical Temporal Memory.
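As a toy illustration of the "things appear in order" intuition, the sketch below counts which label follows which in a sequence of per-frame labels. This is a plain first-order transition table that I am using as a stand-in; it is not Hierarchical Temporal Memory, and the function names and frame labels are made up for the example.

```python
from collections import Counter, defaultdict


def learn_transitions(sequence):
    """Count how often each label is immediately followed by each other label."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return counts


def transition_probability(counts, current, following):
    """Estimate P(following | current) from the counts; 0.0 if current was never seen."""
    seen = counts.get(current, Counter())
    total = sum(seen.values())
    return seen[following] / total if total else 0.0


# Hypothetical per-frame labels: a cat walks by, then a dog, then a cat again.
frames = ["cat"] * 5 + ["dog"] * 5 + ["cat"] * 5
counts = learn_transitions(frames)

p_stay = transition_probability(counts, "cat", "cat")    # 8 of 9 transitions out of "cat"
p_switch = transition_probability(counts, "cat", "dog")  # 1 of 9 transitions out of "cat"
```

In this toy sequence a cat frame is far more likely to be followed by another cat frame than by a dog frame, which is the kind of regularity a time-based learner could exploit to reject a half-second "dog" in the middle of a cat.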

End of part 1. I will talk more about this later.