Semantic Search in Internet Videos
What's in this web page?
This page contains the features on two benchmarks, MED13 and MED14, used in our paper [15], as well as the ranked lists returned by our system. The shared data are expected to help: 1) reproduce our state-of-the-art results; 2) benefit related tasks such as video recommendation, hyperlinking and recounting.
Features:
[0] Junwei Liang, Lu Jiang, Deyu Meng, Alexander Hauptmann. Learning to Detect Concepts from Webly-Labeled Video Data. In IJCAI, 2016.
*Please cite the corresponding papers when using our features (32,000 Internet videos).
[1] Y. Miao, F. Metze, and S. Rawat. Deep maxout networks for low-resource speech recognition. In ASRU, 2013.
[2] S.-I. Yu, L. Jiang, Z. Xu, et al. CMU-Informedia@TRECVID 2014. In TRECVID, 2014.
[3] L. Jiang, D. Meng, S.-I. Yu, Z. Lan, S. Shan, and A. G. Hauptmann. Self-paced learning with diversity. In NIPS, 2014.
[4] B. Thomee, D. A. Shamma, G. Friedland, B. Elizalde, K. Ni, D. Poland, D. Borth, and L.-J. Li. The new data and new challenges in multimedia research. arXiv preprint arXiv:1503.01817, 2015.
[5] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In CVPR, 2014.
[6] P. Over, G. Awad, M. Michel, J. Fiscus, G. Sanders, W. Kraaij, A. F. Smeaton, and G. Quénot. TRECVID 2014 – an overview of the goals, tasks, data, evaluation mechanisms and metrics. In TRECVID, 2014.
[7] S.-I. Yu, L. Jiang, and A. Hauptmann. Instructional videos for unsupervised harvesting and learning of action examples. In MM, 2014.
[8] H. Wang and C. Schmid. Action recognition with improved trajectories. In ICCV, 2013.
Retrieved Ranked List:
*The ranked lists are specified in NIST's standard CSV format (http://www.nist.gov/itl/iad/mig/med14.cfm).
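For readers who want to post-process the shared ranked lists, a minimal loading sketch is below. It assumes an illustrative column layout (EventID, VideoID, Score); the authoritative field names and layout are defined in NIST's MED14 specification linked above, so adjust the column names accordingly.

```python
import csv
from collections import defaultdict

def load_ranked_lists(path):
    """Group (video_id, score) pairs by event, sorted by descending score.

    Assumes an illustrative CSV header of EventID, VideoID, Score;
    consult NIST's MED14 file specification for the exact format.
    """
    runs = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            runs[row["EventID"]].append((row["VideoID"], float(row["Score"])))
    # Rank videos within each event by detection score, highest first.
    for event in runs:
        runs[event].sort(key=lambda pair: pair[1], reverse=True)
    return runs
```

This keeps one ranked list per event query, which is the granularity at which MED results are scored.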
Published Results on the MED13Test dataset:
[9] A. Habibian, T. Mensink, and C. G. Snoek. Composite concept discovery for zero-shot video event detection. In ICMR, 2014.
[10] M. Mazloom, X. Li, and C. G. Snoek. Few-example video event retrieval using tag propagation. In ICMR, 2014.
[11] L. Jiang, T. Mitamura, S.-I. Yu, and A. G. Hauptmann. Zero-example event search using multimodal pseudo relevance feedback. In ICMR, 2014.
[12] H. Lee. Analyzing complex events and human actions in "in-the-wild" videos. In UMD Ph.D. Theses and Dissertations, 2014.
[13] S. Wu, S. Bondugula, F. Luisier, X. Zhuang, and P. Natarajan. Zero-shot event detection using multi-modal fusion of weakly supervised concepts. In CVPR, 2014.
[14] L. Jiang, D. Meng, T. Mitamura, and A. G. Hauptmann. Easy samples first: Self-paced reranking for zero-example multimedia search. In MM, 2014.
[15] L. Jiang, S.-I. Yu, D. Meng, T. Mitamura, and A. G. Hauptmann. Bridging the Ultimate Semantic Gap: A Semantic Search Engine for Internet Videos. In ICMR, 2015.
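Results on MED13Test are conventionally summarized by average precision per event and mean average precision (MAP) across events. As a reference point, here is a minimal sketch of standard average precision over a binary-relevance ranked list; it is not the official NIST scoring tool, which should be used for comparable numbers.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query: mean of precision@k taken at each
    rank k where a relevant video appears, normalized by the number of
    relevant videos."""
    relevant = set(relevant_ids)
    hits, precision_sum = 0, 0.0
    for k, vid in enumerate(ranked_ids, start=1):
        if vid in relevant:
            hits += 1
            precision_sum += hits / k
    return precision_sum / len(relevant) if relevant else 0.0
```

MAP is then the mean of this value over all event queries in the benchmark.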
Recommendations for building a state-of-the-art system [15]:
Screenshot of our Prototype System [16]:
[16] S. Xu, H. Li, X. Chang, S.-I. Yu, X. Du, X. Li, L. Jiang, Z. Mao, Z. Lan, S. Burger, and A. Hauptmann. Incremental multimodal query construction for video search. In ICMR, 2015.
Citation:
Lu Jiang, Shoou-I Yu, Deyu Meng, Teruko Mitamura, Alexander Hauptmann. Bridging the Ultimate Semantic Gap: A Semantic Search Engine for Internet Videos. In ACM International Conference on Multimedia Retrieval (ICMR), 2015. [BibTex | supplementary materials]

(C) COPYRIGHT 2015, Carnegie Mellon University. All Rights Reserved.