Jun Ye (叶 俊)
Email: jye@cs.ucf.edu
Department of Computer Science,
University of Central Florida,
4000 Central Florida Blvd,
Orlando, FL, 32816
Brief Bio:
I graduated with a Ph.D. degree in computer science from the University of Central Florida in 2016. My advisor was Dr. Kien A. Hua. My research interests include multimodal data retrieval and analysis. My current focus is on video hashing and temporal modeling for human action recognition from videos. I am now working as a machine learning engineer at Facebook. Prior to that, I was a senior data and applied scientist at Microsoft from 2017 to 2020.
Education:
2010.8~2016.10
Ph.D. in Computer Science,
School of EECS, Computer Science Dept,
University of Central Florida (UCF), Orlando, FL, USA.
2007.9~2010.3
M.S. in Computer Science,
Department of Pattern Recognition and Artificial Intelligence,
Beihang University (BUAA), Beijing, China.
2003.9~2007.6
B.S. in Automation,
School of Control Science and Engineering,
Huazhong University of Science and Technology (HUST), Wuhan, China.
Journal Publications:
- Jun Ye, Guo-Jun Qi, Naifan Zhuang, Hao Hu, and Kien Hua. "Learning Compact Features for Human Activity Recognition via Probabilistic First-Take-All." IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 42.1 (2018): 126-139.
- Kai Li, Guo-Jun Qi, Jun Ye, T. Yusuph, and Kien A. Hua. "Semantic Image Retrieval with Feature Space Rankings," International Journal of Semantic Computing. 11(2), June 2017.
- Jun Ye, Hao Hu, Guojun Qi, and Kien A. Hua. "A Temporal Order Modeling Approach to Human Action Recognition from Multimodal Sensor Data," ACM Transactions on Multimedia Computing Communications and Applications (TOMM) [pdf], December, 2016.
- Kai Li, Guojun Qi, Jun Ye and Kien A. Hua. "Linear Subspace Ranking Hashing for Cross-modal Retrieval," IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), September, July 2016.
- Jun Ye and Kien A. Hua, "Octree-based 3D Logic and Computation of Spatial Relationships in Live Video Query Processing," ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Vol 11, issue 2, December 2014.
Conference Publications:
- Naifan Zhuang, Jun Ye and Kien Hua. "Convolutional LSTM for Crowd Scene Understanding." In proceedings of the 19th IEEE International Symposium on Multimedia (ISM'17), Taichung, Taiwan, Dec 11-13, 2017.
- Kutalmis Akpinar, Fereshteh Jafariakinabad, Kien A. Hua, Omar Nakhila, Jun Ye and Cliff Zou. "Fault-Tolerant Network-Server Architecture for Time-Critical Web Applications." In proceedings of the 15th IEEE International Conference on Dependable, Autonomic and Secure Computing (DASC 2017), Orlando, Florida, Nov 6-10, 2017.
- Naifan Zhuang, Turghun Yusuf, Jun Ye and Kien A. Hua. "Group Activity Recognition with Differential Recurrent Convolutional Neural Networks [pdf]," in proceedings of IEEE Conference on Automatic Face and Gesture Recognition (FG 2017), Washington DC, May 30-June 3, 2017.
- Kutalmis Akpinar, Trevor Ballard, Kien A. Hua, Kai Li, Sansiri Tarnpradab, and Jun Ye. "COMMIT: A Multimedia Collaboration System for Future Workplaces with the Internet of Things," in proceedings of the 8th ACM Multimedia Systems Conference (MMSys'17), Taipei, June 20-23, 2017.
- Kai Li, Guojun Qi, Jun Ye and Kien A. Hua. "Supervised Ranking Hash for Semantic Similarity Search," in proceedings of IEEE International Symposium on Multimedia (ISM), San Jose, California, December 11-13, 2016.
- Naifan Zhuang, Jun Ye and Kien A. Hua. "DLSTM Approach to Video Modeling with Hashing for Large-Scale Video Retrieval[pdf][ppt]," in proceedings of International Conference on Pattern Recognition (ICPR). Cancun, Mexico. Dec 4-8, 2016.
- Kai Li, Guo-Jun Qi, Jun Ye and Kien A. Hua. "Cross-modal Hashing Through Ranking Subspace Learning," in proceedings of IEEE International Conference on Multimedia and Expo (ICME), Seattle, July 11-15, 2016.
- Jun Ye, Kai Li and Kien A. Hua. "WTA Hash-based Multimodal Feature Fusion for 3D Human Action Recognition," in proceedings of IEEE International Symposium on Multimedia (best paper award of ISM 2015), Miami, December 14-16, 2015.
- Jun Ye, Hao Hu, Kai Li, Guo-Jun Qi, and Kien A. Hua. "First-Take-All: Temporal Order-Preserving Hashing for 3D Action Videos." arXiv preprint arXiv:1506.02184 (2015).
- Kai Li, Guo-Jun Qi, Jun Ye and Kien A. Hua. "Rank Subspace Learning for Compact Hash Codes." arXiv preprint arXiv:1503.05951 (2015).
- Jun Ye, Kai Li, Guo-Jun Qi and Kien A. Hua, "Temporal Order-Preserving Dynamic Quantization for Human Action Recognition from Multimodal Sensor Streams[ppt]," ACM International Conference on Multimedia Retrieval (ICMR), Shanghai, June 2015.
- Kai Li, Jun Ye, and Kien A. Hua, "What Is Making That Sound?" in Proc. of ACM Multimedia Conference (ACM MM), Orlando, November 3-7, 2014.
- Jun Ye and Kien A. Hua, "Exploiting Depth Camera for 3D Spatial Relationship Interpretation [pdf][slides]," in proceedings of ACM Multimedia Systems (MMSys'13). Oslo, Mar, 2013.
- Jun Ye, Kien A. Hua, "Scalability Study of Wireless Mesh Networks with Dynamic Stream Merging Capability [pdf]," in Proceedings of Multimedia Communications, Services & Security (best presentation paper award), Krakow, Poland, 2011.
- Jun Ye, Lin-Lin Huang, and Xiao-Li Hao, "Neural Network Based Text Detection in Videos Using Local Binary Patterns [pdf]," in Proc. of China, Japan and Korea Joint Workshop on Pattern Recognition (CJKPR), Nanjing, November 04-09, 2009.
Professional Services:
- Program Committee Member, ACM SIGKDD Conference 2016
- Program Committee Member, ACM Multimedia 2016
- Program Committee Member, IEEE BigMM 2017
- Reviewer, International Journal on Multimedia Tools and Applications (MTAP)
- Reviewer, IEEE Transactions on Multimedia (TMM)
- Reviewer, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)
- Reviewer, International Conference on Computer Vision (ICCV 2017)
- Web and Information Chair, IEEE International Conference on Cloud Engineering (IC2E 2018)
- Area Chair, ACM Multimedia 2020
Research Projects:
-
First-Take-All Hashing for Human Activity Recognition from Wearable Devices,
Mar 2016~current
This project is part of my dissertation. The temporal dynamics inside human activity data captured from wearable devices can be explored to distinguish between different activity categories. A probabilistic First-Take-All hashing technique is proposed to encode the temporal dynamics within human activity sequences for daily activity recognition. The work has been submitted to an IEEE journal and is under review.
-
Temporal Hashing for Human Action Sequence Retrieval,
Aug 2014~2016
This project is part of my dissertation. Most current hashing algorithms are designed for fixed-length data such as images and focus on the spatial representation of data. As a result, it is still challenging and expensive to hash videos. I investigate a temporal hashing algorithm that encodes a video by the temporal order of randomly generated latent patterns in the video sequence. This is the first work that hashes a video by its temporal information. The work has been accepted by ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM).
-
3D Human Action Recognition,
June 2014~Current
In this project, I investigate the challenge of temporal modeling in human action recognition and propose a dynamic quantization algorithm to model the dynamic patterns of an action sequence for 3D human action recognition. I also propose a novel hash-based multimodal feature fusion algorithm, a generic early fusion method that is independent of the features used. Two conference papers have been published, including one that received the best paper award at IEEE ISM 2015.
-
Live Video Computing,
Nov. 2012~Nov.2014
This project investigates techniques for the interpretation of 3D spatial relationships for the LVDBMS (Live Video DataBase Management System). The LVDBMS is a general-purpose framework for managing and processing live video data for surveillance and analytical applications. The system allows for automatic event recognition over a network of live cameras. The user specifies a monitoring task by formulating a query describing a spatiotemporal event. This query may be formulated as a combination of logical, spatial, and temporal operators. When the specified event is observed, an action associated with the query is triggered. In other words, the LVDBMS treats a camera as a special class of storage and processes continuous queries against live video feeds, making it analogous to a new category of databases. In this project, I extend the original 2D spatial operators in the current LVDBMS into 3D spatial operators by using Microsoft Kinect sensors. An octree-based algorithm for computing the 3D spatial operators is proposed and a GPU-based implementation is developed. Language: C++, OpenCL; Platform: Windows; Contribution: Algorithm design and simulation, prototype implementation, and analysis; a conference paper published at ACM MMSys'13 and a journal paper published at ACM TOMCCAP.
-
Intelligent Traffic Surveillance System,
Sept. 2014~Current
Design and develop an intelligent traffic surveillance system to monitor and analyze traffic flows on highways. The algorithm module includes components such as motion detection, tracking, vehicle classification, and traffic flow analysis. An initial version has been delivered. I designed and developed the whole algorithm module. Team members include two Ph.D. and two master's students. The system is built in Qt 5.1, and the algorithm module is written in C++ using OpenCV, OpenCL, and LIBSVM.
-
Detection of Text Objects in Video Images,
Nov. 2008~Dec. 2009
The project aimed to detect and locate text in images. The detected results are recognized by OCR and used for image and video indexing and retrieval. It was also my master's thesis. In the project I focused on the text detection algorithm. I proposed a novel and highly effective feature combining LBP and HOG and trained a neural network on a large number of samples. The trained classifier produced text region candidates with high confidence, and all regions were then integrated and validated to form the final text blocks. The project contained various modules, including image processing, feature extraction, and classifier design, which required a firm background in pattern recognition theory. In addition, other skills such as programming and code optimization were strongly required. This project therefore greatly improved my capability in both academic research and practical application. Participant: Myself; Language: C++; Platform: Windows Visual Studio; Contribution: Algorithm design and simulation; design and implementation of a real-time demo system; a paper published at CJKPR 2009.
Working Experiences:
-
Senior Data and Applied Scientist with Microsoft, 2016.12~2020.2
I worked as a data and applied scientist with Microsoft Azure, on different AI and ML projects including face recognition, natural language generation, and ranking.
-
Internship with Microsoft, Data Scientist, Summer 2015
Internship with the Windows Core Quality Team at Microsoft Redmond. Developed a belief propagation algorithm on telemetry data for Windows user segmentation. Proposed a solution to the PU (positive-unlabeled) problem, where only positive labels are available in the dataset.
-
Algorithm Engineer with Athena Eyes, 2010.1~2010.7
I worked with Athena as an R&D engineer focusing on developing the latest face recognition techniques for our products. During that time, I developed a human liveness detection system based on blink detection. In addition, I designed and implemented the framework of a face recognition algorithm based on the Active Shape Model (ASM) for the next version of the Athena Eyes face recognition system.
-
Research Intern with National Lab of Pattern Recognition, CASIA,
2009.6~2009.9
The research task focused on text detection and recognition in videos. I extended my previous text detection algorithm by incorporating different feature extraction methods in single frames and exploiting temporal features between consecutive frames. I developed a demo system composed of a detection module and an OCR module in collaboration with Ph.D. students in the lab. The demo runs in real time and the results are promising. In the project, I was in charge of the algorithm design for text localization and character segmentation, and of building the system. During the three months, I strengthened my theoretical foundation, broadened my vision, and improved my ability in both research and application by working with top Ph.D. students in pattern recognition in China.
-
Volunteer of 29th Olympic Games (Beijing 2008),
2008.7~2008.9
I welcomed delegations of athletes, media, and officials from different countries during the Olympics. I also provided language services in Terminal 2 of Beijing Capital International Airport. These experiences strengthened my communication skills and reinforced my team spirit, and I will treasure them as among the most valuable of my life.
Teaching:
-
Graduate Teaching Assistant: Introduction to C Programming, Department of Computer Science, UCF, Fall 2010.
-
Graduate Teaching Assistant: Discrete Math, Department of Computer Science, UCF, Fall 2013.
-
Graduate Teaching Assistant: Fundamentals of Database Systems, Department of Computer Science, UCF, Spring 2014.
-
Guest lecture: GPU Processing for Distributed Live Video Database (ppt), Department of Computer Science, UCF, Spring 2015.
-
Guest lecture: A Brief Introduction to GPU Accelerations in Scientific Computation and Industry Applications, Department of Computer Science, UCF, Fall 2016.
-------------------------------------
Last updated on Feb 22, 2020.