
Federated Search of Text Search Engines
Mr. Luo Si
Abstract: My dissertation research addresses the three main research problems within federated search: resource representation, resource selection, and results merging. I propose new algorithms for estimating information source sizes, for estimating the distribution of relevant documents across information sources for resource selection, and for merging the document rankings returned by the selected sources. Furthermore, I propose a unified utility maximization framework that combines these solutions to construct effective systems for different federated search applications. Empirical studies in a wide range of research environments, and with a real-world prototype system under different operating conditions, have demonstrated the effectiveness of the research. This new research, supported by a stronger theoretical foundation, better empirical results, and more realistic simulation of real-world applications, substantially improves the state of the art in federated search.
Short Bio: Luo Si is a Ph.D. candidate at the Language Technologies Institute, a department in Carnegie Mellon's School of Computer Science. He received his M.S. and B.S. degrees in Computer Science from Tsinghua University. His research spans a range of topics in information retrieval, machine learning, text mining, speech and multimedia processing, and data mining. His recent research focuses on federated search (distributed information retrieval), probabilistic models for collaborative filtering, and text/data mining for bioinformatics. He has published more than 35 conference, journal, and workshop papers.