COT 4810: Topics in Computer Science
Real-life Applications of Graph Theory and Linear Algebra
Ranking the Importance of Web Pages
Imagine a library containing 25 billion documents but with no centralized organization and no librarians. In addition, anyone may add a document at any time without telling anyone. You may feel sure that one of the documents contained in the collection has a piece of information that is vitally important to you, and, being impatient like most of us, you'd like to find it in a matter of seconds. How would you go about doing it?
Stated in this way, the problem seems impossible. Yet this description is not too different from the World Wide Web, a huge, highly-disorganized collection of documents in many different formats. Of course, we're all familiar with search engines so we know that there is a solution. The following articles and books describe the mathematics behind Google's PageRank algorithm and how it returns pages from the web's collection of billions of documents that match search criteria so well.
There are other algorithms that use the hyperlink structure of the web to rank the importance of web pages. One notable example is the HITS algorithm which forms the basis of the new search engine Teoma. This algorithm is described in both books listed below.
- PageRank on Wikipedia
- How Google Finds Your Needle in the Web's Haystack, article by David Austin, December 2006 Feature Column of Monthly Essays on Mathematical Topics, American Mathematical Society
- The second eigenvalue of the Google matrix, article by T. H. Haveliwala and S. D. Kamvar
- The $25,000,000,000 Eigenvector, article by K. Bryan and T. Leise
- Google's PageRank and Beyond: The Science of Search Engine Rankings, book by A. N. Langville and C. D. Meyer, Princeton University Press, 2006
- Web Communities: Analysis and Construction, book by Y. Zhang, J. X. Yu, J. Hou, Springer, 2006
Detecting Communities in Large Graphs
Many systems of current interest to the scientific community can usefully be represented as networks (or graphs). Examples include the Internet and the world-wide web, social networks, citation networks, and biochemical networks. Each of these networks consists of a set of nodes or vertices representing, for instance, computers or routers on the
Internet or people in a social network, connected together
by links or edges, representing data connections between
computers, friendships between people, and so forth.
One network feature that has been emphasized in
recent work is community structure, the gathering of
vertices into groups such that there is a higher den-
sity of edges within groups than between them.
- Discovering Communities in Complex Networks, article by H. Balakrishnan and N. Deo
- Mining Parameters that Characterize the Communities in Web-like Networks, article by N. Deo and A. Cami
- Finding Community Structure in Networks Using the Eigenvectors of Matrices, article by M. E. J. Newman, Physical Review E 74, 036104 (2006)
- Finding Communities in Linear Time: a Physics Approach, article by F. Wu and B. A. Huberman, The European Physical Journal B 38, 331–338 (2004)
- Web Communities: Analysis and Construction, book by Y. Zhang, J. X. Yu, J. Hou, Springer, 2006
- Google's PageRank and Beyond: The Science of Search Engine Rankings, book by A. N. Langville and C. D. Meyer, Princeton University Press, 2006
Quantum Speed-Ups of Graph-Theoretical Problems