I. introduction to trees (HR 13) A. motivation ------------------------------------------ TREES (HR 13) Motivating problems: Representing hierarchical information Data structure that supports: 1. changes - insertion - deletion 2. retrieval - fast searching - traversal in sorted order ------------------------------------------ B. pictures ------------------------------------------ A HIERARCHY TREE things living non-living animals plants matter ideas vertebrates trees cars boats goodness mammals birds redwoods A FAMILY TREE Iphigenia Clytemnestra Agamemnon Leda Tyndareurs Aerope Atreus Catreus ------------------------------------------ What are some other examples of trees? ------------------------------------------ EXPRESSION TREES (HR 13.5) 2 + 3: + 2 3 partition(vec, 0, 1008-1): partition vec 0 - 1008 1 ------------------------------------------ ------------------------------------------ COUNTEREXAMPLES A B D I A V Reason: C Y E L Reason: ------------------------------------------ C. terminology (HR 13.1) ------------------------------------------ TREE TERMINOLOGY (compare HR 13.1) partition vec + - 3 4 1008 1 def: a *tree* (or general tree) is a set of *nodes* together with a child-of relation such that: 1. if the tree is not empty, then there is a unique node, the *root* that is 2. every node, except for the root, is the child of ------------------------------------------ How many roots can a tree have? ------------------------------------------ def: there is a *branch* from p to c if def: p is a *parent* of c iff def: c1 is a *sibling* of c2 iff def: g is an *ancestor* of k iff either or (and thus k is a *descendent* of g). def: a *leaf* is ------------------------------------------ Are 3 and 4 siblings? Are 3 and 1008 siblings? must there be a branch from the root to each leaf in a tree? ------------------------------------------ TREE TERMINOLOGY partition vec + - 3 4 1008 1 def: a *subtree* of a tree T is either the empty tree, or a node and all its descendents def: the *level* of a node n is: 1 + the number of branches to the _____ from n def: the *height* or *depth* of a tree is ------------------------------------------ What is the level of the root? WHat is the level of +? of 3? II. binary trees and binary search trees (HR 13.2-4) A. binary trees (HR 13.2) ------------------------------------------ BINARY TREES (HR 13.2) Motivating problems: Searching or sorting a collection that has lots of inserts and deletes to process ------------------------------------------ 1. definitions ------------------------------------------ def: a *binary tree* is a tree such that each node has Examples: A Z L B C X I Y N W V E ------------------------------------------ 2. example problem, design, and implementation a. problem ------------------------------------------ PROBLEM Design a guest register for a hotel that supports average time O(log N) insertion, deletion, and searching, and permits outputs in sorted order. Information to maintain on Guests: name ------------------------------------------ What other information might one maintain? What should the types of these be? ------------------------------------------ GUEST INFORMATION STRUCT // GuestInfo.h #ifndef _GuestInfo_h #define _GuestInfo_h 1 #include "String.h" struct GuestInfo { String name; String address; int roomNumber; float charges; }; #endif ------------------------------------------ b. design ------------------------------------------ LAYERED DESIGN OF SOLUTION main GuestDatabase DisplayMenu GuestRegister GuestRegister Insert Delete SearchFor InOrderPrint GuestTree IOStuff ------------------------------------------ What does each module hide? 3. implementation layers a. main ------------------------------------------ // test.C #include #include #include #include "prompt.h" #include "GuestDatabase.h" #include "DisplayMenu.h" int main() // MODIFIES: cin, cout, cerr { char cmd; // prompt DisplayMenu(); prompt("your choice? "); cin >> cmd; while (cin) { cin.ignore(100, '\n'); switch (tolower(cmd)) { case 'a': AddGuest(); break; case 'b': PrintSortedGuestList(); break; case 'c': SearchForGuest(); break; case 'd': DeleteGuest(); break; case 'e': return(EXIT_SUCCESS); break; default: cerr << "Error: unknown command" << endl; break; } // prompt DisplayMenu(); prompt("your choice? "); cin >> cmd; } } ------------------------------------------ b. GuestDatabase ------------------------------------------ // GuestDatabase.h #include "GuestInfo.h" extern void AddGuest(); // MODIFIES: cout, cin // POST: prompts are added to cout, // and information about a guest is ____ // ________. This information is stored // in the database. extern void DeleteGuest(); // MODIFIES: cout, cin, cerr // POST: a prompt is added to cout, and // a guest name is read from cin. The // guest is removed if in the database, // and an error message is issued if // the guest is not in the database. extern void SearchForGuest(); // MODIFIES: cout, cin, cerr // POST: a prompt is added to cout, and // the a guest name is read from cin. // The guest's information is added // to cout if the guest is found, // otherwise an error is added to cerr. extern void PrintSortedGuestList(); // MODIFIES: cout // POST: a list of guest information is // added to cout; the information is in // ascending order by _______________. ------------------------------------------ ------------------------------------------ // GuestDatabase.C #include #include "bool.h" #include "prompt.h" #include "IOStuff.h" #include "GuestDatabase.h" #include "GuestRegister.h" static GuestRegister guestDB; void AddGuest() { GuestInfo g; prompt("Guest name? "); cin >> g.name; Boolean found; guestDB.SearchFor(g.name, found, g); if (found) { cerr << "Already registered as:\n" << g; } else { prompt("address (on one line)? "); cin >> g.address; prompt("room number? "); cin >> g.roomNumber; cin.ignore(100, '\n'); g.charges = 0.0; guestDB.Insert(g); } } ------------------------------------------ ------------------------------------------ void SearchForGuest() { String name; prompt("Guest name? "); cin >> name; GuestInfo g; Boolean found; guestDB.SearchFor(name, found, g); if (found) { cout << g; } else { cerr << "Error: guest " << name << " not found!" << endl; } } void DeleteGuest() { String name; prompt("Guest name? "); cin >> name; Boolean found; guestDB.Delete(name, found); if (!found) { cerr << "Error: guest " << name << " not found!" << endl; } } ------------------------------------------ c. GuestRegister (part 1) ------------------------------------------ // GuestRegister.h #ifndef _GuestRegister_h #define _GuestRegister_h 1 #include "bool.h" #include "GuestInfo.h" #include "GuestTree.h" class GuestRegister { public: // ABSTRACTLY: a set of guest // information records, // with only one record with // a given guest name. GuestRegister(); // MODIFIES: self // POST: self is the empty set void Insert(const GuestInfo & g); // PRE: no record with g.name is in self // MODIFIES: self // POST: void Delete(const String & name, Boolean & found); // MODIFIES: self // POST: // && found --> void SearchFor(const String & name, Boolean & found, GuestInfo& g) const; // MODIFIES: found, g // POST: found --> // void InOrderPrint() const; // MODIFIES: cout // POST: // // #include "GuestRegister.pri" }; #endif ------------------------------------------ ------------------------------------------ // GuestRegister.pri private: GuestTree theTree; // ABSTRACTION MAP: the set of guest // information records consists of // the nodes of theTree ------------------------------------------ d. GuestTree, representing binary trees (HR 2. 611) ------------------------------------------ REPRESENTING BINARY TREES #include "GuestInfo.h" struct TreeNode { GuestInfo data; TreeNode *lLink; // left child TreeNode *rLink; // right child // constructor TreeNode( const GuestInfo & g, TreeNode *l, NodePtr *r); }; ------------------------------------------ B. operations on binary trees ------------------------------------------ ADDING A LEAF TO A BINARY TREE struct TreeNode { GuestInfo data; TreeNode *lLink; // left child TreeNode *rLink; // right child // constructor TreeNode( const GuestInfo & g, TreeNode *l, NodePtr *r); }; typedef TreeNode *NodePtr; Before: After: void AppendLeft( NodePtr ptr, const GuestInfo & g ) { } ------------------------------------------ What would you change to make this AppendRight? ------------------------------------------ OTHER OPERATIONS ON BINARY TREES #include "GuestTree.h" GuestInfo Data( NodePtr ptr ) { return ptr->data; } void Store( NodePtr ptr, const GuestInfo & g ) { ptr->data = g; } NodePtr LChild( NodePtr ptr ) { return (ptr==NULL) ? NULL : ptr->lLink; } NodePtr RChild( NodePtr ptr ) { return (ptr==NULL) ? NULL : ptr->rLink; } Boolean IsLeaf( NodePtr ptr ) { } ------------------------------------------ ------------------------------------------ WORKING AT THE ROOT // GuestTree.h #include "bool.h" #include "GuestInfo.h" struct TreeNode; typedef TreeNode* NodePtr; class GuestTree { public: // ABSTRACTLY: a tree of guest // information records GuestTree(); // MODIFIES: self // POST: self is the empty tree Boolean IsEmpty() const; // POST: FCTVAL == (tree has no nodes) void BuildRoot( const GuestInfo & g ); // PRE: IsEmpty() // MODIFIES: self // POST: Tree has root node, call it R // && Contents(R) == g && IsLeaf(R) NodePtr Root() const; // POST: FCTVAL == NULL, if IsEmpty() // == pointer to root node, otherwise #include "GuestTree.pri" }; // ... declarations of stuff seen before ------------------------------------------ ------------------------------------------ // GuestTree.pri private: NodePtr rootPtr; // ABSTRACTION MAP: if rootPtr == NULL // then the tree is empty, // otherwise the tree contains the // record in *rootPtr and all records // found by following its pointers ------------------------------------------ ------------------------------------------ FOR YOU TO DO Implement the class GuestTree, starting as follows // GuestTree.C #include #include "IOStuff.h" #include "GuestTree.h" struct TreeNode { GuestInfo data; NodePtr lLink; // left child NodePtr rLink; // right child // constructor TreeNode( const GuestInfo &, NodePtr, NodePtr ); }; ------------------------------------------ C. binary search trees (HR 13.3) ------------------------------------------ INSERTION AND SEARCHING Problem: how to insert in O(log N) time, so can search in O(log N) time (on average). Data structure: binary search tree def: a *binary search tree* is a binary tree such that: 1. 2. 3. Pictures: ------------------------------------------ What has to be true about a child node? 1. insertion ------------------------------------------ // GuestRegister.C #include "GuestRegister.h" GuestRegister::GuestRegister() : theTree() { } void GuestRegister::Insert( const GuestInfo & g) { } ------------------------------------------ 2. deletion (skip) 3. searching (HR 617) ------------------------------------------ FOR YOU TO DO Implement the following: // GuestRegister.C (continued) void GuestRegister::SearchFor( const String & name, Boolean & found, GuestInfo& g) const { } ------------------------------------------ How is this similar to binary search? What's the time efficiency? D. traversals (HR 13.4) ------------------------------------------ TRAVERSALS Motivating problems: - print information on the the guests in alphabetic order - convert from infix to prefix notation - delete all the nodes in a tree Gary Cecil Hillary Ben Eve Ivan Ada Dan Fricka Lenny Jim Karl def: an *inorder* traversal visits def: a *preorder* traversal visits def: a *postorder* traversal visits ------------------------------------------ can you give the output from each on this tree? ------------------------------------------ INORDER TRAVERSAL // GuestTree.h #include "bool.h" #include "GuestInfo.h" struct TreeNode; typedef TreeNode* NodePtr; // ... extern void PrintInOrder( NodePtr ptr ); // POST: information about the subtree // at ptr is printed to cout in order. // GuestTree.C #include #include "IOStuff.h" #include "GuestTree.h" // ... void PrintInOrder( NodePtr ptr ) { } ------------------------------------------ What change to this code would you make to print in preorder? E. destructor (HR p. 624) What kind of traversal is appropriate for writing a destructor? Can you write it? F. copy constructor What kind of traversal is appropriate for writing a copy constructor? Can you write it? III. sorting with binary trees (HR 13.6-7) ------------------------------------------ SORTING WITH BINARY TREES (HR 13.6-7) Problem: 2 Quicksort has O(N ) worst case time. Can we do better by using a clever data structure? Idea: use a binary search tree, t. Initialize t to empty; For each data value, v, in input: Insert(v, t); Pictures: ------------------------------------------ What's the average case efficiency? ------------------------------------------ PROBLEMS WITH NAIVE USE OF TREES Time overhead: - building the tree from array - inorder traversal - filling up the array again Worst case time: 2 - still O(N ) Space overhead: - 2 pointers for each node ------------------------------------------ A. treesort, a worst case time O(N log N) sort (HR 13.6) ------------------------------------------ TREESORT Phase 1: E E E 7 3 4 1 ------------------------------------------ how long does each phase take? ------------------------------------------ REPRESENTATION ISSUES n What if don't have 2 - 1 data items? ------------------------------------------ ------------------------------------------ How to represent E (empty)? Don't want it to be promoted So has to either - be a tag field of the nodes - or be number ____________________ ------------------------------------------ Which will be faster? ------------------------------------------ WORST CASE TIME COST OF TREESORT Phase 1: elements to move: N max length of move: log N total: Phase 2: elements to extract: N time to promote next: log N total: Overall: ------------------------------------------ B. vector implementation of binary trees (HR 13.7) ------------------------------------------ VECTOR IMPLEMENTATION OF BINARY TREES (HR 13.7) Problem: eliminate overhead for pointers Idea: use array node at index k: Left child at 2*k + 1 Right child at 2*k + 2 A 0 A 1 B B C 2 C A 0 A 1 B B C 2 C 3 D D E F G 4 E 5 F 6 G ------------------------------------------ What's the mapping from the right to the left called? How do you find the index of a parent in the tree from the index of a child? The vectors in the pictures are sorted, where is the largest element in the vector? ------------------------------------------ REPRESENTATION ISSUES Wasted space? How much space for tree of height N? ------------------------------------------ ------------------------------------------ Unused array elements? ------------------------------------------ C. heapsort (HR pp. 640ff) ------------------------------------------ HEAPSORT (HR pp. 640ff) Combines both ideas: - use fully balanced tree ==> O(N log N) time - use vector representation of trees ==> no cost to move data into tree ==> no cost to extract data ==> less overhead Phase 1: arrange data so largest value is in root of each subtree def: a *heap* is ------------------------------------------ ------------------------------------------ Phase 2: remove largest element, put into its place in sorted vector, repeat ------------------------------------------ 1. examples ------------------------------------------ EXAMPLES OF HEAPSORT A 0 A 1 C C D 2 D Phase 1: ----------------------------------------- ------------------------------------------ Phase 2: C 0 C 1 A A 2 D ------------------------------------------ ------------------------------------------ ANOTHER EXAMPLE A 0 A 1 C C D 2 D 3 I I B F G 4 B 5 F D 6 G 7 D Phase 1: ------------------------------------------ suppose you start with C O M P U T I N G, what's the heap? How was one step of phase 2 like phase 1? How do we know to stop phase 2? 2. coding What data needs to be kept for phase 1? What's the invariant? What data needs to be kept for phase 2? What's the invariant? IV. summary of ideas in HR chapters 12 and 13 ------------------------------------------ SUMMARY OF KEY IDEAS IN DATA STRUCTURES (HR 12-13) Divide and conqueor: binary search quicksort binary search trees heapsort Changing data structure leads to qualitative improvement. binary vs. linear search heapsort vs. selection sort ------------------------------------------