Introduction to Data Structures (Com S 228)

This page gives access to information about an old course offerings of ``Introduction to Data Structures,'' as taught (in Spring 1995) by Gary T. Leavens for the Department of Computer Science at Iowa State University. Information is available under the following headings.

Also available are the following.


Overview

Computer Science 228 is an introduction to data structures, with emphasis on data abstraction and information hiding. (For the planning of Iowa State's introductory sequence, see Albert L. Baker, David Fernandez-Baca, and Gary T. Leavens ``Course Specifications for New Introductory Courses: Computer Science 228X and 228X,'' Department of Computer Science, Iowa State University, TR #92-09, April 1992.)

The following information reflects the Spring 1995 offering. Previous and subsequent offerings were taught by Albert L. Baker.

Course Description

This class is designed to give you essential programming tools and knowledge. You will learn how to model data in a computer, how to specify and use standard ADTs, and how to implement such ADTs with standard data structures.

You will learn how efficient or expensive various combinations of data structures and algorithms are. Along the way, you will develop skills in imperative programming. You will learn some of design principles and ideas for managing small (up to 500 line) programs. Finally we hope to share our sense of the satisfaction that comes from crafting an elegant, correct and efficient solution to a programming problem.

The ISU catalog description of the course is as follows:

An object-oriented approach to data structures and algorithms using [the] C++ language. Object-oriented programming. Program correctness. Stacks, queues, trees, searching, sorting, analysis of algorithms, graphs, and file processing. Emphasis on writing and running programs. This course is designed for majors. (4 credits).

Object-oriented programming involves the use of abstract data types (C++ classes are an implementation technique for ADTs), message passing (the C++ term is virtual function calls), and inheritance (the C++ term is class derivation). Technically, this class will not use object-oriented programming, but instead will use mainly object-based programming; this means using ADTs, but not message passing or inheritance. Sorry, but the catalog description is inaccurate.

A data structure is a representation (in a computer) for a collection of related information. Examples from Scheme are lists and vectors. A data structure can be used to implement an ADT, which can be thought of as a useful interface to several related data structures. The main ones ADTs discussed in the course are listed in the catalog description: stacks, queues, trees. (We will not cover graphs.)

(The distinction between an ADT and a data structure is that a particular data structure implies particular space and time resource usage, whereas such details are usually suppressed in the specification of an ADT.)

An algorithm is a method for performing some computational task. For example, a Scheme procedure embodies an algorithm, provided it always terminates when called. Many algorithms, such as varieties of searching and sorting algorithms, work with specific ADTs, and thus are best studied together with them and the data structures that implement the corresponding ADTs.

A program is correct when its possible execution sequences are a subset of those given by its specification. It is thus impossible to discuss program correctness without having a specification for the program. Informal specifications will be discussed in the course, as will informal techniques for developing correct programs.

Analysis of algorithms means deriving some measure or measures of the resource usage of a program. A computer scientist would probably phrase it as ``estimating the efficiency of an algorithm.'' The efficiency of an algorithm is often expressed using asymptotic notation (e.g., heap-sort is an O(n log n) time algorithm).

Graphs will not be discussed in this offering of 228. File processing will only be a minor part of the course.

To clarify a common misconception---this is not a course about C++. We will use C++, and you will learn about C++, but C++ is not the focus of the course. The focus is on data structures, and C++ is a good vehicle for that.

Administrative Information

Com S 228, ``Introduction to Programming,'' is usually taken by freshman and sophomores. The class has a ``lecture'' that meets 4 times a week, for 50 minutes a time. It also has a discussion section that meets once a week (for 50 minutes) with a teaching assistant. There are usually 58 or 59 lecture meetings in a semester. The course carries 4 credit hours.

Prerequisites

The formal prerequisite for Com S 228 in the ISU catalog is ``[Com S] 227, [and] credit or enrollment in Math 165.''

One reason for the math corequisite is that both programming and math involve problem solving. You won't get the maximum benefit from Com S 228 if you haven't done enough problem solving. Another reason is that there are various mathematical subjects (e.g., algorithm analysis, logic) that are used in the textbook. Finally, Math 165 is a prerequisite for subsequent courses in the theoretical part of the Computer Science curriculum (e.g., Com S 330).

You should have the following skills from Com S 227:

One important point is that we will assume that you are fully versed in the important topic of recursion.

Objectives

The general objectives for Com S 228 are divided into two parts: a set of essential objectives and a set of enrichment objectives. The essential objectives will be helpful for your career as a computer scientist. You are encouraged to explore the enrichment objectives, both for their own sake and because learning more about those will help deepen your understanding of the essential objectives.

Essential Objectives

In one sentence the essential goal is for you to acquire skills and knowledge in imperative programming and data structures. In more detail, you should:

The goal of this course is to give you a set of design tools (modules, ADTs, specifications as contracts) and computational tools (imperative programming, data structures, and algorithms) and to introduce you to two fascinating areas of Computer Science: the design of software systems, and the design and analysis of data structures and algorithms.

A working knowledge of standard ADTs and data structures is fundamental to both the theory and practice of computing. Most interesting computational problems require the use of some data structure. Implementation data structures are fundamental to most programming languages. By increasing your knowledge of ADTs, such as stacks, queues, lists, sets, trees, etc., you will be able to solve common programming problems quickly, correctly, and in a modular fashion.

ADTs are a key idea behind modern programming practice. An ADT captures the client's view of a data structure, and abstracts it from its possible implementations by a specification. This specification forms the contract between the implementor and the client. Seeing how ADTs are implemented in C++ will help you understand them, and will aid your programming in whatever language you happen to use. Since C++ is a fairly low-level language, learning how to implement ADTs in C++ will help you understand the efficiency (resource) issues involved.

Skills in C++ programming will be important for later in the curriculum and currently seem to be useful in getting certain jobs. However, note that C++ itself (as a language, in all its gory details) is *not* the focus of this course; only the details of C++ that are needed to achieve the other objectives will be taught. On the other hand, this is quite a useful subset of the language.

What is more important, are the skills of object-based design and programming that C++ gives a notation for. The ability to use existing classes is of increasing importance in writing production software, because one has to interface one's code to window systems, database systems, etc., which can all be presented as a collection of existing ADTs. The ability to design ADTs as needed, and to pick from one's knowledge of data structures useful ADTs and implementation techniques for them is important for you if you wish to write larger programs. Finally the ability to creatively develop algorithms is needed to actually compute results using the ADTs and data structures.

Information hiding is important when working with many people on a project, or when developing a larger program. It consists of the ability to compartmentalize the knowledge and decisions that make up a design and implementation.

Notions of time and space complexity are fundamental to the study of data structures and algorithms. Asymptotic notation is the most basic tool used in such analysis. This notation helps you compare data structures and algorithms, and to estimate which may be best for a given situation. Such distinctions are of practical importance for large amounts of data.

The ability to carefully reason about correctness is important in making you a more productive programmer. The time spent in careful design and reasoning is more than repaid in lessening the amount of time needed to debug your code.

Enrichment Objectives

Enrichment objectives could be multiplied endlessly. Listed here are those that would be easy to teach based on the text, or that various instructors might wish to teach.

Textbooks

There are two required texts.

The course texts, a reference manual for C++, and other books related to the course are on reserve at the library.

Syllabus

This syllabus tells when we will be discussing the various topics. The ``when'' is specified below by class meeting numbers (a count of the ``lectures''), and by a separate table (overleaf) that relates class meeting numbers to calendar dates. This syllabus will doubtless be changed, including the exact dates of tests. (Changes will be announced in class.)

In the syllabus, the abbreviation HR below means Headington and Riley's book Data Abstraction and Structures (D.C. Heath, 1994). The abbreviation DD means Deitel and Deitel's book, C++: How to Program (Prentice-Hall, 1994). These books and other reference materials are on reserve at the library.

Meetings Topic 		         Essential Readings  Other Readings
---------------------------------------------------------------
1-2	 Introduction		 
3	 Compiling C++ programs  DD 1.12-13, 1.14-16
4-11	 Scheme to C++		 Handout, HR App. A  DD 1.6-7, 1.17-19, 4.1-5
12-16    Control Abstraction	 HR 1		     DD 2, 3.3-9, 3.11, 3.15
17	 test
16-19	 Modules & Info. Hiding  HR 2		     DD 3.10, 18.5, 17.1-2
19-22	 Data Abstrac. & Classes HR 3		     DD 6, 7.1-4, 7.8, 8
23-26	 Client View of Data S.  HR 4, HR 12.1-3     HR 12.9
26-30	 Records & Variants	 HR 5		     DD 16.2-4, 18.12
31	 test
32-39	 Pointers & Dynamic Data HR 7		     DD 5, 7.5-6
40-43	 Linked Lists		 HR 8		     DD 15.4
44-47	 Design/Impl. of ADTs	 HR 9                DD 15.5-6
48,50-51 Searching & Hashing	 HR 12.4-6
49	 test
51-52	 Sorting (may slight)	 HR 12.7-8	     HR 12.10
52-56    Trees			 HR 13.1-3, 13.6-7   HR 13.4-5, DD 15.7
57	 File I/O		 DD 14.7-11
58-59	 Review and Course Eval
         final exam

Gary T. Leavens
229 Atanasoff Hall
Department of Computer Science, Iowa State University
Ames, Iowa 50011-1040 USA