CS 342 Lecture -*- Outline -*-

* C introduction
    implementation language for Unix
      => Important commercial language
    integrated with other Unix utilities (such as lex and yacc)
      => used in the compiler course (442)
    MOHLL -- machine-oriented high level language

** History and Goals
    CPL (Cambridge, around 1963)
    BCPL (M. Richards, et al, 1967), typeless systems programming lang.
    Algol 68 (1968), typed descendant of Algol 60
    B (K. Thompson, 1970), also typeless
    C (D. Ritchie, 1972; described by Kernighan and Ritchie, 1978)

** Goals
*** systems programming language
    higher level than assembler
    but allowing access to machine (pointers, registers)
    efficiency
*** types and type checking
    to solve technical problems of untyped languages
      e.g., word-oriented typeless languages such as BCPL and B
        + vs. .+ for floating point addition
        addresses vs. integers
        4-8 byte floats
    later: portability, displacing assembly language

** Syntax (terse!)
    like Algol, but with { } instead of begin-end
    and ; is terminator.
      *but don't need ; after }
    no nesting of functions
      more efficient function calls (exercise).
      simplifies implementation
    *base case of statement is expression (expression language)
    switch is case stmt, case is label, have to use break
    while and if have boolean expression in parentheses
    /* comment */
    = is assignment, use == for equality
    % for remainder,
    ++ for increment, prefix or postfix (*give examples)
    +=, -=, etc.
    &&, || short-circuit
    ?: for conditional expression
    , combines expressions: the first is executed for side effect,
      the second gives the value

** Data types
*** void (C is an expression language)
*** char
*** int, short (at least 16 bits), long (at least 32 bits), unsigned
    also (short int = short, long int = long, unsigned int)
    exact sizes depend on the implementation
    distinctions help portability
*** float, double
    also (long float = double) in early C; ANSI C dropped long float
    floats coerced to doubles in expressions (required in K&R C).
*** Arrays and pointers
    pointers, even to stack objects: left and right values, & and *
      can have dangling references
    arrays static, indexes from 0 to bound - 1.
    array names converted to pointer to first element
      a[i] same as *(a + i) same as i[a] (!)
---------
#include <stdio.h>

void arrays_and_pointers()
{
  int i;
  int *e, a[3];

  /* indexing */
  for (i = 0; i < 3; i++) {
    a[i] = i;
  }

  /* explicit use of pointers; &a[3] is one past the end, never dereferenced */
  for (e = a; e < &a[3]; e++) {
    printf("%d\n", *e);
  }
}
-------------
*** structures (struct), like Pascal records
    struct oper { short opcode, addr[3]; } op, im[1000];
    op.opcode, im[30].addr[2];
*** unions, no tag field! (compare to FORTRAN equivalence)
    union ci { char c; int i; } x;
    x.c;
*** functions
    name converted to pointer to function if not being called
    can be placed in data structures
      allows table driven programs
*** strings
    are arrays of characters, terminated by a null character '\0'
    have to watch for dangling references
      (e.g., when returning a string from a function)
    string (char array) variables cannot be initialized in decl
      unless static (K&R C restriction; ANSI C lifts it).
    strcmp - compares strings (lexicographic order, 0 means equal)

** Declarations
    *read as templates for expressions
    int f();
    char *argv[];
      argv is an array of pointers
      ([] and () have higher precedence than *).
    int *fp();     function returning a pointer to an integer
    int (*pf)();   pointer to function returning an integer
    Type defs introduce synonyms.
      typedef double data_word;
      typedef struct { short opcode, addr[3]; } operation;

** Type checking
*** array bounds not part of type (except for storage allocation)
    since array names are converted to pointers
*** structural or name equivalence?
    structural for scalars, pointers, but name for structures!
      however, the names are compared structurally in that
      the same name of a structure means the same thing
      in different scopes
------------
void structural_equivalence()
{
  struct oper { short opcode, addr[3]; };
  typedef struct { short opcode, addr[3]; } operation;
  typedef double data_word;   /* typedef: existing type first, new name last */
  typedef int myint;

  int i; myint j;
  double d; data_word dw;
  struct oper op0;
  struct { short opcode, addr[3]; } op1;
  operation op2;

  j = 3; i = j;      /* myint and int are the same type */
  d = 3.0; dw = d;   /* data_word and double are the same type */
  op2.opcode = 3;
  for (i = 0; i < 3; i++) {
    op2.addr[i] = 0;
  }

  /* following are illegal! (three distinct struct types), so compiled out */
#if 0
  op1 = op2; op0 = op1; op2 = op0;
#endif
}
---------------
*** coercions
    float -> double, also double -> float
    char -> short -> int -> long, also other way
*** explicit casts
    int n; char c;
    n = (int)c - (int)'0';
    often see casts of integers to pointers and vice versa,
      actually helps portability!
      (language has info it needs to do right thing)

    no type checking of function calls
      (types of formals are unknown without ANSI prototypes).
    unions are not checked dynamically (no tags)
    no array bounds checking
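
A small sketch of the coercion, cast, and untagged-union points above. The function names (digit_value, truncate_to_int, first_byte_of) are hypothetical, introduced only for illustration; union ci is the one declared earlier in these notes.

```c
#include <assert.h>

/* Hypothetical helper: explicit casts, as in n = (int)c - (int)'0'. */
int digit_value(char c)
{
  assert(c >= '0' && c <= '9');   /* only meaningful for decimal digits */
  return (int)c - (int)'0';
}

/* Hypothetical helper: implicit coercion double -> int ("also other way").
   The fractional part is discarded without any warning being required. */
int truncate_to_int(double d)
{
  int k = d;
  return k;
}

/* The union from the notes: no tag field records which member is live. */
union ci { char c; int i; };

/* Hypothetical helper: write member i, then read member c.
   This compiles and runs with no dynamic check; for nonzero n the
   result depends on the machine's byte order. */
char first_byte_of(int n)
{
  union ci x;
  x.i = n;
  return x.c;
}
```

For example, digit_value('7') yields 7 and truncate_to_int(3.7) yields 3. first_byte_of(0) yields 0 on any machine, but for other arguments the answer varies with byte order, which is exactly the kind of error that a tag field would let a program catch.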