Computer Science 202 Lecture Summaries, Fall 2016


Discussion of the syllabus, the role of the class in the Computer Science major, comparison with Math 244.

Two knights and knaves problems. Ahmose Papyrus fractions: 2/7 = 1/4 + 1/28 and 3/7 = 1/4 + 1/7 + 1/28 or 3/7 = 1/3 + 1/11 + 1/231. Development of a recursive greedy algorithm to compute a representation of p/q as a finite sum of distinct unit fractions. Questions: Does the algorithm always terminate? Does it always produce distinct unit fractions? Does it produce the minimum number of terms? Can 3/7 be represented as a sum of two distinct unit fractions?
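The recursive greedy algorithm can be sketched in Python (a minimal version; the function name `egyptian` and the use of exact rational arithmetic are choices of this sketch, not from the lecture). Note that the greedy choice produces the second of the two representations of 3/7 above:

```python
from fractions import Fraction

def egyptian(p, q):
    """Greedy algorithm: repeatedly subtract the largest unit fraction
    1/d that fits (d is the ceiling of q/p), until the remainder is
    itself a unit fraction."""
    r = Fraction(p, q)
    denominators = []
    while r.numerator != 1:
        d = -(-r.denominator // r.numerator)  # ceiling division in integers
        denominators.append(d)
        r -= Fraction(1, d)
    denominators.append(r.denominator)
    return denominators

# Reproduces the examples above:
assert egyptian(2, 7) == [4, 28]        # 2/7 = 1/4 + 1/28
assert egyptian(3, 7) == [3, 11, 231]   # 3/7 = 1/3 + 1/11 + 1/231
```

Running it on small cases suggests answers to some of the questions (the denominators it produces are strictly increasing, hence distinct), but termination and minimality still need proof.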

The first reading assignment and problem set are available.


Various uses of logic in computer science: specifying and analyzing circuits in terms of AND, OR, NOT gates, specifying Boolean conditions in "if" or "while" statements in programming languages, expressing correctness and running time conditions of algorithms and programs, and others.

Propositional logic (Reading: Chapter 2). Propositions, true (1), false (0), propositional variables (p, q, r, ...), logical operators for not, and, inclusive or, exclusive or, implication, biconditional and their truth tables, a recursive (or inductive) definition of propositional formulas, precedence (or order) of operators.

An example problem (problem 2 from hw #1, Fall 2015) and solution, in conjunction with Polya's "How to Solve It" questions and advice. (Extras of handout available outside my office, 414 AKW.)


Propositional logic continued. (Reading: Chapter 2). Truth tables for more complex formulas, for example, (not ((p implies q) and (q implies r))), tautology, contradiction, contingency, satisfiable, logical equivalence of two propositional formulas, and testing logical equivalence using a truth table, logical equivalence of (p implies q) and ((not p) or q), De Morgan's Laws, distributive laws, the law of double negation (see Tables 2.2 and 2.3 in the text), P and Q are logically equivalent if and only if (P iff Q) is a tautology, proof that (((p implies q) and (q implies r)) implies (p implies r)) is a tautology (this is the transitive property of implication). The logical rule Modus Ponens (based on the tautology (((p implies q) and p) implies q)) and the fallacies of affirming the consequent and denying the antecedent.
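Truth-table checks like these are easy to mechanize. A small Python sketch (the helper function `implies` is this sketch's own notation) verifies the transitivity tautology and the Modus Ponens tautology by brute force over all truth assignments:

```python
from itertools import product

def implies(a, b):
    """(a implies b) is logically equivalent to ((not a) or b)."""
    return (not a) or b

# ((p implies q) and (q implies r)) implies (p implies r)
# is a tautology: true under all 2^3 assignments.
assert all(
    implies(implies(p, q) and implies(q, r), implies(p, r))
    for p, q, r in product([False, True], repeat=3)
)

# Modus Ponens: ((p implies q) and p) implies q.
assert all(
    implies(implies(p, q) and p, q)
    for p, q in product([False, True], repeat=2)
)
```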

Predicate logic (Reading: Ch 2). In propositional logic, the statement "All persons are mortal" is either true or false, and cannot be broken down further. In predicate logic we can make a model of the parts of this statement by defining predicate symbols P(x) for "x is a person" and M(x) for "x is mortal", and then representing the statement as the formula (forall x)(P(x) implies M(x)). Predicates may have more than one argument, for example, T(x,y) for "x is taller than y" or B(x,y,z) for "x is between y and z". There may also be function symbols. In the domain of natural numbers (0, 1, 2, ...), we can define a function symbol s(x) to be x+1 or a function symbol m(x,y) to be (x * y). A function takes one or more elements of the domain (natural numbers) and returns an element of the domain (a natural number). A predicate takes one or more elements of the domain (natural numbers) and returns a truth value: true (1) or false (0). In this domain we could define the predicate symbol D(x,y) for "x divides y", that is, "y is a multiple of x". Then a formula asserting that x divides y if and only if there exists a number c such that cx = y could be written (forall x)(forall y)(D(x,y) iff (exists c)(y = m(c,x))).


Predicate logic continued. We considered a domain consisting of 9 elements (triangle, circle or square) arranged in a 3x3 grid. We introduced constants a,b,c,d,e,f,g,h,i to refer to the individual elements, and unary predicates T(x) = "x is a triangle", C(x) = "x is a circle", S(x) = "x is a square". This allowed us to write ground (no variables) atomic formulas like C(a) (which was false in our domain) and (C(a) or C(b)) (which was true in our domain.) We also introduced a binary predicate N(x,y) = "x and y are next to each other (up or down, left or right)", which allows us to write ground atomic formulas like N(b,e) (which is true in our domain.) Finally, we introduced a unary function v(x) = "the element directly above x, or (wrapping around) if x is in the top row, the element in the same column but the bottom row." Thus v(e) = b, and v(v(e)) = h. The signature of a system is the list of constants and the predicate and function symbols and their "arities" (the number of arguments they take.)

We inductively define terms as constants or variables, or functions applied to the correct number of terms as arguments. So x, y, b, v(x), v(v(b)), and so on are terms, which can represent individual elements of the domain. (If we had a binary function symbol h, then another term would be v(h(h(a,v(v(b))),h(x,v(y)))).)

An atomic formula is a predicate applied to the correct number of terms. A predicate takes a fixed number of individual elements of the domain and returns a truth value (1 or 0). Examples of atomic formulas: T(a), N(x,v(v(x))). Each ground (no variables) atomic formula is either true (1) or false (0) in the domain. With atomic formulas we make the leap from domain elements to truth values. Atomic formulas can be combined using the logical operators and quantified with "for all" and "there exists" to yield more complex formulas.

An inductive definition of formulas: every atomic formula is a formula, and if F and G are formulas, so are (F and G), (F or G), (F xor G), (F implies G) and (F iff G), as well as (forall x)(F) and (exists x)(F) for any variable x.

We proceeded to translate statements into formulas. (1) "No triangle is a square." Translations were the logically equivalent formulas (not (exists x)(T(x) and S(x))), alternatively, (forall x)(not (T(x) and S(x))). (2) "Every triangle is next to a circle." Translations were the logically equivalent formulas (forall x)(exists y)(T(x) implies (C(y) and N(x,y))), alternatively, (forall x)(T(x) implies (exists y)(C(y) and N(x,y))). In the case of (2), we convinced ourselves that the following formulas were NOT correct translations: (3) (forall x)(exists y)((T(x) and C(y)) implies N(x,y)), (4) (forall x)(exists y)(T(x) and C(y) and N(x,y)).

De Morgan's Laws for quantifiers. In our finite domain, where we have a name for every element of the domain, we could express "There is a circle" by (exists x)(C(x)) or by the disjunction (C(a) or C(b) or C(c) or C(d) or C(e) or C(f) or C(g) or C(h) or C(i)). Intuitively, we can think of (exists x) as abbreviating a giant "or". Similarly, (forall x) can be thought of as abbreviating a giant "and" saying that the statement holds for each of the individual elements. If we negate (exists x)(C(x)), we could equivalently negate the giant "or" and then apply De Morgan's law (for negating an "or") to get (not(C(a)) and not(C(b)) and not(C(c)) and ... and not(C(i))), which is equivalent to (forall x)(not(C(x))). So De Morgan's laws apply to quantified formulas, giving us (not(exists x)(C(x))) is equivalent to (forall x)(not(C(x))) and, dually, (not(forall x)(C(x))) is equivalent to (exists x)(not(C(x))). (This justifies the equivalence of the two formulas for (1) in the previous paragraph.)
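Because the domain is finite, De Morgan's laws for quantifiers can be checked directly in Python, with "exists" as `any` and "forall" as `all`. The particular assignment of shapes to a through i below is a hypothetical example, not the grid used in lecture:

```python
# A hypothetical 3x3 grid assignment (not necessarily the lecture's).
shape = {'a': 'triangle', 'b': 'circle', 'c': 'square',
         'd': 'circle',   'e': 'square', 'f': 'triangle',
         'g': 'square',   'h': 'circle', 'i': 'triangle'}
domain = set(shape)
C = lambda x: shape[x] == 'circle'

# not(exists x)(C(x)) is equivalent to (forall x)(not C(x)):
assert (not any(C(x) for x in domain)) == all(not C(x) for x in domain)
# and dually, not(forall x)(C(x)) is equivalent to (exists x)(not C(x)):
assert (not all(C(x) for x in domain)) == any(not C(x) for x in domain)
```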


Proof. The ancient Greeks are credited with introducing the axiomatic method. Euclid's Elements is a compendium of the mathematics of the time (mostly geometry) starting with axioms and rules of inference, and building up a large collection of theorems, lemmas, corollaries.

The "If you give a mouse a cookie .." example proof from the notes. The inference rule Modus Ponens is justified by the tautology ((P and (P implies Q)) implies Q).

The "turnstile" relation between a sequence of premises P_1, P_2, ..., P_n and a conclusion Q asserts that there exists a proof of the conclusion from the premises. What is its relationship to the claim that the premises logically imply the conclusion? A proof system is *sound* if the existence of a proof implies that the conclusion is a logical consequence of the premises. A proof system is *complete* if whenever the conclusion is a logical consequence of the premises, there exists a proof of that fact. Unsound proof systems can typically prove everything, and are undesirable. Not many proof systems are complete (an example is given by a suitable set of axioms for the equivalence of propositional formulas with an inference rule of substitution of equivalent formulas), and some important ones are provably incomplete (an example is the nonnegative integers with addition and multiplication; see Gödel's first incompleteness theorem.)

In the domain of nonnegative integers (0, 1, 2, 3, ...) we define divides(a,b) iff there exists a nonnegative integer d such that b = d * a, where * represents multiplication. Examples: divides(3,12) is true, divides(3,11) is false, and divides(0,0) is true. Lemma: if a divides b and a divides c then a divides (b+c). Two proofs were given: an informal proof and a more formal proof. The analogy: an algorithm in pseudocode is to a program implementing the algorithm as an informal proof is to a formal (say, natural deduction) version of that proof.

We define prime(n) iff n is not equal to 1 and the only nonnegative integers that divide n are 1 and n. This definition can be expressed by the following predicate logic formula: (forall n)(prime(n) iff (not(n=1) and (forall x)(divides(x,n) implies ((x=1) or (x=n))))). Theorem: There is no largest prime. The statement in this theorem can be expressed by the following predicate logic formula: (forall n)(prime(n) implies (exists x)(prime(x) and (x > n))). See homework #2 for more on this topic.


Set Theory (Chapter 3). We will focus on Naive Set Theory, as opposed to Axiomatic Set Theory. A set is a collection of elements -- neither order nor multiplicity matters. The axiom of extensionality says that two sets are equal iff they have exactly the same elements. In practice that means that to prove two sets A and B are equal, we show that every element of A is an element of B and every element of B is an element of A. See the text for notation for "x is an element of S" and "x is not an element of S".

Constructing sets. There is a unique empty set, which has no elements. We can specify a set by explicitly listing its elements between curly braces, for example, {1,4,9}. Sets can be elements of sets, for example, A = {{}, {1,3}, 9, 15} is a set of four elements: (1) the empty set {}, (2) the set containing the elements 1 and 3, (3) 9, and (4) 15. Note that while {} and {1,3} are elements of A, 3 is not an element of A. See the text for set builder notation and the set of natural numbers.

A binary relation between sets. A is a subset of B iff every element of A is an element of B. B is a superset of A iff A is a subset of B. Note that A = B iff A is a subset of B and B is a subset of A. In practice, this means that one good approach to proving A = B is to prove two things: (1) A is a subset of B and (2) B is a subset of A. Venn diagrams for A is a subset of B, and also A is not a subset of B. See the text for the notation for subset and superset.

New sets from old. Given sets A and B, there are operations of union, intersection and set difference to construct new sets. (A union B) is the set of all x such that x is an element of A or x is an element of B, or both. (A intersect B) is the set of all x such that x is an element of A and x is an element of B. (A setminus B) is the set of all x such that x is an element of A and x is not an element of B. Venn diagrams for these.

Proof that A = (A intersect B) union (A setminus B). (This is in the notes.)

Construction of the natural numbers {0, 1, 2, 3, ...} from the empty set. Identify the number 0 with the empty set {}, which has 0 elements. Identify the number 1 with the set {0} = {{}}, which has 1 element. Identify the number 2 with the set {0,1} = {{},{{}}}, which has 2 elements. Identify the number 3 with the set {0,1,2} = {{},{{}},{{},{{}}}}, which has 3 elements. In general, assuming we have constructed the set S_n for the natural number n, the way we get the set S_{n+1} for the natural number n+1 is to add the element n to the set S_n, which we do as follows: S_{n+1} = (S_n union {S_n}). One advantage of this construction is that the set representing n has exactly n elements, namely the sets representing 0, 1, ..., n-1.
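The construction S_{n+1} = (S_n union {S_n}) can be carried out literally with Python frozensets (a sketch; the function name `von_neumann` is this sketch's label for the construction):

```python
def von_neumann(n):
    """Build the set S_n: start from S_0 = {} and repeatedly apply
    S_{k+1} = S_k union {S_k}. The result has exactly n elements."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}
    return s

assert len(von_neumann(0)) == 0
assert len(von_neumann(3)) == 3
# S_1 = {S_0} contains the empty set as its single element:
assert von_neumann(0) in von_neumann(1)
```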


Set Theory, continued (Chapter 3). Question: What is the set {x | x is not an element of x}?

Informally, |A| is the cardinality or the number of elements of the set A. This accords with your previous experience when A is finite. We'll make it more precise at the end of the lecture.

The power set of a set A is P(A) = {B | B is a subset of A}. Example of P(A) for A = {2,3,5}. In this case, P(A) has 8 = 2^3 elements; in general, for a finite set A, P(A) has 2^|A| elements. Representing the subsets of A as characteristic vectors, binary vectors with 0 or 1 for each element of A.
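The characteristic-vector representation gives a direct way to enumerate the power set in code. A short Python sketch (the function name is mine), where bit i of a counter plays the role of the 0/1 entry for the i-th element:

```python
def power_set(A):
    """Enumerate all subsets of A via characteristic vectors: bit i of
    the counter v says whether the i-th element belongs to the subset."""
    elems = list(A)
    return [{elems[i] for i in range(len(elems)) if (v >> i) & 1}
            for v in range(2 ** len(elems))]

subsets = power_set({2, 3, 5})
assert len(subsets) == 8          # 2^|A| subsets for |A| = 3
assert set() in subsets           # the empty set is a subset
assert {2, 3, 5} in subsets       # A itself is a subset
```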

Ordered pairs. We denote the ordered pair of a and b by (a,b). The fundamental property we want is that (a,b) = (c,d) iff a = c and b = d. Kuratowski's representation of the ordered pair (a,b) by the set {{a},{a,b}}. For example, (3,5) is represented by {{3},{3,5}} and (3,3) is represented by {{3},{3,3}} = {{3}} (because multiplicity of membership doesn't matter for sets.)

The Cartesian product of two sets, A x B, is the set of of all ordered pairs (a,b) such that a is an element of A and b is an element of B. Example of the Cartesian product of {3,5} and {1,2,5}. |A x B| is |A|*|B|, the product of |A| and |B|, for finite sets A and B. A (binary) relation on sets A and B is any subset of A x B. Example of a binary relation on {3,5} and {1,2,5}: {(3,1),(5,1),(5,5)}. A diagram of this relation with two dots on the left (labeled with 3 and 5) and three dots on the right (labeled with 1, 2, and 5), and three arrows: one from 3 on the left to 1 on the right, one from 5 on the left to 1 on the right, and one from 5 on the left to 5 on the right, representing the three ordered pairs in the relation: (3,1), (5,1) and (5,5). The number of binary relations on A and B is 2^(|A|*|B|), for finite A and B, combining the results on the size of A x B and the number of subsets of a set.

A function with domain A and co-domain B is a binary relation f on A and B that satisfies two additional properties: (1) For every element a of A there exists an element b of B such that (a,b) is an element of f, and (2) For no element a of A are there two elements b1 and b2 of B that are not equal to each other and are such that (a,b1) and (a,b2) are both elements of f. If f is a function, we write f(a) = b instead of (a,b) is an element of f. The quantifier "There exists one and only one" (notation: standard existential quantifier with an exclamation point after it) and expressing it using the quantifiers of "there exists" and "for all". Notation f:A -> B for "f is a function with domain A and co-domain B".

Examples of binary relations on A and B that violate one of the two conditions. Illustration of the conditions in terms of graphs of functions with domain = co-domain = the real numbers.

Properties functions may have. A function f with domain A and co-domain B is injective (one-to-one) iff there do not exist elements a1 and a2 of A such that a1 is not equal to a2 and f(a1) = f(a2). In terms of the diagram of f with dots and arrows: no element on the right (in B) has more than one arrow arriving at it. (So elements on the right may have zero or one arrows arriving.) A function f with domain A and co-domain B is surjective (onto) iff for every element b of B there exists at least one element a of A such that f(a) = b. In terms of the diagram of f: every element on the right (in B) has at least one arrow arriving at it. A function f with domain A and co-domain B is bijective (a one-to-one correspondence) iff it is both injective and surjective.
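Treating a function literally as a set of ordered pairs, the two function conditions and the injective/surjective properties can all be tested mechanically. A Python sketch (helper names are mine; the example functions are hypothetical):

```python
def is_function(R, A, B):
    """R (a set of pairs) is a function A -> B iff every a in A appears
    as a first coordinate with exactly one second coordinate in B."""
    return (all(len({b for (x, b) in R if x == a and b in B}) == 1 for a in A)
            and all(a in A and b in B for (a, b) in R))

def is_injective(f):
    """No element of the co-domain receives two arrows."""
    return len({b for (_, b) in f}) == len(f)

def is_surjective(f, B):
    """Every element of the co-domain receives at least one arrow."""
    return {b for (_, b) in f} == B

A, B = {1, 2, 3}, {'x', 'y', 'z'}
f = {(1, 'x'), (2, 'y'), (3, 'z')}
assert is_function(f, A, B) and is_injective(f) and is_surjective(f, B)

g = {(1, 'x'), (2, 'x'), (3, 'y')}   # neither injective nor surjective
assert is_function(g, A, B) and not is_injective(g) and not is_surjective(g, B)
```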

Cardinality. Two sets A and B have the same cardinality iff there is a bijective function f with domain A and co-domain B. This works as you would expect for finite sets: all sets with 17 elements have the same cardinality, but it has the advantage that it also covers infinite sets and gives a useful notion of the various sizes of infinite sets.


Set Theory, continued (Chapter 3). Question: What is the set {x | x is not an element of x}?

Review of function, domain, co-domain, injective, surjective, and bijective. Two sets A and B have the same cardinality iff there is a bijection f:A -> B. Construction of a bijection h from the natural numbers to the integers, showing that the natural numbers and the set of all integers have the same cardinality. The set A is countable iff it has the same cardinality as some subset of the set of natural numbers. Thus, finite sets are countable, but so are the set of natural numbers and the set of integers.

There are uncountable infinite sets. One example is the set of all infinite binary sequences B. If we have a function g whose domain is the natural numbers and whose co-domain is B, then there exists an infinite binary sequence a_0, a_1, a_2, a_3, ... in B that is not equal to g(n) for any natural number n. We can define such an infinite binary sequence by taking a_i to be 1 if the bit at index i of g(i) is 0, and a_i to be 0 otherwise. The sequence defined this way (by Cantor diagonalization) differs from every sequence g(n) in at least one entry, and so is not equal to g(n) for any natural number n. Thus, there is no bijection from the natural numbers to B, and B is uncountably infinite.
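The diagonal construction can be illustrated for any finite prefix. In the Python sketch below, the listing g is modeled (a modeling choice of this sketch) as a two-argument function g(i, j) returning bit j of the i-th listed sequence; the particular g used is a hypothetical example:

```python
def diagonal(g, n):
    """First n bits of the diagonal sequence: a_i = 1 - g(i, i), so the
    result differs from the i-th listed sequence at index i."""
    return [1 - g(i, i) for i in range(n)]

# A hypothetical listing: sequence i has bit j equal to bit j of i's
# binary representation.
g = lambda i, j: (i >> j) & 1
a = diagonal(g, 8)

# a disagrees with each of the first 8 listed sequences at its own index:
assert all(a[i] != g(i, i) for i in range(8))
```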


Induction and Recursion (Chapter 5). Question: What is the set {x | x is not an element of x}?

Proof by mathematical induction. Types: simple, strong, structural. Example of proof that the sum of the first n positive integers is n(n+1)/2. The schema for induction on the natural numbers, and the generalization to the set of all integers greater than or equal to some integer z. Example of proof that for all n greater than or equal to 4, 2^n is greater than or equal to n^2. The role of properties of "greater than or equal to" in such proofs (eg, the axiom of scaling invariance from Chapter 4.)

The ladder metaphor for induction: P(0) for "I can get on the first rung (0) of the ladder", and (forall n)(P(n) -> P(n+1)) for "If I can get on some rung of the ladder (n), I can get on the next rung of the ladder (n+1)". If both are true, then I can get on any rung of the ladder by getting on the first rung of the ladder and then the next, and the next, and the next, and so on, until I reach my destination rung.

Strong induction. The schema of strong induction, and comparison with simple induction. Example of using strong induction to prove that every integer greater than or equal to 2 is divisible by a prime.

Structural induction (Section 5.6 of the text.) Recursive definition of complete binary trees, and observation that the number of leaves of a complete binary tree is one more than the number of internal nodes. This observation is proved by structural induction in Section 5.6.


Induction and Recursion (Chapter 5).

Recursive definition of the Fibonacci function F with domain and co-domain the natural numbers. The base cases are F(0) = 1 and F(1) = 1, and the recursive case is F(n) = F(n-1)+F(n-2) for all natural numbers n greater than or equal to 2.

Proof by strong induction that for all natural numbers n, F(n) is greater than or equal to 2 raised to the power floor(n/2).
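The definition and the bound can be spot-checked in Python (an iterative version of F for speed; the finite check is a sanity test, not a substitute for the strong-induction proof):

```python
def fib(n):
    """F(0) = F(1) = 1 and F(n) = F(n-1) + F(n-2), computed iteratively."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert [fib(n) for n in range(8)] == [1, 1, 2, 3, 5, 8, 13, 21]

# Spot-check the claim F(n) >= 2^floor(n/2) for small n:
assert all(fib(n) >= 2 ** (n // 2) for n in range(30))
```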

False "proof" that all natural numbers are both even and odd. In fact, if P(n) is the predicate that n is both even and odd, we can prove that for all natural numbers n, P(n) implies P(n+1). The reason this is not a correct proof by induction is because there is no base case.

Proof by structural induction that for all complete binary trees T, the number of leaves of T is one more than the number of internal nodes of T. This proof is also in Section 5.6.3 of the text, expressed more tersely.


Summation notation (Chapter 6) and Asymptotic notation (Chapter 7).

Expressions using summation notation for the sum of the first n positive integers, and for the sum of the first n odd positive integers. Analogy to a for loop to accumulate a sum of terms.

Recursive definition of the sum for i=m to n of f(i). The base case (when m is greater than n) is 0. Otherwise, it is f(m) plus the sum for i=(m+1) to n of f(i). Lemma 6.1.1, the linearity of summation. Application of Lemma 6.1.1 (linearity) to prove that the sum of the first n positive odd integers is n^2, using the fact that the sum of the first n positive integers is n(n+1)/2. (Two different proofs were offered.)
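The recursive definition translates line for line into Python (a sketch; the function name `summation` is mine), and the claim about the first n odd integers can be checked numerically:

```python
def summation(m, n, f):
    """Sum of f(i) for i = m to n: 0 when m > n (the base case),
    otherwise f(m) plus the sum for i = m+1 to n."""
    return 0 if m > n else f(m) + summation(m + 1, n, f)

n = 10
assert summation(1, n, lambda i: i) == n * (n + 1) // 2   # first n positives
assert summation(1, n, lambda i: 2 * i - 1) == n ** 2     # first n odds
```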

Example of double sum: summing up (i^2)*j for i = 1 to 3 and j = 1 to 4. Relation to row sums and column sums, and interchanging the two sums. The distributive law on steroids: (x1+x2)*(y1+y2+y3) = x1*y1+x1*y2+x1*y3+x2*y1+x2*y2+x2*y3, and its generalization to a sum of m values xi and n values yj.

Generalizations of summation notation: over sets of indices, to products, ands, ors, intersections and unions.

Asymptotic notation: motivations, definition of g(n) is in O(f(n)), proof that g(n) = n^2+9n-1 is in O(n^2). Upper bound ("big Oh"), lower bound ("big Omega"), and both ("big Theta"). Please see Section 7.1 for definitions, and the rest of Chapter 7 for more details.


Asymptotic notation (Chapter 7) and Number theory (Chapter 8).

Proof that g(n) = floor(n/3) is in Omega(n); note that the positive constant c may be a real number less than 1. Proof that n^3 is not in O(n^2). In this case, you must show that for all positive real numbers c and N, there exists a value of n greater than N such that |n^3| is greater than c|n^2|.

Number theory. An important application of number theory in computer science is the RSA encryption method. This depends on the fact that although testing a large integer for primality may be done efficiently (at least by a randomized algorithm), factoring a large integer appears not to have an efficient algorithm. Definitions of m divides n (notation: m|n), n is prime, n is composite. Statement and start of proof of "the division algorithm", Theorem 8.1.1.


Number theory (Chapter 8).

Proof by strong induction that for all positive integers m and all nonnegative integers n, there exist nonnegative integers q and r such that n = qm + r and r is less than m. (This is part of the proof of Theorem 8.1.1, "the division algorithm.") Notation of (n mod m) for the remainder r.

Definition of the greatest common divisor of natural numbers m and n, not both 0. Notation gcd(m,n). Algorithm for computing gcd(m,n) based on finding the prime factorizations of m and n ("SLOW"). Euclid's algorithm: gcd(m,n) = n if m is 0, and gcd((n mod m),m) otherwise ("FAST"). Proof that if d is a divisor of m and n, then d is a divisor of (n mod m) and m. Discussion of when we consider number theory algorithms "SLOW" or "FAST".

Attempt to find inputs to gcd(m,n) that result in a large number of recursive calls, working towards a proof that Euclid's gcd algorithm is in fact "FAST". Defining the Fibonacci numbers by F(0) = F(1) = 1, and F(n) = F(n-1)+F(n-2) for integers n greater than 1, we get the sequence: 1,1,2,3,5,8,13,21,34,..., for which we observed that each consecutive pair has a gcd of 1 (i.e., the two numbers are relatively prime, or co-prime). Conjecture: for all natural numbers n, gcd(F(n),F(n+1)) = 1. Start of proof of this.
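Euclid's algorithm exactly as defined above, together with a check of the consecutive-Fibonacci observation, in a short Python sketch:

```python
def gcd(m, n):
    """Euclid's algorithm: gcd(m, n) = n if m is 0,
    and gcd(n mod m, m) otherwise."""
    return n if m == 0 else gcd(n % m, m)

assert gcd(0, 7) == 7
assert gcd(48, 68) == 4

# Consecutive Fibonacci numbers (1, 1, 2, 3, 5, 8, ...) are co-prime;
# they are also the classic inputs forcing many recursive calls.
fibs = [1, 1]
while len(fibs) < 15:
    fibs.append(fibs[-1] + fibs[-2])
assert all(gcd(a, b) == 1 for a, b in zip(fibs, fibs[1:]))
```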


Number theory (Chapter 8).

Review of Theorem 8.1.1 (the Division Algorithm) and Euclid's algorithm for the gcd. Proof of partial correctness of Euclid's gcd algorithm, and proof of total correctness. Distinction of "bad" (polynomial in the value of n), "good" (polynomial in log(n), the number of digits of n) and "better" (polynomial in log(log(n))) algorithms on numbers n.

Modular arithmetic, congruence modulo m, addition, subtraction, multiplication and division (i.e., multiplicative inverses) modulo m. An extension of Euclid's algorithm that computes not only d = gcd(m,n), but also integers a and b such that d = a*m + b*n. Example: gcd(48,68) returning the answer 4 and integers -7 and 5 such that 4 = (-7)*48 + 5*68. (See also Example 8.1 in the text.) If gcd(m,n) = 1, then the extended Euclidean algorithm returns integers a and b such that a*m + b*n = 1, so a*m is congruent to 1 modulo n (and, similarly, b*n is congruent to 1 modulo m). This allows us to find the multiplicative inverse of n modulo m, provided gcd(m,n) = 1.
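One recursive formulation of the extended algorithm, sketched in Python (a standard version, not necessarily the one presented in lecture), reproduces the gcd(48,68) example above:

```python
def ext_gcd(m, n):
    """Return (d, a, b) with d = gcd(m, n) and d = a*m + b*n."""
    if m == 0:
        return n, 0, 1
    d, a, b = ext_gcd(n % m, m)
    # Here d = a*(n mod m) + b*m, and n mod m = n - (n // m)*m, so
    # d = (b - (n // m)*a)*m + a*n.
    return d, b - (n // m) * a, a

d, a, b = ext_gcd(48, 68)
assert (d, a, b) == (4, -7, 5)          # 4 = (-7)*48 + 5*68
assert a * 48 + b * 68 == 4

# Multiplicative inverse of 13 modulo 18 (gcd(13, 18) = 1):
d, a, b = ext_gcd(13, 18)
assert d == 1 and (a % 18) * 13 % 18 == 1
```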


Number theory (Chapter 8) concluded; Relations (Chapter 9)

An example of using the extended Euclidean algorithm to compute gcd(13,18) = 1 and integers 7 and -5 such that 1 = 7*13 + (-5)*18. This gives us that the multiplicative inverse of 13 modulo 18 is 7, and also that the multiplicative inverse of 18 (or 5) modulo 13 is (-5) (or 8). Check: (7*13 mod 18) = 1 and (8*5 mod 13) = 1.

Using multiplicative inverses to solve equations like ax + b = c (mod m). Systems of several such congruences: the Chinese Remainder Theorem (see the text). Application to using a vector of moduli to represent large integers. More material at the end of Chapter 8: Euler's theorem and a description of RSA encryption and decryption (for which you have all the ingredients.)
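Solving a single congruence with a multiplicative inverse, plus a two-modulus Chinese Remainder instance, sketched in Python. (This sketch uses Python's built-in modular inverse, pow(a, -1, m), available in Python 3.8+; the specific numbers are illustrative choices.)

```python
def solve_linear(a, c, m):
    """Solve a*x = c (mod m), assuming gcd(a, m) == 1:
    multiply both sides by the inverse of a modulo m."""
    return pow(a, -1, m) * c % m

x = solve_linear(13, 4, 18)
assert 13 * x % 18 == 4

# Chinese Remainder instance: find x with x = 2 (mod 5) and x = 3 (mod 7).
# Combine using inverses; the solution is unique modulo 5*7 = 35.
x = (2 * 7 * pow(7, -1, 5) + 3 * 5 * pow(5, -1, 7)) % 35
assert x % 5 == 2 and x % 7 == 3
```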

Relations. Binary relations from A to B, n-ary relations on A1,A2,...,An, and binary relations on A. Applications include relational databases (see SQL). Examples of binary relations on the natural numbers: (1) the set of all (m,n) such that m=n, (2) the set of all (m,n) such that m is less than n, (3) the set of all (m,n) such that m divides n. Representing finite relations using directed graphs and binary matrices.


Relations (Chapter 9) continued.

Review of definitions of binary relation from A to B and binary relation on A. Definitions of the composition of two relations and the inverse of a relation. Possible properties of binary relations on a set A: (1) reflexive, (2) symmetric, (3) antisymmetric, (4) transitive, and what they mean in terms of the directed graph and binary matrix representations of binary relations on A.

Examples of binary relations on A = {1,2,3,4,5,6,7}. R1 is the set of (a,b) such that a divides b, R2 is the set of (a,b) such that |a-b| is at most 1, and R3 is the set of (a,b) such that (a mod 3) = (b mod 3). All three relations are reflexive. R1 is not symmetric ((2,4) is in R1 but (4,2) is not in R1), while R2 and R3 are symmetric. R1 is antisymmetric, while R2 and R3 are not antisymmetric ((3,4) and (4,3) are both in R2, but 3 is not equal to 4; similarly, (1,4) and (4,1) are both in R3, but 1 is not equal to 4). R1 and R3 are transitive, while R2 is not transitive ((2,3) is in R2 and (3,4) is in R2, but (2,4) is not in R2).
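All of these property checks can be done by brute force over the finite set A. A Python sketch (the property-checker names are mine) confirms the classifications of R1, R2 and R3:

```python
A = range(1, 8)
R1 = {(a, b) for a in A for b in A if b % a == 0}        # a divides b
R2 = {(a, b) for a in A for b in A if abs(a - b) <= 1}
R3 = {(a, b) for a in A for b in A if a % 3 == b % 3}

def reflexive(R):     return all((a, a) in R for a in A)
def symmetric(R):     return all((b, a) in R for (a, b) in R)
def antisymmetric(R): return all(a == b for (a, b) in R if (b, a) in R)
def transitive(R):    return all((a, c) in R for (a, b) in R
                                 for (b2, c) in R if b == b2)

assert all(reflexive(R) for R in (R1, R2, R3))
assert not symmetric(R1) and symmetric(R2) and symmetric(R3)
assert antisymmetric(R1) and not antisymmetric(R2) and not antisymmetric(R3)
assert transitive(R1) and not transitive(R2) and transitive(R3)
```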

Equivalence relations. A binary relation on A is an equivalence relation if it is reflexive, symmetric, and transitive. Of the three example relations in the preceding paragraph, R3 is an equivalence relation. Definitions of equivalence classes and a partition of a set. Theorem 9.4.1 gives three characterizations of an equivalence relation on A: (1) the definition above, (2) in terms of a partition of A, and (3) in terms of a function f:A -> B. (Note that R3 above is defined using the function f(x) = (x mod 3).)

Partial orders. A partial order on a set A is a binary relation on A that is reflexive, antisymmetric, and transitive. Examples: the relation R4, the set of all pairs (a,b) of positive integers less than or equal to 7 such that a is less than or equal to b. Similarly, R5, defined as for R4 but substituting "greater than or equal to" for "less than or equal to". (The relation "less than" doesn't work, because it is not reflexive.) Another example: the relation R1 above.

If R is a partial order on A, then elements a and b are comparable in R iff (a,b) is in R or (b,a) is in R; otherwise, they are incomparable. The elements 2 and 6 are comparable in the partial order R1 because 2 divides 6. The elements 2 and 5 are incomparable in the partial order R1 because 2 does not divide 5 and 5 does not divide 2. A partial order is a total order if every pair of elements is comparable. Representing a finite partial order using a Hasse diagram.


Relations (Chapter 9) continued.

Review: partial orders and representation of finite partial orders using Hasse diagrams, comparable and incomparable elements of a partial order. Generalizing the infix "less than or equal to" binary predicate to represent an arbitrary partial order. A total order is a partial order in which every pair of elements is comparable. Minimal, minimum, maximal and maximum elements of a partial order.

Examples: the partial order on 201, 202, 223, 323, 365 induced by the pre-requisite structure on these CPSC courses, the partial order on the set of subsets of {1,2,3} defined by the relation "A is a subset of B". Theorem 9.5.2: Every partial order on a finite set has a total extension. The topological sort algorithm to find a total extension of a partial order by repeatedly removing a minimal element of the ordering. Remarks on implementing it efficiently. The need for Lemma 9.5.1: Every nonempty finite partially ordered set has a minimal element, and the beginning of its proof by induction.
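The repeatedly-remove-a-minimal-element algorithm can be sketched in Python. This is a deliberately simple quadratic version, not the efficient implementation discussed in lecture; the function name is mine:

```python
def total_extension(A, R):
    """Topological sort: repeatedly remove a minimal element of the
    partial order R (a set of pairs) on the finite set A. The output
    order is a total extension of R."""
    A, order = set(A), []
    while A:
        # a is minimal if no distinct remaining b lies below it in R
        a = next(a for a in A if not any((b, a) in R and b != a for b in A))
        order.append(a)
        A.remove(a)
    return order

# Divisibility on {1,...,7}: the output must respect "divides".
A = set(range(1, 8))
R = {(a, b) for a in A for b in A if b % a == 0}
order = total_extension(A, R)
pos = {a: i for i, a in enumerate(order)}
assert all(pos[a] <= pos[b] for (a, b) in R)
```

Lemma 9.5.1 is exactly what guarantees the `next(...)` call always succeeds on a nonempty finite A.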


Conclusion of Relations (Chapter 9); Start of Graphs (Chapter 10)

Careful inductive proof of Lemma 9.5.1: Every nonempty finite partially ordered set has a minimal element.

Discussion of applications of graphs. Graphs: vertices, edges, degree, drawing graphs. Two different drawings of the Petersen graph, and a proof that they are isomorphic. A drawing of a third graph with 10 vertices and 15 edges, and the question of whether it is isomorphic to the Petersen graph. Discussion of proofs of isomorphism and non-isomorphism of two graphs. Discussion of planar and non-planar graphs, Wagner's theorem that a graph is non-planar iff it contains K_{3,3} or K_5 as a minor. Discussion of finding a Hamiltonian path or cycle in a graph, and its relation to the question of whether P is equal to NP.


Graphs (Chapter 10), continued

Another drawing of the third graph from the previous lecture showing that it is planar, and therefore not isomorphic to the Petersen graph, which is non-planar.

Simple undirected graphs. G = (V,E), where V is a finite set of vertices, and E is a finite set of edges, each of which is a set of two distinct vertices from V. (Example of a non-simple undirected graph.) Convention of n = |V| and m = |E|. Definitions of endpoints, adjacent, incident, neighbors, degree. Example of G with V = {1,2,3,4,5,6} and E = {{1,2},{1,3},{2,3},{2,4},{4,5}}. Representation by adjacency lists or by adjacency matrix. Vertex degrees in the example: d(1) = 2, d(2) = 3, d(3) = 2, d(4) = 2, d(5) = 1, and d(6) = 0.

Paths and connectivity (Sect. 9.7). Definitions: path, length of a path, simple path, reachable, u connected to v, proof that reachability is an equivalence relation in an undirected graph; the equivalence classes are the connected components of the graph. These may be found in linear time using depth-first or breadth-first search of the graph. A graph is connected iff it has 1 connected component, i.e., every vertex is reachable from every other vertex. Example of graph and its connected components. (Aside on directed graphs, where reachability is not necessarily symmetric; the definition of strongly connected and strongly connected components, which also can be computed in linear time using an algorithm of Tarjan's.)
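Finding the connected components by breadth-first search, sketched in Python on the example graph from the previous paragraph (the function name and adjacency-list layout are this sketch's choices):

```python
from collections import deque

def components(V, E):
    """Connected components of a simple undirected graph, by BFS from
    each not-yet-seen vertex; adjacency lists are built from E."""
    adj = {v: [] for v in V}
    for u, v in E:
        adj[u].append(v)
        adj[v].append(u)
    seen, comps = set(), []
    for s in V:
        if s in seen:
            continue
        comp, queue = set(), deque([s])
        seen.add(s)
        while queue:
            u = queue.popleft()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps

# The example graph above: vertex 6 is isolated, so there are 2 components.
V = {1, 2, 3, 4, 5, 6}
E = [(1, 2), (1, 3), (2, 3), (2, 4), (4, 5)]
comps = components(V, E)
assert sorted(comps, key=len) == [{6}, {1, 2, 3, 4, 5}]
```

Each vertex enters the queue at most once and each edge is examined twice, which is why this runs in linear time.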

Definitions of simple cycle, cycle, and closed walk. A tree is a simple undirected graph that is connected and acyclic (that is, contains no cycles.) A cycle in G that includes every edge exactly once is an Eulerian cycle -- Euler and the seven bridges of Konigsberg. A necessary and sufficient condition for the existence of an Eulerian cycle is that the graph be connected and every vertex have even degree. A Hamiltonian cycle in G is a cycle that includes every vertex exactly once. There is no analogously simple criterion for the existence of a Hamiltonian cycle in a graph -- in fact, the problem of deciding whether a given graph G contains at least one Hamiltonian cycle is NP-complete.

Idea of Lemma 10.9.1: If there is a path from s to t in G then there is a simple path from s to t in G. Please see the text for an inductive proof of this.


Graphs (Chapter 10), continued; Counting (Chapter 11) begun

Review of some useful results about graphs from Section 10.9 of the text. Lemma 10.9.1: If there is a path from s to t in G then there is a simple path from s to t in G. (Intuition and a complete proof by strong induction on the length of the path were given.) Lemma 10.9.3 (The Handshaking Lemma): For any graph G = (V,E), the sum of the degrees of the vertices v in V is equal to twice the number of edges in E. (A sketch of the proof by simple induction on the number of edges of G was given.) Theorem 10.9.4: a graph is a tree if and only if there is exactly one simple path between any two distinct vertices. (Useful Lemma 10.9.5 -- a bit technical -- read it in the text.) Corollaries of Lemma 10.9.5 follow. Corollary 10.9.6: If G = (V,E) is a graph with |E| less than |V|-1, then G is not connected. Corollary 10.9.7: If G = (V,E) is a graph with |E| greater than |V|-1, then G contains a cycle. Very useful result on trees: Theorem 10.9.8: For any graph G = (V,E), any two of the following statements imply the third: (1) G is connected, (2) G is acyclic, and (3) |E| = |V|-1. Because a tree is a graph that is connected and acyclic, it must have |E| = |V|-1. But note also that a connected graph with |E| = |V|-1 must be a tree, and an acyclic graph with |E| = |V|-1 must be a tree.
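The Handshaking Lemma can be checked mechanically on any edge list; an illustrative sketch (the function name is mine):

```python
def check_handshaking(V, E):
    """Lemma 10.9.3: the sum of the vertex degrees equals 2|E|.
    Each edge contributes 1 to the degree of each of its two
    endpoints, hence 2 to the total."""
    degree = {v: 0 for v in V}
    for e in E:
        for v in e:
            degree[v] += 1
    return sum(degree.values()) == 2 * len(E)

# The example graph from the previous lecture.
ok = check_handshaking([1, 2, 3, 4, 5, 6],
                       [{1, 2}, {1, 3}, {2, 3}, {2, 4}, {4, 5}])
```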

Definition of a spanning tree of a graph G and a method for removing edges from a connected graph G to get a spanning tree of G. (See also Theorem 10.9.9 and its proof.) Theorem 10.9.10 gives the necessary and sufficient conditions for a graph to contain an Eulerian cycle: G must be connected and every vertex of G must have even degree.

Counting. To count a finite set S, that is, to prove that |S| = n, we must show that there is a bijection between S and the finite ordinal [n] = {0,1,...,n-1}. This is fairly cumbersome, and much of Section 11.1 is taken up with principles for proving the cardinality of a set based on how it is constructed. For example: Theorem 11.1.1: If A and B are disjoint finite sets, then the cardinality of the union of A and B is the sum of the cardinalities of A and B. (The first part of a careful proof of this by using bijections between A and [m] and between B and [n] to construct a bijection between (A union B) and [m+n] was given. See the text for this and further examples of proofs of the basic counting principles.)


Counting (Chapter 11), continued

From the Sum Rule (Theorem 11.1.1) to the general formula for the cardinality of the union of two sets. Generalization to the union of three or more sets: the Inclusion/Exclusion Principle. The Pigeonhole Principle, and an application of it showing that any set of n+1 numbers from {1,2,...,2n} must contain two different numbers x and y such that x divides y. (Detail: the "landing function" is f(x) = m, where m is the largest odd integer that divides x; there are only n odd numbers in {1,2,...,2n}, so two of the n+1 chosen numbers must land on the same odd value, and the smaller of those two then divides the larger.) Formulas for P(n,k), the number of ordered k-sequences of distinct elements drawn from a set of n elements, and C(n,k), the number of subsets (unordered) of k elements drawn from a set of n elements. The Binomial Theorem (finite case) and its relationship to the binomial coefficients C(n,k). Interpreting the special case x=y=1: the number of all subsets of a set of n things (2^n) equals the sum from k=0 to n of the number of subsets of k things drawn from a set of n things (C(n,k)).
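The pigeonhole application can be verified exhaustively for a small n; an illustrative sketch (the function names are mine, not from lecture):

```python
from itertools import combinations

def largest_odd_divisor(x):
    """The 'landing function': strip out all factors of 2."""
    while x % 2 == 0:
        x //= 2
    return x

def has_dividing_pair(s):
    """Does s contain distinct x, y with x dividing y?"""
    return any(x != y and y % x == 0 for x in s for y in s)

# Exhaustively verify the claim for n = 5: every 6-element
# subset of {1,...,10} contains one number dividing another.
n = 5
ok = all(has_dividing_pair(c)
         for c in combinations(range(1, 2 * n + 1), n + 1))
```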


Counting (Chapter 11), concluded; Probability (Chapter 12) begun

Review: binomial coefficient, binomial theorem. Pascal's Triangle, statement and combinatorial proof of Pascal's Identity.
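Pascal's Identity can be checked numerically; a brief sketch using Python's math.comb (names and ranges are illustrative):

```python
from math import comb

# Pascal's Identity: C(n, k) = C(n-1, k-1) + C(n-1, k).
# The combinatorial proof splits the k-subsets of an n-set
# by whether or not they contain one fixed element.
def pascal_row(n):
    """Row n of Pascal's Triangle."""
    return [comb(n, k) for k in range(n + 1)]

# Check the identity for all interior entries of rows 1..10.
identity_holds = all(
    comb(n, k) == comb(n - 1, k - 1) + comb(n - 1, k)
    for n in range(1, 11) for k in range(1, n)
)
```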

Probability: discussion of some applications in computer science (randomized algorithms, simulations, machine learning and data science.)

Concept of a (discrete, finite) probability space of outcomes, with a probability function Pr mapping outcomes to probabilities, real numbers between 0 and 1 (inclusive). Definition of an event as a subset of the set of outcomes, independence of two events, and extension of Pr to map events to probabilities by summing the probabilities of the outcomes in the event.

Example of a probability space consisting of two rolls of a 6-sided die (singular of "dice"), each outcome having a uniform probability of 1/36, and events in this space: the event A that the absolute value of the difference of the two rolls is 3 (which is the event A = {(1,4),(4,1),(2,5),(5,2),(3,6),(6,3)}) and the event B that the sum of the two rolls is a prime number (which is the event B = {(1,1),(1,2),(1,4),(1,6),(2,1),(2,3),(2,5), (3,2),(3,4),(4,1),(4,3),(5,2),(5,6),(6,1),(6,5)}.) Then Pr(A) = 1/6 and Pr(B) = 5/12. A and B are not independent, since if C is their intersection, then C = {(1,4),(4,1),(2,5),(5,2)} and Pr(C) = 1/9, which is not the product of Pr(A) and Pr(B).
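The dice example can be verified by brute force over all 36 outcomes; an illustrative sketch using exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # 36 ordered rolls
pr = Fraction(1, 36)                             # uniform probability

def prob(event):
    """Pr of an event = sum of the probabilities of its outcomes."""
    return len(event) * pr

A = {(a, b) for a, b in outcomes if abs(a - b) == 3}
B = {(a, b) for a, b in outcomes if (a + b) in {2, 3, 5, 7, 11}}
C = A & B   # the intersection of A and B
```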

Definition of conditional probability of A given B, denoted Pr(A|B). Question: given a hypothetical binary condition D, and a hypothetical binary test T for the condition, and a 2x2 table giving the values of Pr(T=1|D=1), Pr(T=1|D=0), Pr(T=0|D=1), and Pr(T=0|D=0), can we calculate the probability that the condition is present (D=1), given that the test is positive (T=1), that is, can we calculate Pr(D=1|T=1)?


Probability (Chapter 12)

Review of probability space, event, independence of events, conditioning (the probability of A given B, denoted Pr(A|B)) and its interpretation as forming a new probability space restricted to the outcomes in B. Proof that if A and B are independent events, then Pr(A|B) = Pr(A).

Revisiting the question from last time: no, in order to compute Pr(D=1|T=1), we also need to know the base rate, that is, Pr(D=1). Terminology: false positive rate (Pr(T=1|D=0)), false negative rate (Pr(T=0|D=1)), true positive rate (Pr(T=1|D=1)) (or "sensitivity" or "recall"), true negative rate (Pr(T=0|D=0)) (or "specificity"). Derivation of Bayes' Rule in this case, that is, Pr(D=1|T=1) = x/(x+y), where x = Pr(T=1|D=1)*Pr(D=1), and y = Pr(T=1|D=0)*Pr(D=0). Note that Pr(T=1) = x+y. Thus, knowing the base rate (Pr(D=1), which also gives Pr(D=0) = 1-Pr(D=1)) allows us to go from information about the sensitivity and specificity of the test (in the 2x2 table) to the probability of the condition (D) given the outcome of the test (T).
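The derivation turns into a short calculation; the numeric rates below are hypothetical, chosen only to illustrate how a low base rate can dominate:

```python
from fractions import Fraction

def posterior(sensitivity, false_positive_rate, base_rate):
    """Bayes' Rule as derived above:
    Pr(D=1|T=1) = x/(x+y), where x = Pr(T=1|D=1)*Pr(D=1)
    and y = Pr(T=1|D=0)*Pr(D=0)."""
    x = sensitivity * base_rate
    y = false_positive_rate * (1 - base_rate)
    return x / (x + y)

# Hypothetical numbers (not from lecture): a test with 99%
# sensitivity and a 5% false positive rate, for a condition
# with a 1% base rate.
p = posterior(Fraction(99, 100), Fraction(5, 100), Fraction(1, 100))
```

With these made-up rates the posterior works out to 1/6: even a positive result from a quite accurate test leaves the condition more likely absent than present.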

Definition and examples of a random variable, definition, intuition and examples of the expectation of a random variable, and statement of the linearity property of expectation, namely, that E(X+Y) = E(X)+E(Y) for *any* random variables X and Y, and E(c*X) = c*E(X) for *any* random variable X and constant c. Warning that this does not in general hold for product in place of sum.
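Linearity of expectation can be checked exactly on the two-dice space from the earlier example; an illustrative sketch:

```python
from fractions import Fraction
from itertools import product

# Probability space: two rolls of a fair die, each outcome 1/36.
outcomes = list(product(range(1, 7), repeat=2))
pr = Fraction(1, 36)

def expectation(X):
    """E(X) = sum over outcomes o of X(o) * Pr(o)."""
    return sum(X(o) * pr for o in outcomes)

first = lambda o: o[0]          # X = value of the first roll
second = lambda o: o[1]         # Y = value of the second roll
total = lambda o: o[0] + o[1]   # X + Y
```

Here E(X) = E(Y) = 7/2, so linearity gives E(X+Y) = 7 without enumerating the sum's distribution.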


Linear Algebra (Chapter 13) begun

(Be sure to read about the binomial distribution and the geometric distribution in Chapter 12.)

Discussion of linear algebra courses at Yale (MATH 222, 225) and applications of linear algebra in computer science and applied mathematics: graphics (CPSC 478,479), computer vision (CPSC 475,476), scientific computing (CPSC 440), data mining (CPSC 445), machine learning (STAT 365), natural language processing (CPSC 477 -- new in Spring 2017!), among others.

We consider n-dimensional real-valued vectors, with examples drawn from n=2. Addition of vectors, scalar multiples of vectors, linear combinations of vectors, span of a set of vectors.

Solving a set of equations to determine whether the vector (3,2) is in the span of the vectors {(3,-1), (1,2)}. The solution was generalized to show that the span of these two vectors is all of R^2. Linear dependence and independence of a set of vectors; observation that the solution showed that the set {(3,-1), (1,2)} is in fact linearly independent. Definition of a basis of a vector space, and theorem (without proof) that all bases of a vector space have the same number of elements, which is the dimension of the space. Dimensions of the linear subspaces of R^2 and R^3.
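The span question reduces to a 2x2 linear system; a sketch solving it exactly by Cramer's rule (one convenient method, not necessarily the one used in lecture):

```python
from fractions import Fraction

def solve_2x2(a, b, c, d, e, f):
    """Solve [a b; c d](x, y)^T = (e, f)^T by Cramer's rule;
    returns None when the determinant is zero (dependent rows)."""
    det = a * d - b * c
    if det == 0:
        return None
    x = Fraction(e * d - b * f, det)
    y = Fraction(a * f - e * c, det)
    return x, y

# Is (3, 2) in the span of (3, -1) and (1, 2)?  Solve
# s*(3, -1) + t*(1, 2) = (3, 2), i.e. 3s + t = 3, -s + 2t = 2.
coeffs = solve_2x2(3, 1, -1, 2, 3, 2)
```

The nonzero determinant (7) shows the two vectors are linearly independent, and since the system is solvable for every right-hand side, their span is all of R^2.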

Other important quantities: the length of a vector (the Euclidean distance from the origin to its endpoint) and the angle between two vectors. Definition and notation for inner product (or dot product) of two vectors. (Next time: relation between the inner product of two vectors and the angle between them.)


Linear Algebra (Chapter 13) concluded

Inner (or dot) product of two vectors, and its relation to the length of a vector and the angle between two vectors. Review of the cosine function and the relevant trigonometric identity. Two vectors are orthogonal if their inner product is 0; example in 2 dimensions.
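The relation between inner product, length, and angle can be sketched as follows (function names are mine, not from lecture):

```python
import math

def dot(u, v):
    """Inner (dot) product: the sum of coordinatewise products."""
    return sum(a * b for a, b in zip(u, v))

def length(u):
    """Euclidean length: sqrt of the inner product of u with itself."""
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle between u and v, from cos(theta) = <u,v>/(|u||v|)."""
    c = dot(u, v) / (length(u) * length(v))
    c = max(-1.0, min(1.0, c))  # guard against rounding drift
    return math.acos(c)

# (1, 0) and (0, 1) are orthogonal: inner product 0, angle pi/2.
```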

Solving a set of linear equations via Gaussian elimination (using row operations) and back-substitution; an example. Example of how this may break down if the rows are not linearly independent. The operation count for Gaussian elimination is asymptotic to n^3/3. Row operations are reversible and preserve the span of the row vectors, so a basis for the rows of the reduced matrix is also a basis for the span of the original rows. Matrix-vector product and matrix-matrix product. Defining a linear transformation by f(x) = Ax, for a matrix A and vector x.
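A minimal sketch of Gaussian elimination with back-substitution, using exact rational arithmetic to sidestep rounding (illustrative, not the lecture's presentation):

```python
from fractions import Fraction

def gaussian_solve(A, b):
    """Solve Ax = b by Gaussian elimination and back-substitution;
    returns None if elimination breaks down (dependent rows)."""
    n = len(A)
    # Augmented matrix with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(bi)]
         for row, bi in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero pivot in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # rows are not linearly independent
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate below the pivot (row operations).
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution, from the last equation upward.
    x = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x
```

The elimination phase dominates the cost, with roughly n^3/3 multiplications, matching the operation count above.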

9 December 2016