Lecture notes for Math 55a: Honors Abstract Algebra (Fall 2010)

If you find a mistake, omission, etc., please let me know by e-mail.

The orange balls mark our current location in the course, and the current problem set.

Ceci n'est pas un Math 55a syllabus

Tony Feng's section and office hours:
Section: Wednesdays (starting September 8), 4-5 PM, Science Center 103b
Office hours: Thursdays (starting September 9), 7-8 PM, Science Center 310
revised: now Thursdays 8-9 and 9-10, both in Science Center 411

at least in the beginning of the linear algebra unit, we'll be following the Axler textbook closely enough that supplementary lecture notes should not be needed. Some important extensions/modifications to the treatment in Axler:
• [cf. Axler, p.3] Unless noted otherwise, F may be an arbitrary field, not only R or C. The most important fields other than those of real and complex numbers are the field Q of rational numbers, and the finite fields Z/pZ (p prime). Other examples are the field Q(i) of complex numbers with rational real and imaginary parts; more generally, Q(d1/2) for any nonsquare rational number d; the “p-adic numbers” Qp (p prime), of which we'll say more when we study topology next term; and more exotic finite fields such as the 9-element field (Z/3Z)(i). Here's a review of the axioms for fields, vector spaces, and related mathematical structures [corrected 7.ix.10: I somehow forgot to include the associative property among the vector space axioms )-:]
• [cf. Axler, p.22] We define the span of an arbitrary subset S of (or tuple in) a vector space V as follows: it is the set of all (finite) linear combinations a1v1 + … + anvn with each vi in S and each ai in F. This is still the smallest vector subspace of V containing S. In particular, if S is empty, its span is by definition {0}. We do not require that S be finite.
• Unlike Axler, we discuss “quotient vector spaces” (and even quotient modules) explicitly and introduce them early. If N is a submodule of the module M over some ring A, the quotient module M/N consists of all equivalence classes in M, where we define the equivalence relation “∼” to mean that mm' if and only if m−m' is contained in N. One must then check that the module operations inherited from M do give M/N the structure of a well-defined A-module. [“Well-defined” means that the operations do not depend on the choice of representative from an equivalence class.] A familiar example is the cyclic finite group Z/nZ (recall that abelian group=Z-module). In the case of vector spaces, the dimension of a quotient vector space V/U is called the codimension in V of its subspace U; if V is finite-dimensional, this codimension equals dim(V) − dim(U).
• Warning: in general the space F[X] (a.k.a. P(F)) of polynomials in X, and its subspaces Pn(F) of polynomials of degree at most n, might not be naturally identified with a subspace of the space FF of functions from F to itself. The problem is that two different polynomials may yield the same function. For example if F is the field of 2 elements the polynomial X2-X gives rise to the zero function. In general different polynomials can represent the same function if and only if F is finite — do you see why?
• In Axler's Theorem 2.10, The hypothesis that the spanning set be finite (implicit in Axler's use of “lists”) is not necessary as long as the vector space V is finite dimensional. Indeed let S be an arbitrary spanning set. Since V is finite dimensional, it has a finite spanning set S'. Write each element of S' as a linear combination of elements of S. These linear combinations involve only finitely many elements of S, so S has a finite subset S0 that spans S' and thus spans V. Now apply Axler's argument to find a subset of S0, and thus a fortiori of S, that is a basis of V.
• Here's an extreme example of how basic theorems about finite-dimensional vector spaces can become utterly false for finitely-generated modules: a module generated by just one element can have a submodule that is not finitely generated. Indeed, for any field F let A be the ring of polynomials in infinitely many variables Xj. As usual we can regard A as a module over itself, with a single generator 1. A submodule is then just an ideal of the ring. Choose the ideal I generated by all the Xj, which consists of all polynomials with constant coefficient equal 0. Then if there are infinitely many indices j then I is infinitely generated; indeed any generating set must be at least as large as the index set of j’s, so for every cardinal ℵ we can make a ring A with a singly-generated module (namely A itself) and with a submodule that cannot be generated by fewer than ℵ elements.
For a subtler example, consider the ring we might call “F[X1/2]”, consisting of F-linear combinations of monomials Xn/2k for arbitrary nonnegative integers n and k. Again let I be the ideal generated by the nonconstant monomials, which is not finitely generated, though there are generating sets that are “only” countably infinite. The new behavior involves the countable generating set {X1/2k | k≥0}: there is no minimal generating subset, because each X1/2k is a multiple of X1/2k' for any k'>k. Likewise for the ring generated by all monomials Xr with r a nonnegative rational number (or even all Xr with r a nonnegative real number).
• Suppose T : VW is a linear transformation. Axler's notation for the image of T was already becoming rather old-fashioned when he wrote his book; these days simply T(V) is common (and likewise for any function at all). The terminology “null space” (whether one or two words) for T -1(0) is also somewhat quaint; we usually say “kernel” and write “ker(T)” [and (LA)TeX already provides the command \ker to typeset this properly].
• Axler also unaccountably soft-pedals the important notion of duality. [corrected 18.ix.10: If W is the direct sum of U and V then the copy of U* in W* is the “annihilator” in W* of V, not of U.]
• About “Lemma 3.?”: some notes and warnings about the behavior of Hom(V,W) under (finite or infinite) direct sums.
• Here's a brief preview of abstract nonsense (a.k.a. diagram-chasing), and a diagram-chasing interpretation of quotients and duality.
• Axler proves the Fundamental Theorem of Algebra using complex analysis, which cannot be assumed in Math 55a (we'll get to it at the end of 55b). Here's a proof using the topological tools we'll develop at the start of 55b. (Axler gives the complex-analytic proof on page 67.) [corrected 24.ix.10 to fix a minor typo: when f-tilde is introduced, it is f-tilde that vanishes at zero, not f]
• We shall need some “eigenstuff” also in an infinite-dimensional setting, so will not assume that any vector space is (nonzero) finite dimensional unless we really must.
• If T is a linear operator on a vector space V, and U is an invariant subspace, then the quotient space V/U inherits an action of T. Moreover, the annihilator of U in V* is an invariant subspace for the action of the adjoint operator T* on V*. (Make sure you understand why both these claims hold.)
• Triangular matrices are intimately related with “flags”. A (complete) flag in a finite dimensional vector space V is a sequence of subspaces {0}=V0, V1, V2, …, Vn=V, with each Vi of dimension i and containing Vi-1. A basis v1, v2, …, vn determines a flag: Vi is the span of the first i basis vectors. Another basis w1, w2, …, wn determines the same flag if and only if each wi is a linear combination of v1, v2, …, vi (necessarily with nonzero vi coefficient). The standard flag in Fn is the flag obtained in this way from the standard basis of unit vectors e1, e2, …, en. The punchline is that, just as a diagonal matrix is one that respects the standard basis (equivalently, the associated decomposition of V as a direct sum of 1-dimensional subspaces), an upper-triangular matrix is one that respects the standard flag. Note that the i-th diagonal entry of a triangular matrix gives the action on the one-dimensional quotient space Vi/Vi−1 (each i=1,…,n).
Less surprising than the absence of quotients and duality in Axler is the lack of tensor algebra. That won't stop us in Math 55, though. Here's an introduction [As you might guess from \oplus, the TeXism for the tensor-product symbol is \otimes.]
corrected 30.ix.10 and 1.x.10 to fix a couple of minor typos and improve the presentation of VF F'
• One of many applications is the trace of an operator on a finite dimensional F-vector space V. This is a linear map from Hom(V,V) to F. We can define it simply as the composition of two maps: our identification of Hom(V,V) with the tensor product of V* and V, and the natural map from this tensor product to F coming from the bilinear map taking (v*,v) to v*(v). We'll see that this is the same as the classical definition: the trace of T is the sum of the diagonal entries of the matrix of T with respect to any basis. The coordinate-independent construction via tensor algebra explains why the trace does not change under change of basis. (The invariance can also be proved by checking explicitly that AB and BA have the same trace for any square matrices A, B of the same size.)
• Here are some basic definitions and facts about general norms on real and complex vector spaces.
• Just as we can study bilinear symmetric forms on a vector space over any field, not just R, we can study sesquilinear conjugate-symmetric forms on a vector space over any field with a conjugation, not just C. Here a “conjugation” on a field F is a field automorphism σ:FF such that σ is not the identity but σ2 is (that is, σ is an involution). Given a basis {vi} for F, a sesquilinear form ⟨.,.⟩ on F is determined by the field elements ai,j = ⟨vi,vj⟩, and is conjugate-symmetric if and only if aj,i = σ(ai,j) for all i,j. Note that the “diagonal entries” ai,i — and more generally ⟨v,v⟩ for any v in V — must be elements of the subfield of F fixed by σ.
• “Sylvester's Law of Inertia” states that for a nondegenerate pairing on a finite-dimensional vector space V/F, where either F=R and the pairing is bilinear and symmetric, or F=C and the pairing is sesquilinear and conjugate-symmetric, the counts of positive and negative inner products for an orthogonal basis constitute an invariant of the pairing and do not depend on the choice of orthogonal basis. (This invariant is known as the “signature” of the pairing.) The key trick in proving this result is as follows. Suppose V is the orthogonal direct sum of subspaces U1, U2 for which the pairing is positive definite on U1 and negative definite on U2. Then any subspace W of V on which the pairing is positive definite has dimension no greater than dim(U1). Proof: On the intersection of W with U2, the pairing is both positive and negative definite; hence that subspace is {0}. The claim follows by a dimension count, and we quickly deduce Sylvester's Law.
• Over any field not of characteristic 2, we know that for any non-degenerate symmetric pairing on a finite-dimensional vector space there is an orthogonal basis, or equivalently a choice of basis such that the pairing is (x,y)=sumi(ai xi yi) for some nonzero scalars ai. But in general it can be quite hard to decide whether two different collections of ai yield isomorphic pairings. Even over Q the answer is already tricky in dimensions 2 and 3, and I don't think it's known in a vector space of arbitrary dimension. Over a finite field of odd size there are always exactly two possibilities, as we'll see in a few weeks.
• A regular graph of degree d is a Moore graph of girth 5 if any two different vertices are linked by a unique path of length at most 2. Such a graph necessarily has n = 1 + d + d(d−1) = d2 + 1 vertices. Let A be the adjacency matrix, and 1 the all-ones vector. Then (because each vertex has degree d) 1 is a d-eigenvector of A. We have (1 + A + A2) v = d v + ⟨v, 11 for all v (proof: check on unit vectors and use linearity). Thus A takes the orthogonal complement of R·1 to itself and satisfies (1 + A + A2) = d on that space. Since this quadratic equation has distinct roots m and −1−m for some m > 0 (namely the positive root of (1 + m + m2) = d), it follows that the orthogonal complement of R·1 is the direct sum of the corresponding eigenspaces. Let d1 and d2 be their dimensions. These sum to n−1 = d2, and satisfy md1 + (−1−m)d2 + d = 0 because the matrix A has trace zero. This lets us solve for d1 and d2. in particular we find that their difference is (2dd2) / (2m+1). Since that's an integer, either d=2 (giving the pentagon graph) or m is an integer. Substituting m2 + m + 1 for d, we find that 16(d1d2) is an integer plus 15/(2m+1), whence m is one of 0, 1, 2, or 7. The first of these is impossible, and the others give d = 3, 7, or 57 as claimed.

It is clear that the pentagon is the unique Moore graph of girth 5, and degree 2, and it is not hard to check that the Petersen graph is the unique example with d = 3. For d = 7 there is again a unique graph up to an interesting group of automorphisms; see this proof outline, which also lets one count the automorphisms of this graph. There is no spectral graph theory there, but this approach is suggested by the inequality you'll prove in the fifth problem of problem set 7: if the 50 vertices are divided into two equal subsets, what's the smallest possible number of edges between them?

For more of this kind of application of linear algebra to combinatorics, see for example Cameron and van Lint's text Designs, Graphs, Codes and their Links (London Math. Society, 1991, reprinted 1996).

• All of Chapter 8 works over an arbitrary algebraically closed field, not only over C (except for the minor point about extracting square roots, which breaks down in characteristic 2); and the first section (“Generalized Eigenvalues”) works over any field.
• We don't stop at Corollary 8.8: let T be any operator on a vector space V over a field F, not assumed algebraically closed. If V is finite-dimensional, then The Following Are Equivalent:
(1) There exists a nonnegative integer k such that Tk=0;
(2) For any vector v, there exists a nonnegative integer k such that Tkv=0;
(3) Tn=0, where n=dim(V).
Note that (1) and (2) make no mention of the dimension, but are still not equivalent for operators on infinite-dimensional spaces. We readily deduce the further equivalent conditions:
(4) There exists a basis for V for which T has an upper-triangular matrix with every diagonal entry equal zero;
(5) Every upper-triangular matrix for T has zeros on the diagonal, and there exists at least one upper-triangular matrix for T.
Recall that the second part of (5) is automatic if F is algebraically closed.
• The space of generalized 0-eigenvectors (the maximal subspace on which T is nilpotent) is sometimes called the nilspace of T. It is an invariant subspace. When V is finite dimensional, V is the direct sum of the nilspace and another invariant subspace V', consisting of the intersection of the subspaces Tk(V) as k ranges over all positive integers. See Exercise 8.11; this can be used to quickly prove Theorem 8.23 and consequences such as Cayley-Hamilton (Theorem 8.20).
• The dimension of the space of generalized c-eigenvalues (i.e., of the nilspace of T−cI) is usually called the algebraic multiplicity of c (since it's the multiplicity of c as a root of the characteristic polynomial of T), to distinguish it from the “geometric multiplicity” which is the dimension of ker(T−cI).
We'll not cover Chapter 9, which seems to exist mainly to prove that it is possible to construct the characteristic polynomial over R without introducing determinants. We'll proceed to Chapter 10, which covers determinants, but…

We'll define the determinant of an operator T on a finite dimensional space V as follows: T induces a linear operator T' on the top exterior power of V; this exterior power is one-dimensional, so an operator on it is multiplication by some scalar; det(T) is by definition the scalar corresponding to T'. The “top exterior power” is a subspace of the “exterior algebra” of V, which is the quotient of the tensor algebra by the ideal generated by {v*v: v in V}. We'll still have to construct the sign homomorphism from the symmetry group of order dim(V) to {1,−1} to make sure that this exterior algebra is as large as we expect it to be, and that in particular that the (dim(V))-th exterior power has dimension 1 rather than zero.

Interlude: normal subgroups; short exact sequences in the context of groups: A subgroup H of G is normal (satisfies H = g−1Hg for all g in G) iff H is the kernel of some group homomorphism from G iff the injection HG fits into a short exact sequence {1} → HGQ → {1}, in which case Q is the quotient group G/H. [The notation {1} for the one-element (“trivial”) group is usually abbreviated to plain 1, as in 1 → HGQ → 1.] This is not in Axler but can be found in any introductory text in abstract algebra; see for instance Artin, Chapter 2, section 10. Examples: 1 → AnSn → {±1} → 1; also, the determinant homomorphism GLn(F) → F* gives the short exact sesquence 1 → SLn(F) → GLn(F) → F* → 1, and this works even if F is just a commutative ring with unit as long as F* is understood as the group of invertible elements of F — for example, Z* = {±1}.

Some more tidbits about exterior algebra:

• If w, w' are elements of the m-th and m'-th exterior powers of V, then ww'=(−1)mm'w'w; that is, w and w' commute unless m and m' are both odd in which case they anticommute.
• If m+m'=n=dim(V) then the natural pairing from the m-th and m'-th exterior powers to the n-th is nondegenerate, and so identifies the m'-th exterior power canonically with the dual of the m-th tensored with the top (n-th) exterior power.
• In particular, if m=1, and T is any invertible operator on V, then we find that the induced action of T on the (n−1)st exterior power is the same as its action on V* multiplied by det(T). This yields the formula connecting the inverse and cofactor matrix of an invertible matrix (a formula which you may also know in the guise of “Cramer's rule”).
• For each m there is a natural non-degenerate pairing between the m-th exterior powers of V and V*, which identifies these exterior powers with each other's dual.
More will be said about exterior algebra when differential forms appear in Math 55b.

We'll also show that a symmetric (or Hermitian) matrix is positive definite iff all its eigenvalues are positive iff it has positive principal minors (the “principal minors” are the determinants of the square submatrices of all orders containing the (1,1) entry). More generally we'll show that the eigenvalue signs determine the signature, as does the sequence of signs of principal minors if they are all nonzero. More precisely: an invertible symmetric/Hermitian matrix has signature (r,s) where r is the number of positive eigenvalues and s is the number of negative eigenvalues; if its principal minors are all nonzero then r is the number of j in {1,2,…,n} such that the j-th and (j−1)-st minors have the same sign, and s is the number of j in that range such that the j-th and (j−1)-st minors have opposite sign [for j=1 we always count the “zeroth minor” as being the positive number 1]. This follows inductively from the fact that the determinant has sign (−1)s and the signature (r',s') of the restriction of a pairing to a subspace has r' ≤ r and s' ≤ s.

For positive definiteness, we have the two further equivalent conditions: the symmetric (or Hermitian) matrix A=(aij) is positive definite iff there is a basis (vi) of Fn such that aij = ⟨vi,vj for all i,j, and iff there is an invertible matrix B such that A=B*B. For example, the matrix with entries 1/(i+j−1) (“Hilbert matrix”) is positive-definite, because it is the matrix of inner products (integrals on [0,1]) of the basis 1,x,x2,…,xn−1 for the polynomials of degree <n. See the 10th problem set for a calculus-free proof of the positivity of the Hilbert matrix, and an evaluation of its determinant.
For any matrix B (not necessarily invertible or even square) with columns vi, the matrix B*B with entries aij = ⟨vi,vj is known as the Gram matrix of the columns of B. It is invertible iff those columns are linearly independent. If we add an (n+1)-st vector, the determinant of the Gram matrix increases by a factor equal to the squared distance between this vector and the span of the columns of B. You'll use this too in PS10.

Here's a brief introduction to field algebra and Galois theory.

Our source for representation theory of finite groups (on finite-dimensional vector spaces over C) will be Artin's Algebra, Chapter 9. We'll omit sections 3 and 10 (which require not just topology and calculus but also, at least for §3, some material beyond 55b to do properly, namely the construction of Haar measures); also we won't spend much time on §7, which works out in detail the representation theory of a specific group that Artin calls I (the icosahedral group, a.k.a. A5). There are many other sources for this material, some of which take a somewhat different point of view via the “group algebra” C[G] of a finite group G (a.k.a.\ the algebra of functions on G under convolution). See for instance Chapter 1 of Representation Theory by Fulton and Harris (mentioned in class). A canonical treatment of representations of finite groups is Serre's Linear Representations of Finite Groups, which is the only entry for this chapter in the list of “Suggestions for Further Reading” at the end of Artin's book (see p.604).

While we'll work almost exclusively over C, most of the results work equally well (though with somewhat different proofs) over any field F that contains the roots of unity of order #(G), as long as the characteristic of F is not a factor of #(G). Without roots of unity, many more results are different, but there is still a reasonably satisfactory theory available. Dropping the characteristic condition leads to much trickier territory, e.g. even Maschke's theorem (every finite-dimensional representation is a direct sum of irreducibles) fails; some natural problems are still unsolved!

Here's an alternative viewpoint on representations of a finite group G (not in Artin, though you can find it elsewhere, e.g. Fulton-Harris pages 36ff.): a representation of G over a field F is equivalent to a module for the group ring F[G]. The group ring is an associative F-algebra (commutative iff G is commutative) that consists of the formal F-linear combinations of group elements. This means that F[G] is FG as an F-vector space, and the algebra structure is defined by setting eg1eg2 = eg1g2 for all g1, g2 in G, together with the F-bilinearity of the product. This means that if we identify elements of the group ring with functions GF then the multiplication rule is (f1 * f2)(g) = Σg1g2=g f1(g1) f2(g2) — yes, it's convolution again. To identify an F[G]-module with a representation, use the action of F to define the vector space structure, and let ρ(g) act by multipliction by the unit vector eg. In particular, the regular representation is F[G] regarded in the usual way as a module over itself. If we identify the image of this representation with certain permutation matrices of order #(G), we get an explicit model of F[G] as a subalgebra of the algebra of square matrices the same order. For example, if G = Z/nZ we recover the algebra of circulant matrices of order n.

Just as the regular representation is F[G] as a module over itself, a subrepresentation of the regular representation is an ideal of F[G]. The fundamental theorem of representation theory can be stated as an isomorphism between F[G] and the direct sum of algebras End(V) with V ranging over all the irreducible representations of G. Comparing dimensions gives the formula #(G) = ΣV dim(V)2. The F-linear isomorphism from F[G] and the direct sum of the End(V) is a generalization of the discrete Fourier transform (DFT) to noncommutative groups [check that if G is commutative we recover the DFT of Problem Set 10].

In case we haven't officially said it yet: the unitary group Un that figures prominently in Artin's treatment (see the start of 9.2, p.310) is the subgroup of GLn(C) that acts by isometries on Cn with its usual inner-product norm; that is, matrices whose inverse equals their own Hermitian transpose. The unitary group U(V) = U(V, ⟨.,.⟩) of a complex inner-product space V = (V, ⟨.,.⟩) is defined in much the same way; if V is finite-dimensional then any choice of orthonormal basis identifies U(V) with Un.

Artin derives the diagonalizability of ρ(g) as Corollary 2.3 (p.310) of unitarity plus the Spectral Theorem for normal operators. This is not necessary because we already know that a linear transformation that satisfies a polynomial equation with distinct roots is diagonalizable, and ρ(g) satisfies such an equation, namely Xn−1=0 where n is the order of g. Note that this argument applies more generally to a representation over any field in which Xn−1=0 has n distinct roots. Again this leads naturally to the condition that the characteristic not be a factor of #(G), i.e. that #(G) is nonzero in F: this condition guarantees that also n≠0 in F, because n is always a factor of #(G) [a corollary of Lagrange's theorem that the order of any subgroup of G is a factor of #(G), see e.g. Artin, (6.11) and (6.12) in Chapter 2, p.58].

Let T be the character table of a group of N elements, and let T* be its Hermitian transpose. The first part two parts of Theorem 5.9 in Chapter 9 state in effect the identity TDT* = NI where D is the diagonal matrix whose diagonal entries are the sizes of the conjugacy classes of G. But then NI is also DT*T. This means that the columns of T are also orthogonal, and the column indexed by conjugacy class C has inner product N/#(C) with itself. [This is an adaptation of the familiar but still remarkable fact that the rows of a square matrix are orthonormal iff the columns are orthonormal.] That is, for any group elements g, h we have Σχ χ(g)χ(h) = 0 (the sum extending over the irrducible characters χ), unless g and h are conjugate, in which case χ(g) = χ(h) for all χ and we get Σχ|χ(g)|2 = N/#(C) where C is the conjugacy class of g. In particular, taking g = id we get C = {id} and thus recover the fact that N is the sum of the squares of the dimensions of the irreducible representations.

Let G be a group of N elements, and (V,ρ) be any finite-dimensional representation over a field in which N is nonzero. Let P be the operator N−1 ΣgG ρ(g). This is a G-endomorphism of V, i.e. it commutes with each ρ(g), whence it follows that P is an idempotent: it satisfies P2 = P. We already know (PS9 1(i)) that this makes V the direct sum of the eigenspaces V0 and V1 of P, which are respectively the kernel and image of P. It follows that P has trace dim(V1). This is what lets us begin to link characters with irreducibles: on the one hand this trace is N−1 ΣgG χ(g) = ⟨χ, 1&rang where χ is the character of V. On the other hand V1 is the dimension of the G-invariant subspace VG of V (which we can also canonically identify with HomG(F,V) where F is regarded as the trivial representation).

To get from this to the dimension of HomG(V,V’) for any representations V,V, recall the identification of Hom(V,V’) (ignoring the G-action) with the tensor product V*V. Artin doesn't describe the action of G on V*. We know how to get from each ρ(g) a linear operator (ρ(g))* on V*, but these do not compose in the right order (unless G is abelian): we have (ρ(g))*(ρ(h))* = (ρ(hg))*. So to get a representation ρ* of G on V* we must define ρ*(g) = (ρ(g−1))* for all g. This yields the dual representation (sometimes called the “contragredient representation”) (V**) of (V,ρ); its character χ* is the complex conjugate of χ. This gives Hom(V,V’) = V*V the structure of a representation of G, and we can check that the invariant subspace is precisely HomG(V,V’) [which also justifies Artin's term “G-invariant homomorphism” for an element of HomG(V,V’)]. We then deduce that the dimension of this space is ⟨χ*χ', 1⟩ = ⟨χ', χ⟩, and since that's a nonnegative integer — and thus a fortiori real — it is also equal to ⟨χ, χ'⟩ as claimed.

NB Artin's definition of the inner product on class functions, or indeed on any functions on G (Chapter 9, (5.8), page 318), makes this inner product semilinear in the first variable. This is consistent with his treatment elsewhere in the book (see Chapter 4, (4.4), page 250), and is occasionally more convenient — e.g. here we wouldn't have had to finish by observing that the inner product is real and thus its own conjugate — but is rarer than the convention that we (following Axler) have been using. We have to tweak our development accordingly.

The operators introduced by Artin towards the end of his proof of (most of) Theorem 5.9 have further uses when φ is the character of an irreducible representation V. Indeed consider Pφ := N−1 ΣgG φ*(g) ρ(g) acting on any representation W. Since φ is a class function we again find that Pφ is a G-endomorphism of W. Thus (by Schur) if W is irreducible then Pφ is a multiple of the identity. The correct multiple can then be identified by computing the trace of Pφ, and that's ⟨χ, φ⟩ where χ is the character of W (whether or not W is irreducible). Using the orthogonality relations we deduce that if W is irreducible then Pφ = 0, unless W is isomorphic with V, in which case Pφ = 1 / dim(V). It follows that if W is the direct sum of irreducibles Vi then dim(V) · Pφ is the projection to the direct sub-sum of those Vi that are isomorphic with V. In particular, this sub-sum is independent of the decomposition; it is often known as the V-isotypic subspace of W. Examples: if V is the trivial representation then the V-isotypic subspace is the fixed subspace we called VG, and indeed in this case Pφ is the average of ρ(g) over group elements g. If G is the group of two elements and V its nontrivial representation then the isotypic subspace is the (−1)-eigenspace for the non-identity element g of G, and indeed the associated projection Pφ is the familiar (1−ρ(g))/2, just as projection to the (+1)-eigenspace is the averaging operator (1+ρ(g))/2.

If you know about integral closures and related ideas (see for instance the beginning of the summary of field algebra) then you can also use Pφ to demonstrate the last part of Theorem 5.9, which Artin punts on proving: the dimension of every irreducible representation V is a factor of N = #(G). Proof: N/dim(V) is an eigenvalue of the action on the regular representation of N·Pφ = ΣgG φ*(g) ρ(g); since each of the numbers φ*(g) is an algebraic integer (it's the sum of roots of unity), it follows that so is N/dim(V), and since that quotient is rational we conclude that N / dim(V) ∈ Z as claimed.

Our final topic for Math 55a is the Sylow theorems, covered by Artin in Chapter 6, Section 4 (pages 205–208). Corollary 4.3 (a finite group G contains an element of order p for every prime factor p of |G|) is a theorem of Cauchy that predates Sylow; we'll start by giving its combinatorial proof, which will introduce the technique that we (following Artin) will use to prove the Sylow theorems. Note that Artin's statement of the “second Sylow theorem” ((4.6) on page 206) is not the usual terminology, which regards Corollary 4.7 as the second Sylow theorem and (4.6) as a step in one proof of this theorem.

The proof Artin gives for Sylow I can be extended to give the “1 mod p part of Sylow III. Artin shows that the combinatorial coefficient bin(pem, pe) he calls N is not a multiple of p; extending the same argument shows that N is congruent to m mod p (the first factor is m mod p, and each of the other factors is 1 mod p). [Simpler yet: work in charcteristic p, and apply e times the identity (1+X)p = 1 + Xp to get (1+X)pem = (1 + Xpe)m; now look at the Xpe coefficient.] Since m is also the number of orbits of each Sylow subgroup on the pem-element subsets of G, it follows that the number of Sylow subgroups is congruent mod p to m/m = 1.

Artin's treatment of Sylow II may be easier to read if you just replace each instance of s by H (recall that s was introduced as the coset 1H of  H).

First problem set / Linear Algebra I: vector space basics; an introduction to convolution rings
corrected 1.ix.10 (a for an in 10(i))

Second problem set / Linear Algebra II: dimension of vector spaces; torsion groups/modules and divisible groups

Third problem set / Linear Algebra III: Countable vs. uncountable dimension of vector spaces; linear transformations and duality
NB: The “trickiness” of 2ii has nothing to do with subtleties of infinite cardinals, the Axiom of Choice, etc.; all you should need about countability (beyond the definition) is that a countable union of countable sets is countable (as in Cantor's proof of the countability of the rationals).
For Problem 9, the dual is the product (not sum!) of the Vi. If F is finite (or countably infinite), I is countably infinite, and each Vi has positive finite dimension, then the direct sum V is countable, but its dual V* is not, so V and V* cannot be isomorphic (see also problem 2i).
For 2ii, the sequences px := (1, x, x2, x3, &hellip, xn, &hellip) in F are linearly independent. Using “eigenstuff” this is easy to see because px is an x-eigenvector for the left-shift operator, and eigenvectors with different eigenvalues are linearly independent.
Here's Curt McMullen's proof, which does not depend on the cardinality of F.

Fourth problem set / Linear Algebra IV: Eigenstuff, and a projective overture (in which “U” denotes the annihilator of U in PV*)

Fifth problem set / Linear Algebra V: Tensors, eigenstuff cont'd, and a bit on inner products
Problem 5i can be less confusing if you start with compositions of maps UVW involving three different vector spaces, and only then set all of them equal V. Thus if A is in Hom(V,W) then the map taking X to AX goes from Hom(U,V) to Hom(U,W), i.e. from U*V to U*W, so it had better be IA. Likewise for B is in Hom(U,V) the map taking X to XB goes from V*W to U*W, so it ought to be B*I — not BI because we need a map from V* to U*. (Fortunately B and its adjoint operator B* have the same spectrum, so even if you made this mistake you could still get the rest of the problem right.) Of course, once you've surmised the answer IA + B*I, you still have to prove it; by linearity, it's enough to check that the formulas agree on pure tensors, or in convenient coordinates if you prefer — which is indeed a special case because the usual coordinates on U*V come from a basis of pure tensors, namely tensors of vectors in the dual basis for U* with the basis for V.

Sixth problem set / Linear Algebra VI: Pairings and inner products, cont'd

Seventh problem set / Linear Algebra VII: More about the spectrum, including spectral graph theory; symplectic structures
For 2i, changing each coordinate of a λ-eigenvector v to its absolute value preserves v, v and does not decrease Av, v. Thus by problem 1 the new vector is also a λ-eigenvector, and we get such an eigenvector with nonnegative entries. If μ is a negative eigenvalue, the same construction shows that the largest eigenvalue is at least |μ|.
• For 2ii, show that if vi > 0 but vj = 0 and aij > 0 then we can increase the “Rayleigh quotient” Av, v/v, v by making vj positive but sufficiently small. So if not all vi are positive we can let I be the set of indices i of positive coordinates, and J its complement, giving a disconnection of A (note that I cannot be empty because then v is the zero vector). In the connected case, dimension 1 follows because else some eigenvector would have both positive and negative entries.
• Finally, for 2iii if there is a (−λ)-eigenvector v then changing each coordinate to its absolute value produces a (+λ)-eigenvector; then if we let I be the set of indices of positive coordinates of v then for every positive aij exactly one of i,j is in I, and conversely if there is such a subset I of {1,2,…,n} then changing each coordinate of a (+λ)-eigenvector to its negative produces a (−λ)-eigenvector.
• NB The term “connected”, and the connection with Problem 5 (see the next paragraph), should suggest forming a graph with n vertices and an edge for each i,j such that aij > 0. The graph is connected iff the matrix is “connected”, and then the condition of part iii is that this graph be bipartite. This generalizes the corresponding results for the adjacency matrix of the same graph. Note that indeed the hypercube graph of Problem 5 is bipartite and its adjacency matrix has maximal eigenvalue n and minimal eigenvalue −n.
• There is an analogous result for matrices with nonnegative entries that are not required to be symmetric: there is a real eigenvalue λ such that |μ| ≤ λ for all eigenvalues μ, even complex ones (which may exist in this generality). There are also equality conditions in terms of connectedness of an associated graph, which in this setting must be taken to be a directed graph. I don't know as slick a proof of this result, at least not without using more of the topological tools we'll develop in Math 55b.
For 5ii, let x be the vector whose v coordinate is +1 if v is in S and −1 if not. Then x is orthogonal to the all-1 vector, which generates the eigenspace for eigenvalue n. Since the second-largest eigenvalue is n−2, it follows that Ax, x⟩ ≤ (n−2) ⟨x, x with equality iff Ax = (n−2) x. This yields the desired bound, with equality iff each vertex in S has exactly one neighbor outside S, which in turn is soon seen to be equivalent to the condition that S is one of the 2n subsets on which one of the n coordinates is constant.

Eighth problem set / Linear Algebra VIII: Nilpotent operators and related topics; a natural inner product strucure on operators on a finite-dimensional space; exterior algebra; recreational applications of the sign homomorphism on Sn
corrected 29.x.10 (unipotent operator in 3, not “unipotent vector”)

Ninth problem set / Linear Algebra IX: More on exterior algebra and determinants
corrected to replace duplicated Problem 4 (ouch) and fix the Axler quote
corrected again (ouch-prime) to fix Problem 2: ⟨ω,ω⟩ = 0, not ⟨ω,ω'⟩ = 0
For problem 2, certainly if ω = v1 ^ v2 then ω ^ ω = v1 ^ v2 ^ v1 ^ v2 = − v1 ^ v1 ^ v2 ^ v2 = 0. The reverse implication can be proved by direct calculation, but it's easier to use the fact that if ω is not a pure tensor then it is of the form v1 ^ v2 + v3 ^ v4 for some basis {v1, v2, v3, v4}, and then ω ^ ω = 2 v1 ^ v2 ^ v3 ^ v4 is nonzero.
For problem 3, choose any basis B for V; let ψ be the wedge product of all 4n basis vectors, and use for W a basis of pure exterior products of 2n-element subsets of B. Then ⟨ω,ω'⟩ = 0 for all basis elements ω,ω' unless ω and ω' come from complementary subsets of B, in which case ⟨ω,&omega'⟩ = ±1. So W is the orthogonal direct sum of r := dim(W)/2 two-dimensional subspaces, each with signature (1,1). Hence W has signature (r, r). In each subsapce, ω + ω' and ω − ω' constitute an orthogonal basis, and the basis vectors have inner products 2 and −2 with themselves. So we can take for W+ the span of the r positive basis vectors, and for W the span of the r negative ones.

Tenth problem set / Linear Algebra X: Signatures and real polynomials; determinants and distances; Pontrjagin duality
corrected: in 3(ii), each xi and yj must be positive, not just greater than −1/2 [I was thinking of the application to Müntz for which the matrix entries are of the form tx,ty⟩ = 1/(x+y+1), not 1/(x+y)]
also, corrected 11.xi: in 1(i) the
Pj must be pairwise coprime, not merely distinct (and then there's no need to require P and each Pj to be monic)
and, corrected 23.xi (sorry!): in problem 3, the xi must already be assumed distinct in part (ii), else the matrix is only positive semidefinite.
For 1(iii), the point is to find a polynomial you can compute easily from the coefficients of P; it's no fair to just write something like “Πir(xi) Πjs(x+j), where P has r positive and s negative roots”: that's begging the question of determining r and s. Hint: the polynomial you construct need not have the same degree as P. For 8(ii), you're looking for a way to efficiently combine black boxes that compute DFT's on H and G/H into a technique for computing DFT's on G (so that one can recursively construct such a box if G is (say) a cyclic group of order 2r from DFT's for tiny groups like the 2-element group). This should be straightforward if G is simply the direct product of H and G/H. But what if (as suggested by the introductory paragraph) G is a cyclic group of order 2r and H is its subgroup of order 2, or the subgroup of order 2r−1 and index 2?

Eleventh and final problem set: Representations of finite groups
corrected overnight: in problem 3, Sk is not the same as kS
Problem 3 shows one approach and a few typical applications for a result often known as “Burnside's Lemma” [though it was already well-known in Burnside's day, going back at least to Cauchy (1845) according to this Wikipage]. I thank Lewis Stiller for pointing out the connection with representation theory some years back.
The construction illustrated by Problem 4(iii) is the ”Molien series” or ”Hilbert-Molien series” of the ring of invariants of a representation of a finite group. (See for example this Wikipage; as of this writing, the example there happens to be the same as the one in this problem.) In general the denominator can always be taken to be a product of factors 1-td, but it's rare for the numerator to be simple, let alone as simple as just 1 when we're dealing with the very special case of a complex reflection group.
For problem 7(iii), there are some alternative solutions for this particular case, but the intended solution was to show more generally: if V is a nonzero finite-dimensional representation, over some field subfield F of C, of a finite group G, then V is irreducible if and only if EndG(V) is a division algebra. Note that this is easy for F=C by Schur's Lemma: if V is the direct sum over j of isotypics Vjmj, then EndG(V) is the direct sum of matrix algebras GLmj(C), which is a division algebra if and only if mj=1 for one j and mj=0 for all other j. To do it in general, note that if some operator in EndG(V) is invertible then its inverse is also a G-endomorphism of V. Hence if EndG(V) is not a division algebra, there is some nonzero G-endomorphism T of V, and then the kernel and image of T are subrepresentations of V other than {0} and V itself. Conversely, if V is not reducible then it is a direct sum V1V2 of two nonzero representations, and the projection to either factor is a nonzero G-endomorphism of V that is not invertible.

The FINAL EXAM is now on my office door.
Correction (argh...): The n-by-n matrix A of the introductory paragraph of the first problem should have been called An. Then in part (i) take n=2m to get A2m [NB not A2n as I wrote earlier].
Another typo correction: in 3(iii), “G has a representation (V,ρ) whose associated character ρ takes the values…&rdquo should of course be “whose associated character χ takes&rdquo etc.