Sunday, August 12, 2007

Newton's Identities: Euler's Generalization

As mentioned earlier, Sir Isaac Newton's main purpose in coming up with his "identities" (see here for introduction to Newton's Identities) was to find a formula for determining whether two cubic equations possessed a common root.

Leonhard Euler was able to find a very general solution for finding this formula for any two equations of any degree. This general solution is today known as a resultant.

In today's blog, I will show the general solution for the resultant and then show that this equation has the properties that it equals 0 if and only if two equations have at least one solution in common.

The content in today's blog is taken from Galois' Theory of Algebraic Equations by Jean-Pierre Tignol.

Definition 1: Resultant

Let P = anXn + an-1Xn-1 + ... + a1X + a0 where an ≠ 0

Let Q = bmXm + bm-1Xm-1 + ... + b1X + b0 where bm ≠ 0

The resultant of P and Q is the determinant of the following (m+n) x (m+n) matrix:



For review of computing the determinant using the method of cofactor expansion, see here.

Here is the theorem justifying this construction:

Theorem: Common Roots of Two Polynomials

Let P,Q be the polynomials described in definition.

Let R be the resultant of P,Q

R = 0 if and only if P,Q have a common root

Proof:

(1) Assume that P,Q have a common root u

(2) Then (x - u) divides both P,Q and there exists P1, Q1 such that:

P = (x - u)P1

Q = (x - u)Q1

and degree of P1 is less than degree of P [See Theorem here for details]
and degree of Q1 is less than degree of Q [See Theorem here for details]

(3) We can also see that:

Q1 = Q/(x - u)
P1 = P/(x - u)

(4) So that:

PQ1 = (x - u)P1*Q/(x - u) = QP1

(5) We can see that P,Q,P1,Q1 are all polynomials and:

There exists ai such that:

P = anxn + an-1xn-1 + ... + a1x + a0

where an ≠ 0.

There exists bj such that:

Q = bmxm + bm-1xm-1 + ... + b1x + b0

where bm ≠ 0

There exists zk such that:

P1 = -(z1xn-1 + z2xn-2 + ... + zn-1x + zn)

where z1 ≠ 0

There exists yl such that:

Q1 = y1xm-1 + y2xm-2 + ... + ym-1x + ym

where y1 ≠ 0

(6) From step #4, we can see that:

PQ1 - QP1 = 0

(7) In the expression in step #6, we can see that the coefficient for xk = ∑ (i+j=k) (aiym-j) + ∑ (i+j=k) (bizn-j) since:

(a) For each term i in P, ai is the coefficient for xi

(b) For each term j in Q1, ym-j is the coefficient for xj.

Consider 1 = m - (m-1); 2 = m - (m - 2); m-1 = m - (1); m = m - (0)

(c) For PQ1, the coefficient is ∑ (i+j=k) (aiym-j)

(d) For each term i in Q, bi is the coefficient for xi

(e) For each term j in P1, -zn-j is the coefficient for xj.

Consider 1 = n - (n-1); 2 = n - (n-2); n-1 = n - (1); n = n - (0)

(f) For QP1, the coefficient is ∑ (i+j=k) (-bizn-j)

(g) For PQ1 - QP1, then, the coefficient is:

∑(i+j=k) (aiym-j + bizn-j) = ∑ (i+j=k) (aiym-j) + ∑(i+j=k) (bizn-j)

(8) We can further simplify this expression by defining s,t such that:

s = m-j
t = n-j

so that:

j = m - s = n - t

and:

i + j =k → i + m - s = k → i - s = k - m

i + j = k → i + n - t = k → i - t = k - n

which gives us:

∑ (i+j=k) (aiym-j) + ∑(i+j=k) (bizn-j) =

∑ (i - s = k - m) (aiys) + ∑ (i - t = k - n) (bizt)

(9) Now since the degree of P is n, the degree of P1 is n-1, the degree of Q is m, and the degree of Q1 is m-1, it follows that the degree of PQ1 = m+n-1 and the degree of QP1 = m+n-1.

(10) So, we can use the result in step #6 and the result in step #9 to build m + n linear equations where each linear equation represents a different value of k since the sum of coefficents for each power of x must equal 0.

for k = m + n - 1, we have:

any1x(m+n-1) + bmz1x(m+n-1) = 0

for k = m + n - 2, we have:

any2x(m+n-2) + an-1y1x(m+n-2) + bmz2x(m+n-2) + bm-1z1x(m+n-2) = 0

...

for k = 1, we have:

a1ymx1 + a0ym-1x1 + b1znx1 + b0zn-1x1 = 0

for k = 0, we have:

a0ym + b0zn = 0

(11) Factoring out xk from each of these equations, gives us:

for k = m + n - 1, we have:

any1 + bmz1 = 0

for k = m + n - 2, we have:

any2 + an-1y1 + bmz2 + bm-1z1 = 0

...

for k = 1, we have:

a1ym + a0ym-1 + b1zn + b0zn-1 = 0

for k = 0, we have:

a0ym + b0zn = 0

(12) For each of these linear equations, we can view the unknowns as consisting of ys and zt so that we get the following matrix representing a homogeneous system of linear equations:



(13) Now, it is clear that the equation above is none other than:

RX = 0

(14) We also know that there exists a nontrivial solution since from step #5 above,



(15) Since a nontrivial solution exists, we know that det R = 0 [See Theorem 6, here]

(16) Assume that det(R)=0

(17) It follows that R can be expressed as a homogeneous system of linear equations with a nontrivial solution. [See Theorem 6, here]

(18) We can label this nontrivial solution the same step #14 above:



(19) Multiplying out R with the nontrivial solution gives us the same set of m+n equations as step #11 above:

for the first equation, we have:

any1 + bmz1 = 0

for the second equation, we have:

any2 + an-1y1 + bmz2 + bm-1z1 = 0

...

for the (m+n-1)th equation, we have:

a1ym + a0ym-1 + b1zn + b0zn-1 = 0

for the (m+n)th equation, we have:

a0ym + b0zn = 0

(20) Since these are the exact same as step #11, we know that we can factor them out into P,Q,P1,Q1 just as we did in step #6 and step #7 above.

(21) To complete this proof, let's assume that P and Q are relatively prime -- that is, they don't have at least one solution in common.

(22) Using PQ1 = QP1, we conclude that P must divide P1 since it cannot divide Q (since P,Q are relatively prime).

(23) But this is impossible since P1 has a lower degree than P.

(24) Therefore, we have a contradiction and we can conclude that P and Q have a solution in common.

QED

References