The Quartic Formula using Principal Roots

The derivation of the quartic formula required a non-standard selection of square roots at one point, and I wanted a version which only used principal square roots. This can be accomplished by adding a sign factor $s=\pm1$ to the quartic formula, making it \[ x = \frac{-b\pm(y_1\pm y_2)\pm s y_3}{4a} , \] and one needs to ensure that $s$ is chosen so that $s\sqrt{y_1^2}\sqrt{y_2^2}\sqrt{y_3^2} = -b^3+4abc-8a^2d$ holds, where those are principal square roots. Using the sgn function, one can take $s=\mathop{\rm sgn}\Bigl((-b^3+4abc-8a^2d)\sqrt{y_1^2}\sqrt{y_2^2}\sqrt{y_3^2}\Bigr)$, at least when this is nonzero. However, considering that each $y_i$ is a long expression, it will be worthwhile to derive a shorter expression for $s$.

Since $y_1^2$, $y_2^2$, $y_3^2$ are the roots of a cubic with real coefficients, either all are real or one is real and the other two are complex conjugates. Without loss of generality, say $y_1^2$ is real. We also assume that all the $y_i^2$ are nonzero, since otherwise $s$ may be chosen to be $\pm1$ arbitrarily and the required identity will still hold.

Case 1: Only $y_1^2$ Real

Since $y_2^2$ and $y_3^2$ are nonreal complex conjugates, their principal square roots are also complex conjugates. Thus $\sqrt{y_2^2}\sqrt{y_3^2}$ is a positive real number. Now, $y_1^2$ cannot be negative, or else $\sqrt{y_1^2}\sqrt{y_2^2}\sqrt{y_3^2}$ would not be real. Thus $y_1^2$ is positive, so its principal square root is also a positive real number.

Therefore in this case we have that $\sqrt{y_1^2}\sqrt{y_2^2}\sqrt{y_3^2}$ is positive, and we need to take $s=\mathop{\rm sgn}(-b^3+4abc-8a^2d)$ for the required identity to hold.

Case 2: All $y_i^2$ Real

First, note that an even number of the $y_i^2$ are negative, otherwise $\sqrt{y_1^2}\sqrt{y_2^2}\sqrt{y_3^2}$ would not be real. Thus either all the $y_i^2$ are positive, or one is positive and two are negative. If all are positive, then $\sqrt{y_1^2}\sqrt{y_2^2}\sqrt{y_3^2}$ is also positive, and we need to take $s=\mathop{\rm sgn}(-b^3+4abc-8a^2d)$. If two are negative, then $\sqrt{y_1^2}\sqrt{y_2^2}\sqrt{y_3^2}$ is also negative, and we need to take $s=-\mathop{\rm sgn}(-b^3+4abc-8a^2d)$.

Computation of $s$

So $s$ should be taken to be $\mathop{\rm sgn}(-b^3+4abc-8a^2d)$ in all cases except when two $y_i^2$s are real and negative. Ideally, one would like a simple criterion to determine when exactly this happens. The following theorem will be useful for this.

Theorem. A cubic polynomial $y^3+By^2+Cy+D$ has three positive real roots if and only if $B<0$, $C>0$, $D<0$, and $(2B^3-9BC+27D)^2-4(B^2-3C)^3\leq0$. [Proof]

The final quantity is equal to $-27\Delta$, where $\Delta$ is the discriminant of the cubic. Applying this theorem to the cubic used in the quartic derivation requires the following assignments: \begin{align*} B &= -(3b^2 - 8ac) \\ C &= 3b^4 + 16a^2c^2 + 16a^2bd - 16ab^2c - 64a^3e \\ D &= -(-b^3+4abc-8a^2d)^2 \end{align*} It follows that $D\leq0$ (since it is the negative of a real number squared), and we can assume that $D\neq0$ since otherwise at least one $y_i$ is zero and the sign of $s$ is irrelevant. However, the signs of $B$, $C$, and $\Delta$ will vary. The following table shows which case we are in for all choices of the signs: \[ \begin{array}{ccccc} -27\Delta & -B & C & \text{case} & \max(-27\Delta,\min(-B,C)) \\ \hline + & + & + & \text{one $y_i^2$ real} & + \\ + & + & \leq0 & \text{one $y_i^2$ real} & + \\ + & \leq0 & + & \text{one $y_i^2$ real} & + \\ + & \leq0 & \leq0 & \text{one $y_i^2$ real} & + \\ \leq0 & + & + & \text{all $y_i^2>0$} & + \\ \leq0 & + & \leq0 & \text{two $y_i^2<0$} & \leq0 \\ \leq0 & \leq0 & + & \text{two $y_i^2<0$} & \leq0 \\ \leq0 & \leq0 & \leq0 & \text{two $y_i^2<0$} & \leq0 \end{array} \]

Note that the sign of $\max(-27\Delta,\min(-B,C))$ is a good criterion for checking when two $y_i^2$ are real and negative, as desired. Therefore we could use the factor $\mathop{\rm sgn}(\max(-27\Delta,\min(-B,C))$ in $s$, except for that when the expression is $0$ we'd like the output to be $-1$. This can be accomplished by the modified sign function $\mathop{\rm sgn}(\mathop{\rm sgn}(x)-1/2)$.

A similar trick is done with $\mathop{\rm sgn}(-b^3+4abc-8a^2d)$ to prevent $s$ from being zero when some $y_i$ is zero. Putting it all together, the choice \[ s = \mathop{\rm sgn}\Bigl(\bigl(\mathop{\rm sgn}(-b^3+4abc-8a^2d)-\tfrac12\bigr)\bigl(\mathop{\rm sgn}(\max(-27\Delta,\min(-B,C))-\tfrac12\bigr)\Bigr) \] correctly chooses the sign of the last radical when principal roots are used.