Derivation of Maxwell’s Equations via the Covariance Requirements of the Special Theory of Relativity, Starting with Newton’s Laws
Abstract: The purpose of this paper is to establish a connection between Maxwell’s equations, Newton’s laws, and the special theory of relativity. This is done with a derivation that begins with Newton’s verbal enunciation of his first two laws. Derived equations are required to be covariant, and a simplicity criterion requires that the four-vector force on a charged particle be linearly related to the four-vector velocity. The connecting tensor has derivable symmetry properties and contains the electric and magnetic field vectors. The Lorentz force law emerges, and Maxwell’s equations for free space emerge with the assumption that the tensor and its dual must both satisfy first-order partial differential equations. The inhomogeneous extension yields a charge density and a current density as being the source of the field, and yields the law of conservation of charge. Newton’s third law is reinterpreted as a reciprocity statement, which requires that the charge in the source term can be taken as the same physical entity as that of the test particle and that both can be assigned the same units. Requiring covariance under either spatial inversions or time reversals precludes magnetic charge being a source of electromagnetic fields that exert forces on electric charges.

1. Introduction

Maxwell’s equations and the special theory of relativity are intimately related, and a rich literature exists that explores and elucidates this connection. One major theme is that one can start with the basic ideas of the theory of special relativity and, with some basic experimental laws and with a small number of intuitively simple assumptions, derive Maxwell’s equations. Another theme is that one can start with two or more relatively simple physical laws, not explicitly relying on special relativity, and once again derive Maxwell’s equations.

The principal theme of the present paper is that there is a strong connection between Newton’s laws and Maxwell’s equations, and that this connection is provided by the special theory of relativity. The viewpoint here is wholly classical, although not mechanistic. The electric and magnetic fields are not regarded as mechanical entities, but as classical fields which are created by moving charges and which exert forces on moving charges. A suitable reinterpretation of Newton’s laws is assumed to apply to the masses that experience forces caused by the fields, and the fields are assumed to be governed by an independent set of equations. Use is made of Einstein’s two postulates: 1) that the equations must have the same form in equivalent coordinate systems and 2) that the speed of light must be the same in all such coordinate systems. The concept of covariance is applied in the sense of Minkowski, so that equivalent coordinate systems are taken to be those where the coordinates of one are related to those of another by a Lorentz transformation.

In regard to Newton’s laws being used as a starting point, the idea of such goes at least as far back as 1948, when Feynman  showed Dyson a “proof” of Maxwell’s equations “assuming only Newton’s laws of motion and the commutation relation between position and velocity for a single non-relativistic particle”. That proof as reported by Dyson is perplexing, as it is difficult to see how a set of equations that predict propagation at a speed with a precise unique value should result from a formulation that does not explicitly involve the speed c of light. Feynman was using units in which c was numerically equal to unity, and the details of his thinking are encoded in the remark that the “other two Maxwell equations merely define the external charge and current densities”. The treatment in the present paper follows more traditional lines of thinking and is limited entirely to the realm of classical physics, given the normally accepted inclusion of the special theory of relativity in classical physics. Nevertheless, as is discussed further below, Feynman’s provocative remark supplies a crucial hint as to what should be an appropriate reinterpretation of Newton’s third law.

In regard to the more traditional treatments appearing in previous literature, one should first note that Maxwell’s equations preceded the special theory of relativity in the history of physics, and one might loosely state that relativity developed because of the need to ensure that Maxwell’s equations be independent of any relative velocity between coordinate systems. But, after the emergence of relativity as a fundamental cornerstone of physics, papers and books began to appear that “derived” Maxwell’s equations. A major category of such treatments takes Coulomb’s law, or equivalently, the “laws of electrostatics” as a starting point. The earliest such derivation was given by Page in 1912, and that treatment was subsequently refined in the 1940 textbook by Page and Adams. Frisch and Wilets in a 1956 paper criticize Page and Adams, stating that they use “an apparently overspecialized model: an emission theory of lines of force”, and give an alternate derivation, making a series of plausible (but not manifestly obvious) postulates, which can be construed as including Coulomb’s law. They also bring attention to a 1926 paper by Swann where two derivations, involving the use of invariance under Lorentz transformations, of equations resembling Maxwell’s equations are given.

More recent treatments making use of Coulomb’s law were given by Elliott  and Tessman  in 1966, and a brief pedagogical development was given by Krefetz  in 1970, who drew attention to Feynman’s remark  , “it is sometimes said, by people who are careless, that all of electrodynamics can be deduced solely from the Lorentz transformation and Coulomb’s law”, which is followed by statements to the effect that it is always necessary to make some additional assumptions. Krefetz pointed out that “what constitutes a reasonable assumption is, after all, a matter of taste”.

Another theme for the derivation of Maxwell’s equations can be traced back to Landau in 1933. Podolsky, in the preface of his text with Kunz  , refers to discussions he had with L. D. Landau in 1933 on the goal of “presenting classical electrodynamics as theory based on definite postulates of a general nature, such as the principle of superposition, rather than [inductively inferring the theory from] experimental laws”. Thus, in the venerable Landau and Lifshitz series  , one finds an elegant and extensive development, which begins with the assumption of the existence of a four-potential, which presumes the validity of the principle of least action (Hamilton’s principle) in which the time and spatially varying potentials are treated as generalized coordinates, and which makes a series of plausible assumptions concerning the form of the action function. The treatment is an intricate blend of sophisticated mathematical constructions of theoretical physics and plausible assumptions, although in a footnote the authors state: “The assertions which follow should be regarded as being, to a certain extent, the consequence of experimental data. The form of the action for a particle in an electromagnetic field cannot be fixed on the basis of general considerations alone”.

It would unduly lengthen the present paper if one attempted to discuss, even in a cursory manner, all the papers and book passages that have been concerned with the derivation of Maxwell’s equations, and the present author cannot claim to have seen all those that are currently available, let alone digested them. Among those that should be mentioned are a sequence of papers by Kobe     which examine the topic from a variety of perspectives and which also give extensive references. Other papers of interest are those by Crater  , Jefimenko  , Ton  , Griffiths and Heald  , Crawford  , Neuenschwander and Turner  , Bork  , Goedecke  , and Hokkyo  .

The manner in which the present paper’s development differs from what has appeared previously in the literature is addressed more fully further below and in the concluding remarks section.

2. Relativistic Version of Newton’s Second Law

The discussion here begins, interlacing a brief summary of some basic tenets of the special theory of relativity, with a concise derivation of the covariant form, first given by Minkowski  , of Newton’s second law. The derivation differs from what has previously been published in that it specifically draws on Newton’s verbal enunciation  of his first two laws.

One seeks a description for the evolution of the space-time coordinates of a test particle in a “Minkowski” space    with (world point) coordinates ${X}^{1}=x$ , ${X}^{2}=y$ , ${X}^{3}=z$ , and ${X}^{4}=ct$ , where c is the speed of light. The coordinates of the particle itself are distinguished by a subscript P (for particle). Whatever equations are derived are required to be the same in any one of an equivalent set of coordinate systems, these being such that the speed of light is the same in each such system.

Suppose, for example, that $\Delta {X}^{\alpha }$ is a set of coordinate increments in one admissible coordinate system, with the spatial separation equal to c times the time separation, so that

$-{\left(\Delta {X}^{1}\right)}^{2}-{\left(\Delta {X}^{2}\right)}^{2}-{\left(\Delta {X}^{3}\right)}^{2}+{\left(\Delta {X}^{4}\right)}^{2}=\Delta {X}^{\alpha }{g}_{\alpha \beta }\Delta {X}^{\beta }=0.$ (1)

(The second version here makes use of common tensor notation   , with ${g}_{\alpha \beta }$ being the metric tensor, a diagonal matrix with diagonal elements −1, −1, −1, and +1). Then an analogous relation must hold for a second coordinate system, so that

$-{\left(\Delta {Y}^{1}\right)}^{2}-{\left(\Delta {Y}^{2}\right)}^{2}-{\left(\Delta {Y}^{3}\right)}^{2}+{\left(\Delta {Y}^{4}\right)}^{2}=\Delta {Y}^{\alpha }{g}_{\alpha \beta }\Delta {Y}^{\beta }=0.$ (2)

Admissible transformations that connect two such coordinate systems are taken to be linear relations, so that one can write, for an arbitrary set of increments (T for transformed),

$\Delta {X}_{T}^{\alpha }=\Delta {Y}^{\alpha }={\Lambda }_{\beta }^{\alpha }\Delta {X}^{\beta },$ (3)

where the transformation matrix ${\Lambda }_{\beta }^{\alpha }$ is independent of the coordinates. Given this relation, a brief derivation shows that Equation (2) follows from Equation (1) provided the transformation matrix satisfies the relation  

${\Lambda }_{\alpha }^{\gamma }{g}_{\gamma \delta }{\Lambda }_{\beta }^{\delta }={g}_{\alpha \beta }.$ (4)
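
The brief derivation referred to can be sketched as follows. This is one possible route, written out here only for illustration and not necessarily the one given in the cited reference. Substituting Equation (3) into the quadratic form of Equation (2) and regrouping the factors gives

$\Delta {Y}^{\gamma }{g}_{\gamma \delta }\Delta {Y}^{\delta }=\left({\Lambda }_{\alpha }^{\gamma }\Delta {X}^{\alpha }\right){g}_{\gamma \delta }\left({\Lambda }_{\beta }^{\delta }\Delta {X}^{\beta }\right)=\Delta {X}^{\alpha }\left({\Lambda }_{\alpha }^{\gamma }{g}_{\gamma \delta }{\Lambda }_{\beta }^{\delta }\right)\Delta {X}^{\beta }=\Delta {X}^{\alpha }{g}_{\alpha \beta }\Delta {X}^{\beta },$

where the last step uses Equation (4). The quadratic form therefore has the same value in both coordinate systems, and in particular it vanishes in the second system whenever it vanishes in the first.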

There is a wider class of transformations that leaves the speed of light unchanged, but the class represented by the above provides sufficient guidance for identification of a covariant theory. The transformations allowed by this relation include rigid body rotations, time reversals, spatial inversions, Lorentz’s and Einstein’s transformation between moving coordinate systems, and any arbitrary sequence of these. Following Poincaré, such transformations are here referred to as Lorentz transformations, and they form a group. The determinant of any matrix satisfying Equation (4) can be either +1 or −1, and one can also show that if ${\Lambda }_{4}^{4}>0$ for each of two consecutive transformations, then this is so for the combined transformation. The subgroup for which the determinant is +1 and for which ${\Lambda }_{4}^{4}>0$ is known as the proper orthochronous Lorentz subgroup. The full group includes this subgroup plus the time-reversal transformation (diagonal with elements 1, 1, 1, and −1) and the spatial inversion transformation (diagonal with elements −1, −1, −1, and +1) and all of the products formed from the subgroup and these two operators. One can also conceive of subgroups which exclude the time-reversal operator and which exclude the spatial-inversion operator. This feature of the full Lorentz group leaves open the question of whether the equations of classical physics should be invariant under time reversals or under spatial inversions. The mathematical structure allows, in regard to the inclusion of these two operators, the following choices: (i) neither, (ii) only one of the two, (iii) both, but only if simultaneous, and (iv) both, regardless of whether simultaneous or individual.
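
The classification just described can be checked numerically. The following is a minimal sketch in plain Python, not part of the original development: the helper names (`satisfies_eq4`, `boost_x`, `rotation_z`, `classify`) and the sign convention chosen for the boost are assumptions made only for this illustration. It tests Equation (4) in the matrix form $\Lambda^{T} g \Lambda = g$ and reports the sign of the determinant and of ${\Lambda }_{4}^{4}$ for a few sample transformations.

```python
import math

# Metric g_{alpha beta}: diagonal (-1, -1, -1, +1) for coordinates (x, y, z, ct).
G = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j in range(len(m)):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def satisfies_eq4(lam, tol=1e-12):
    """True if Lambda^T g Lambda = g, the matrix form of Equation (4)."""
    lhs = matmul(transpose(lam), matmul(G, lam))
    return all(abs(lhs[i][j] - G[i][j]) < tol
               for i in range(4) for j in range(4))

def boost_x(beta):
    """Boost along x with speed beta*c, |beta| < 1 (sign convention assumed)."""
    k = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[k, 0, 0, -k * beta],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [-k * beta, 0, 0, k]]

def rotation_z(theta):
    """Rigid rotation about the z axis by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

TIME_REVERSAL = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
SPATIAL_INVERSION = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]

def classify(lam):
    """Return (sign of det Lambda, sign of Lambda^4_4)."""
    return (1 if det(lam) > 0 else -1, 1 if lam[3][3] > 0 else -1)

samples = {
    "boost": boost_x(0.6),
    "boost then rotation": matmul(rotation_z(0.3), boost_x(0.6)),
    "time reversal": TIME_REVERSAL,
    "spatial inversion": SPATIAL_INVERSION,
    "both": matmul(TIME_REVERSAL, SPATIAL_INVERSION),
}
for name, lam in samples.items():
    print(name, satisfies_eq4(lam), classify(lam))
```

Only the first two samples report the pair (+1, +1) that characterizes the proper orthochronous subgroup. The remaining three correspond to the time-reversal operator, the spatial-inversion operator, and their product.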

The defining property, Equation (4), of the Lorentz transformation allows a definition  of a proper time ${\tau }_{P}$ for a point particle that is an invariant with respect to the orthochronous proper subgroup and with respect to spatial inversions, but which changes sign under time reversals. The trajectory of the test particle is described in parametric form with the space-time coordinates ${X}_{P}^{\alpha }$ all regarded to be functions of a parameter ${\tau }_{P}$ , defined so that time is a monotonically increasing function of ${\tau }_{P}$ , and so that increments of ${\tau }_{P}$ can be computed from

${c}^{2}{\left(\text{d}{\tau }_{P}\right)}^{2}={c}^{2}{\left(\text{d}{t}_{P}\right)}^{2}-{\left(\text{d}{x}_{P}\right)}^{2}-{\left(\text{d}{y}_{P}\right)}^{2}-{\left(\text{d}{z}_{P}\right)}^{2}=\text{d}{X}_{P}^{\alpha }{g}_{\alpha \beta }\text{d}{X}_{P}^{\beta }.$ (5)

The form of the second expression justifies the assertion that ${\left(\text{d}{\tau }_{P}\right)}^{2}$ is fully invariant. Equivalently, since $\text{d}{x}_{P}={v}_{P}\text{d}{t}_{P}$ , one can express the relation between ${t}_{P}$ and ${\tau }_{P}$ in any given coordinate system as

$\text{d}{\tau }_{P}=\left(1/{\kappa }_{P}\right)\text{d}{t}_{P};\text{ }{\kappa }_{P}={\left[1-{\left({v}_{P}/c\right)}^{2}\right]}^{-1/2}$ , (6)

where ${v}_{P}$ is the particle’s velocity.

The chief feature of this proper time is that it allows one to identify a tentatively suitable four-vector counterpart of the particle velocity as ${U}_{P}^{\alpha }=\text{d}{X}_{P}^{\alpha }/\text{d}{\tau }_{P}$ . Given the stated transformation properties of the differential $\text{d}{\tau }_{P}$ , this four-vector transforms under the full Lorentz group as

${U}_{P,T}^{\alpha }=\text{sign}\left({\Lambda }_{4}^{4}\right){\Lambda }_{\beta }^{\alpha }{U}_{P}^{\beta }.$ (7)

Here the indicated “sign” operator yields 1 if the argument is positive and −1 if it is negative. Because of the sign-factor, one would say that ${U}_{P}^{\alpha }$ is not a genuine four-vector, but some sort of pseudo-four-vector. (It is referred to in what follows as a pseudo-four-vector of the time-reversal kind.) Nevertheless, as long as one knows its transformation rule and formulates equations consistent with this, it can be used in a covariant formulation. In particular, one may note that its derivative with respect to proper time is a genuine four-vector.
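
As a concrete illustration of Equations (5) and (6), the short Python sketch below evaluates ${\kappa }_{P}$ and the components ${U}_{P}^{\alpha }={\kappa }_{P}\left({v}_{P,x},{v}_{P,y},{v}_{P,z},c\right)$ for a sample velocity, and checks that the invariant ${U}_{P}^{\alpha }{g}_{\alpha \beta }{U}_{P}^{\beta }$ equals ${c}^{2}$. The function names and the sample velocity are assumptions introduced for this example only.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def kappa(v):
    """Factor kappa_P of Equation (6) for a three-velocity v = (vx, vy, vz)."""
    speed_sq = sum(comp * comp for comp in v)
    return 1.0 / math.sqrt(1.0 - speed_sq / C**2)

def four_velocity(v):
    """Components U^alpha = dX^alpha / d tau = kappa * (vx, vy, vz, c)."""
    k = kappa(v)
    return [k * v[0], k * v[1], k * v[2], k * C]

def minkowski_norm_sq(u):
    """U^alpha g_{alpha beta} U^beta with diagonal metric (-1, -1, -1, +1)."""
    return -u[0] ** 2 - u[1] ** 2 - u[2] ** 2 + u[3] ** 2

v = (0.6 * C, 0.0, 0.0)               # sample velocity, 60% of c along x
u = four_velocity(v)
print(kappa(v))                        # close to 1.25
print(minkowski_norm_sq(u) / C**2)     # close to 1
```

The second printed ratio stays at unity for any admissible velocity, which is the property used later when the orthogonality of the four-vector force and the four-vector velocity is derived.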

The foregoing provides sufficient mathematical structure for the covariant interpretation of Newton’s first two laws. These, as originally enunciated (after translation to English) by Newton  , are as follows:

Law 1. Every body perseveres in its state of being at rest or of moving uniformly straight forward, except insofar as it is compelled to change its state by forces impressed.

Law 2. A change in motion is proportional to the motive force impressed and takes place along the straight line in which that force is impressed.

For the covariant interpretation of these, one makes use of what is available that transforms properly under Lorentz transformations. The covariant statement of the first law is that ${U}_{P}^{\alpha }$ must be a constant if there are no forces. The covariant expression for “change in motion” is $\text{d}{U}_{P}^{\alpha }/\text{d}{\tau }_{P}$ , and the proportionality constant must be what is ordinarily termed the “rest mass” ( ${m}_{o}$ ). The phrase “along the straight line in which that force is impressed” has to be loosely interpreted as saying that there is a contravariant vector termed “force” which is “parallel” to the “change in motion” four-vector. Thus one arrives at the relation

${m}_{o}\frac{\text{d}{U}_{P}^{\alpha }}{\text{d}{\tau }_{P}}={R}_{P}^{\alpha },$ (8)

where the right side, the four-vector force (Minkowski force), is a contravariant vector with as-yet-undefined components ${R}_{P}^{\alpha }$ . It should be a genuine four-vector and transform under the full Lorentz group as in Equation (3).

3. The Lorentz Force Law

The four-vector force of interest here is that which is associated with an electromagnetic field. A primary assumption is that there is an “external” part of this field which exists independently of the presence of the test particle. Whatever characterizes this field depends only on the four space-time (Minkowski) coordinates and is independent of the particle’s mass and velocity. However, the force exerted by this field may well depend on the particle’s velocity. Also, one assumes that the particle has an additional scalar property, a charge ${q}_{P}$ , which is defined so that all the force components exerted by the electromagnetic field on the particle are directly proportional to ${q}_{P}$ . For this particle, which has no other intrinsic structure, the four-vector velocity (Minkowski velocity) ${U}_{P}^{\alpha }$ is the only simple tensor that one has available for the formulation of a covariant expression for the four-vector force. One can argue that there is some weak limit, which probably has very wide applicability, where the four-vector force is linearly related to the four-vector velocity. Thus one is led to the plausible postulate (sort of a Hooke’s law of electromagnetism) that

${R}_{P}^{\alpha }={q}_{P}{\Phi }_{\beta }^{\alpha }{U}_{P}^{\beta },$ (9)

where covariance requires the entity ${\Phi }_{\beta }^{\alpha }$ be a tensor-like quantity (one contravariant index and one covariant index) that transforms appropriately under all Lorentz transformations. (The mathematical apparatus of tensor analysis is used here, with superscripted indices referred to as contravariant indices and subscripted indices referred to as covariant indices, and with the metric tensor ${g}_{\alpha \beta }$ available for the lowering of indices).

The actual transformation rule, for the purely contravariant form, as deduced from Equations (3) and (7), is

${\Phi }_{T}^{\alpha \beta }=\text{sign}\left({\Lambda }_{4}^{4}\right){\Lambda }_{\mu }^{\alpha }{\Lambda }_{\nu }^{\beta }{\Phi }^{\mu \nu },$ (10)

where the sign-factor is −1 for transformations that involve time-reversals, and +1 for those that do not. Thus, ${\Phi }_{\beta }^{\alpha }$ is, strictly speaking, not a tensor with regard to the full Lorentz group, but some sort of pseudo-tensor. To distinguish this type of pseudo-tensor from other types that appear further below, it is referred to as a pseudo-tensor of the time-reversal kind.

The identification in Equation (9) is not unique, and there are many possibilities, such as a quadratic expression of the form ${q}_{P}{\Phi }_{\beta \gamma }^{\alpha }{U}_{P}^{\beta }{U}_{P}^{\gamma }$ , where ${\Phi }_{\beta \gamma }^{\alpha }$ is some as yet undetermined tensor with three, rather than two, indices. But the expression in Equation (9) is the simplest of all such expressions.

Equation (9) is, of course, well-known, but existing discussions in the literature usually arrive at it after the electromagnetic field tensor ${\Phi }_{\beta }^{\alpha }$ has been previously arrived at by other means. Low  , for example, infers the electromagnetic field tensor first, with reference to experimental laws, and then argues that the four-vector force must be linear in the electromagnetic field tensor, and then argues that the only plausible covariant expression has to be of the form of Equation (9). In retrospect, this is very satisfying to one’s intuition, but the argument is not available in the present context, as one is assuming that one knows nothing about the electromagnetic field tensor at this point, other than that it is a tensor that adheres to a definite transformation law. (In what follows, the term “tensor” is used loosely to refer to both pseudo-tensors and genuine tensors).

Since the tensor ${\Phi }_{\beta }^{\alpha }$ exists independently of the presence of the charge, it is a continuum field. Each component is independent of any parameters characterizing the particle, but each depends on the space-time coordinates. The presence of other bodies that affect the motion of the test particle is presumed to be fully accounted for by the properties of the field, and such other bodies are regarded as sources of the field. (Note that the four-vector ${U}_{P}^{\beta }$ can never be identically zero, as there is always a fourth non-zero component. There is no contradiction here with the expectation that a body at rest can experience an electromagnetic force).

The remaining arbitrariness in the tensor ${\Phi }_{\beta }^{\alpha }$ is drastically reduced by a derivable orthogonality condition between the four-vector velocity and the four-vector force,

${R}_{P}^{\alpha }{U}_{P,\alpha }={R}_{P}^{\alpha }{g}_{\alpha \beta }{U}_{P}^{\beta }=0,$ (11)

which was first noticed by Minkowski. To derive this, one multiplies Equation (8) by ${U}_{P,\alpha }$ , performs the implied sum, and recognizes that ${U}_{P,\alpha }\text{d}{U}_{P}^{\alpha }/\text{d}{\tau }_{P}$ is $\left(1/2\right)\text{d}\left({U}_{P}^{\alpha }{U}_{P,\alpha }\right)/\text{d}{\tau }_{P}$ . But the sum ${U}_{P}^{\alpha }{U}_{P,\alpha }$ is a scalar, a constant equal to ${c}^{2}$ , so its derivative is zero, and Equation (11) results.

The implication of the orthogonality relation in regard to Equation (9) is that

${U}_{P,\alpha }{\Phi }^{\alpha \beta }{U}_{P,\beta }=0,$ (12)

where ${\Phi }^{\alpha \beta }={\Phi }_{\gamma }^{\alpha }{g}^{\gamma \beta }$ is the purely contravariant form (two contravariant indices) of the mixed tensor ${\Phi }_{\beta }^{\alpha }$ . Although the magnitudes of the components of ${U}_{P,\alpha }$ are constrained so that the inner product of its contravariant and covariant forms is ${c}^{2}$ , they are otherwise arbitrary, and the above equation must hold for all such vectors. The admissible arbitrariness leads to the deduction that the purely contravariant form of the electromagnetic tensor is antisymmetric, so that

${\Phi }^{\alpha \beta }=-{\Phi }^{\beta \alpha }.$ (13)

[The proof just given can be discerned, although in a somewhat different context, in the text by Melvin Schwartz  and in a 1986 paper by Kobe  . The relevant passage in Schwartz’s text is in a footnote, with the development appearing there attributed to D. Dorfan. Kobe gives an explicit derivation of Equation (13) from Equation (12). The derivation is also given in the paper by Neuenschwander and Turner  ].
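
For readers who wish to see the step from Equation (12) to Equation (13) spelled out, one possible argument is sketched here. It is offered as an illustration and is not claimed to be the form used in the cited references. The symbols ${S}^{\alpha \beta }$ and ${A}^{\alpha \beta }$ are introduced only for this purpose. Split the tensor into symmetric and antisymmetric parts,

${\Phi }^{\alpha \beta }={S}^{\alpha \beta }+{A}^{\alpha \beta };\text{ }{S}^{\alpha \beta }=\frac{1}{2}\left({\Phi }^{\alpha \beta }+{\Phi }^{\beta \alpha }\right);\text{ }{A}^{\alpha \beta }=\frac{1}{2}\left({\Phi }^{\alpha \beta }-{\Phi }^{\beta \alpha }\right).$

The antisymmetric part contributes nothing to the double sum in Equation (12). With ${U}_{P,\alpha }={\kappa }_{P}\left(-{v}_{P,x},-{v}_{P,y},-{v}_{P,z},c\right)$ , and with Latin indices running over 1, 2, 3, Equation (12) becomes, after division by ${\kappa }_{P}^{2}$ ,

$\sum _{i,j}{S}^{ij}{v}_{P,i}{v}_{P,j}-2c\sum _{i}{S}^{i4}{v}_{P,i}+{c}^{2}{S}^{44}=0.$

This polynomial in the velocity components must vanish for every velocity with magnitude less than c, so each of its coefficients must vanish separately. Setting ${v}_{P}=0$ gives ${S}^{44}=0$ , the terms linear in the velocity give ${S}^{i4}=0$ , and the quadratic terms give ${S}^{ij}=0$ . Hence ${\Phi }^{\alpha \beta }={A}^{\alpha \beta }$ , which is the statement of Equation (13).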

With the aid of some hindsight regarding the symbols that one uses to label the off-diagonal elements, the (antisymmetric) matrix representation (second index corresponding to columns) can be written with all generality as

$\left[{\Phi }^{\alpha \beta }\right]=\left(\begin{array}{cccc}0& -{B}_{z}& {B}_{y}& {E}_{x}/c\\ {B}_{z}& 0& -{B}_{x}& {E}_{y}/c\\ -{B}_{y}& {B}_{x}& 0& {E}_{z}/c\\ -{E}_{x}/c& -{E}_{y}/c& -{E}_{z}/c& 0\end{array}\right).$ (14)

Then, since ${\Phi }_{\beta }^{\alpha }={\Phi }^{\alpha \gamma }{g}_{\gamma \beta }$ , the postulated force relation of Equation (9) becomes

$\left(\begin{array}{c}{f}_{P,x}\\ {f}_{P,y}\\ {f}_{P,z}\\ {v}_{P}\cdot {f}_{P}/c\end{array}\right)={q}_{P}\left(\begin{array}{cccc}0& {B}_{z}& -{B}_{y}& {E}_{x}/c\\ -{B}_{z}& 0& {B}_{x}& {E}_{y}/c\\ {B}_{y}& -{B}_{x}& 0& {E}_{z}/c\\ {E}_{x}/c& {E}_{y}/c& {E}_{z}/c& 0\end{array}\right)\left(\begin{array}{c}{v}_{P,x}\\ {v}_{P,y}\\ {v}_{P,z}\\ c\end{array}\right).$ (15)

Here ${f}_{P}$ is the three-vector force which appears when the first three components of Equation (8) are written out explicitly in vector notation as

$\frac{\text{d}{p}_{P}}{\text{d}t}={f}_{P},$ (16)

with the momentum defined as

${p}_{P}=\frac{{m}_{o}}{{\left[1-\left({v}_{P}^{2}/{c}^{2}\right)\right]}^{1/2}}{v}_{P}={m}_{P}^{*}{v}_{P}.$ (17)

(Here ${m}_{P}^{*}$ is the relativistic mass). Also, in Equation (15), one has identified ${R}_{P}^{1}={\kappa }_{P}{f}_{P,x}$ with analogous relations for the 2nd and 3rd components. The fourth component, ${R}_{P}^{4}$ , is derived from the orthogonality relation of Equation (11).
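
The step leading to the fourth component can be written out explicitly. As a brief sketch, with the covariant components ${U}_{P,\alpha }={\kappa }_{P}\left(-{v}_{P,x},-{v}_{P,y},-{v}_{P,z},c\right)$ obtained by lowering the index of ${U}_{P}^{\alpha }={\kappa }_{P}\left({v}_{P,x},{v}_{P,y},{v}_{P,z},c\right)$ with the metric tensor, Equation (11) reads

${R}_{P}^{\alpha }{U}_{P,\alpha }=-{\kappa }_{P}^{2}\,{f}_{P}\cdot {v}_{P}+{\kappa }_{P}c{R}_{P}^{4}=0,$

so that

${R}_{P}^{4}={\kappa }_{P}\,{v}_{P}\cdot {f}_{P}/c.$

Division by ${\kappa }_{P}$ then gives the fourth entry, ${v}_{P}\cdot {f}_{P}/c$ , of the column on the left side of Equation (15).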

The first three components of Equation (15) yield, with vector notation,

${f}_{P}={q}_{P}\left(E+{v}_{P}\times B\right),$ (18)

which is the Lorentz force law. The fourth component equation is then only a redundant corollary of the first three, because ${v}_{P}\cdot \left({v}_{P}\times B\right)=0$ . Although one might not expect anything otherwise, it may be surprising to some that the Lorentz force equation is a direct consequence of the orthogonality of the four-vector velocity ${U}_{P}^{\alpha }$ and the four-vector force ${R}_{P}^{\alpha }$ . [A derivation of the Lorentz force equation via the special theory of relativity was apparently first given by Tolman, who used the previously derived Lorentz transformation laws of the electromagnetic field components to infer the force law in a system where the charge was moving with speed $v$ from the force law in a system where the charge was momentarily stationary. In retrospect, the derivation here is equivalent to that of Tolman, except that the requirement of covariance enables one to bypass using the explicit form of any Lorentz transformation].
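
The equivalence of the matrix relation in Equation (15) and the vector relation in Equation (18) can be confirmed with a small numerical sketch. The Python below is illustrative only: the function names are hypothetical, and the numerical values (including the value used for c) are arbitrary test numbers chosen to be exactly representable, not physical data.

```python
def mixed_field_matrix(E, B, c):
    """4x4 matrix on the right side of Equation (15), SI-style symbols."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, Bz, -By, Ex / c],
            [-Bz, 0.0, Bx, Ey / c],
            [By, -Bx, 0.0, Ez / c],
            [Ex / c, Ey / c, Ez / c, 0.0]]

def force_from_eq15(q, E, B, v, c):
    """Right side of Equation (15): q * matrix * (vx, vy, vz, c)."""
    phi = mixed_field_matrix(E, B, c)
    column = [v[0], v[1], v[2], c]
    return [q * sum(phi[i][j] * column[j] for j in range(4))
            for i in range(4)]

def cross(a, b):
    """Ordinary three-vector cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def lorentz_force(q, E, B, v):
    """Equation (18): f = q (E + v x B)."""
    vxb = cross(v, B)
    return [q * (E[i] + vxb[i]) for i in range(3)]

# Arbitrary test values (not physical data); |v| < c is respected.
c, q = 2.0, 3.0
E = (1.0, -2.0, 4.0)
B = (0.5, 1.5, -1.0)
v = (1.0, 0.5, -0.25)

four_force = force_from_eq15(q, E, B, v, c)
f = lorentz_force(q, E, B, v)
power_over_c = sum(v[i] * f[i] for i in range(3)) / c

print(four_force[:3], f)            # the two lists agree
print(four_force[3], power_over_c)  # fourth row equals v . f / c
```

The agreement of the fourth row with ${v}_{P}\cdot {f}_{P}/c$ reflects the redundancy noted above: the magnetic term does no work, so the fourth equation carries no information beyond the first three.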

With reference to the transformation rules in Equations (7) and (10), one deduces that, under pure time-reversals,

${v}_{P}\to -{v}_{P};\;\;E\to E;\;\;B\to -B;\;\;{f}_{P}\to {f}_{P}\;\;\left(\text{time-reversals}\right),$ (19)

while under pure spatial-inversions

${v}_{P}\to -{v}_{P};\;\;E\to -E;\;\;B\to B;\;\;{f}_{P}\to -{f}_{P}\;\;\left(\text{spatial-inversions}\right),$ (20)

so, given these rules, the Lorentz force relation is fully covariant under time-reversals and spatial-inversions.
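
A quick consistency check of the rules in Equations (19) and (20) is sketched below in Python. The helper names and sample values are assumptions made for this illustration. The force is recomputed from the transformed velocity and fields using Equation (18), and the result is compared with the stated transformation of ${f}_{P}$.

```python
def cross(a, b):
    """Ordinary three-vector cross product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def lorentz_force(q, E, B, v):
    """Equation (18): f = q (E + v x B)."""
    vxb = cross(v, B)
    return [q * (E[i] + vxb[i]) for i in range(3)]

def neg(a):
    """Componentwise sign reversal of a three-vector."""
    return [-x for x in a]

def time_reversed_force(q, E, B, v):
    """Force from the time-reversed inputs of Equation (19): v -> -v, B -> -B."""
    return lorentz_force(q, E, neg(B), neg(v))

def space_inverted_force(q, E, B, v):
    """Force from the inverted inputs of Equation (20): v -> -v, E -> -E."""
    return lorentz_force(q, neg(E), B, neg(v))

# Arbitrary sample values used only for the check.
q = 3.0
E = [1.0, -2.0, 4.0]
B = [0.5, 1.5, -1.0]
v = [1.0, 0.5, -0.25]

f = lorentz_force(q, E, B, v)
print(time_reversed_force(q, E, B, v) == f)        # force unchanged
print(space_inverted_force(q, E, B, v) == neg(f))  # force reverses sign
```

Both comparisons hold because the cross product is unchanged when both of its factors change sign and reverses when only one does.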

[The symbols assigned to the matrix elements in Equation (14) are appropriate for SI (rationalized MKS) units, where the components of the vector $E$ have the units of volts per meter, or newtons per coulomb, and where the components of $B$ have the units of teslas, or webers per square meter, or newton-seconds per coulomb-meter. In the commonly used Gaussian system of units, distances have units of centimeters, forces have units of dynes, and charge has the units of statcoulombs. The electric field is denoted by $E$ and has units of dynes per statcoulomb, and the magnetic field $B$ has the units of gauss. The unit of charge, the statcoulomb, is defined so that the gauss and the dynes per statcoulomb have the same units, and this requires $B/c$ to replace the SI magnetic field $B$ in the Lorentz force equation, so that for Gaussian units Equation (18) is replaced by

${f}_{P}={q}_{P}\left(E+\frac{{v}_{P}}{c}\times B\right).$ (21)

To render Equation (14) appropriate for Gaussian units, one need only replace the quantities ${B}_{x}$ , ${B}_{y}$ , and ${B}_{z}$ , wherever they appear, by ${B}_{x}/c$ , ${B}_{y}/c$ , and ${B}_{z}/c$ ].

4. Dual Nature of Electric and Magnetic Fields

At this point, it is appropriate to interject a remark from an autobiographical essay  written by Einstein relatively late in his life.

[Maxwell’s equations] can be grasped formally in satisfactory fashion only by way of the special theory of relativity. [They] are the simplest Lorentz-invariant field equations which can be postulated for an anti-symmetric tensor derived from a vector field.

While Einstein in this essay does not specify to just which antisymmetric tensor he is referring, one can infer from the development   in his 1916 Grundlagen paper [Section 20, Equations (59)-(61)] that the tensor is essentially the same as the tensor ${\Phi }^{\alpha \beta }$ that appears here in the four-vector force equation, Equation (9). [The actual relation, when Gaussian units are used, is that Einstein’s ${F}_{\alpha \beta }$ is $-c{\Phi }_{\alpha \beta }$ , where the indicated tensors are the purely covariant forms]. The vector field to which Einstein refers is the four-vector potential. In what follows, this same tensor is brought into play, and the development manages to sidestep any assumption that it has to be derived from a four-vector field.

The premise here is that the tensor ${\Phi }^{\alpha \beta }$ identified in the previous section is a natural building block for a covariant formulation of a set of partial differential equations (Einstein’s simplest Lorentz-invariant field equations) that govern the time and spatial evolution of the individual elements of the tensor. This tensor has six nonzero elements that are possibly different from each other, so the description of its evolution requires at least six equations. One might anticipate at first that this tensor can yield at most only four equations, which would be insufficient for a complete formulation. However, the antisymmetry of ${\Phi }^{\alpha \beta }$ and the specific property in Equation (4) of the Lorentz transformation allows one to identify  a “dual tensor” as

${\Psi }^{\alpha \beta }=\frac{1}{2}{d}^{\alpha \beta \mu \nu }{\Phi }_{\mu \nu },$ (22)

which can be used for the derivation of additional equations. Here the symbol ${d}^{\alpha \beta \mu \nu }$ is defined to be zero if any two of its indices are numerically equal, and to be unity (+1) if the ordered set of numbers $\alpha$ , $\beta$ , $\mu$ , $\nu$ is an even permutation of the integers 1, 2, 3, 4. If the permutation is odd, then the value is −1. [This symbol is occasionally, with various notations, referred to as the Levi-Civita symbol   and also as a permutation symbol  ].

This definition in Equation (22), in conjunction with Equation (14), leads to an entity which has the matrix representation (columns labeled by second index)

$\left[{\Psi }^{\alpha \beta }\right]=\left(\begin{array}{cccc}0& -{E}_{z}/c& {E}_{y}/c& -{B}_{x}\\ {E}_{z}/c& 0& -{E}_{x}/c& -{B}_{y}\\ -{E}_{y}/c& {E}_{x}/c& 0& -{B}_{z}\\ {B}_{x}& {B}_{y}& {B}_{z}& 0\end{array}\right).$ (23)

The relationship of Equation (22) is accordingly equivalent to the substitutions

$B\to E/c;\;\;E/c\to -B.$ (24)

Alternately, one can produce $\left[{\Phi }^{\alpha \beta }\right]$ from $\left[{\Psi }^{\alpha \beta }\right]$ with the reverse of these substitutions.
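
The passage from Equation (14) to Equation (23) by way of the definition in Equation (22) can be reproduced mechanically. The Python sketch below is an illustration under stated assumptions: indices run from 0 to 3 rather than 1 to 4, the function names are hypothetical, and the field values are arbitrary test numbers.

```python
from itertools import permutations

def levi_civita():
    """Nonzero values of d^{alpha beta mu nu}, keyed by index tuples (0..3)."""
    symbol = {}
    for perm in permutations(range(4)):
        # Parity is read off from the number of inversions.
        inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                         if perm[i] > perm[j])
        symbol[perm] = 1 if inversions % 2 == 0 else -1
    return symbol

def phi_contravariant(E, B, c):
    """Matrix of Equation (14)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Bz, By, Ex / c],
            [Bz, 0.0, -Bx, Ey / c],
            [-By, Bx, 0.0, Ez / c],
            [-Ex / c, -Ey / c, -Ez / c, 0.0]]

def lower_both(t):
    """T_{mu nu} = g_{mu a} T^{a b} g_{b nu} for the diagonal metric."""
    g = [-1, -1, -1, 1]
    return [[g[i] * t[i][j] * g[j] for j in range(4)] for i in range(4)]

def dual(t_up):
    """Equation (22): (1/2) d^{alpha beta mu nu} T_{mu nu}."""
    t_low = lower_both(t_up)
    psi = [[0.0] * 4 for _ in range(4)]
    for (a, b, m, n), sign in levi_civita().items():
        psi[a][b] += 0.5 * sign * t_low[m][n]
    return psi

def psi_expected(E, B, c):
    """Matrix of Equation (23)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ez / c, Ey / c, -Bx],
            [Ez / c, 0.0, -Ex / c, -By],
            [-Ey / c, Ex / c, 0.0, -Bz],
            [Bx, By, Bz, 0.0]]

# Arbitrary test values, chosen to be exactly representable.
E, B, c = (1.0, -2.0, 4.0), (0.5, 1.5, -1.0), 2.0
print(dual(phi_contravariant(E, B, c)) == psi_expected(E, B, c))
```

The same `dual` function applied to the matrix of Equation (23) carries out the substitutions of Equation (24) a second time, which reverses the signs of both field vectors.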

To determine the tensorial nature of the entity ${\Psi }^{\alpha \beta }$ , one first notes that a standard method  for calculation of a determinant yields (with “det” denoting the operation of taking the determinant of a matrix)

$\text{det}\left[\Lambda \right]={d}^{\alpha \beta \mu \nu }{\Lambda }_{\alpha }^{1}{\Lambda }_{\beta }^{2}{\Lambda }_{\mu }^{3}{\Lambda }_{\nu }^{4},$ (25)

where ${\Lambda }_{\alpha }^{\beta }$ is a given matrix’s element in the β-th row and α-th column. Moreover, since the sign of a determinant changes when any two rows are interchanged or when any two columns are interchanged, and since it is zero when any two are the same, one has

${d}^{\alpha \beta \mu \nu }{\Lambda }_{\alpha }^{{\alpha }^{\prime }}{\Lambda }_{\beta }^{{\beta }^{\prime }}{\Lambda }_{\mu }^{{\mu }^{\prime }}{\Lambda }_{\nu }^{{\nu }^{\prime }}={d}^{{\alpha }^{\prime }{\beta }^{\prime }{\mu }^{\prime }{\nu }^{\prime }}\text{det}\left[\Lambda \right].$ (26)

Because this applies to all matrices, it applies to any matrix that corresponds to a Lorentz transformation. Also, because one wishes the definition of the dual in Equation (22) to be applicable in all equivalent coordinate systems, the entity ${d}^{\alpha \beta \mu \nu }$ , if regarded as something analogous to a tensor, is required to be the same in all coordinate systems, so that, with a properly devised transformation rule, ${d}_{T}^{\alpha \beta \mu \nu }={d}^{\alpha \beta \mu \nu }$ . While this might in itself be taken as the transformation rule, it is helpful in the derivation of transformation rules for cases such as that of Equation (22) to express this in a manner analogous to that for a tensor with four contravariant indices. Such a transformation rule, as deduced from Equation (26), is

${d}_{T}^{{\alpha }^{\prime }{\beta }^{\prime }{\mu }^{\prime }{\nu }^{\prime }}={\left(\text{det}\left[\Lambda \right]\right)}^{-1}{\Lambda }_{\alpha }^{{\alpha }^{\prime }}{\Lambda }_{\beta }^{{\beta }^{\prime }}{\Lambda }_{\mu }^{{\mu }^{\prime }}{\Lambda }_{\nu }^{{\nu }^{\prime }}{d}^{\alpha \beta \mu \nu }.$ (27)

For the special case when the determinant is +1, this is the same as the transformation rule for a genuine tensor with four contravariant indices. However, the defining property, Equation (4), of the Lorentz transformation only requires the square of the determinant to be +1, so the determinant can be +1 or −1. If the determinant is −1, then the transformation rule in Equation (27) differs from that of a genuine tensor by a change in sign. The literature of tensor analysis refers to any entity that obeys such a rule as a pseudo-tensor, or a tensor density. Here, to distinguish it from other types of pseudo-tensors, it is referred to as a pseudo-tensor of the standard kind.
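Equations (25) and (26) can be checked numerically. The following is a minimal sketch, assuming zero-based indices in place of the paper's 1 to 4; the function names and the test matrix are hypothetical and serve only as illustration.

```python
from itertools import permutations

def permutation_symbol(indices):
    """d^{alpha beta mu nu}: +1 or -1 for even or odd permutations, 0 if an index repeats."""
    if len(set(indices)) < len(indices):
        return 0
    inversions = sum(
        1
        for i in range(len(indices))
        for j in range(i + 1, len(indices))
        if indices[i] > indices[j]
    )
    return -1 if inversions % 2 else 1

def contract_rows(matrix, rows):
    """Left side of Equation (26): sum over column indices of d times the product of matrix[row][col]."""
    total = 0
    for cols in permutations(range(4)):  # terms with a repeated column index vanish
        term = permutation_symbol(cols)
        for row, col in zip(rows, cols):
            term *= matrix[row][col]
        total += term
    return total

def det4(matrix):
    """Equation (25): the special case with the rows taken in their natural order."""
    return contract_rows(matrix, (0, 1, 2, 3))

# Arbitrary integer matrix (hypothetical values, chosen only for illustration).
M = [[2, 1, 0, 3],
     [0, 1, 4, 1],
     [1, 0, 2, 0],
     [3, 1, 1, 1]]

for rows in [(0, 1, 2, 3), (1, 0, 2, 3), (0, 0, 2, 3), (3, 2, 1, 0)]:
    # Both printed numbers should agree for every choice of row labels.
    print(rows, contract_rows(M, rows), permutation_symbol(rows) * det4(M))
```

Interchanging two row labels flips the sign of the contraction, and repeating a label makes it vanish, which is the content of Equation (26).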

Since ${d}^{\alpha \beta \mu \nu }$ is a pseudo-tensor of one kind, while ${\Phi }_{\mu \nu }$ is a pseudo-tensor of another kind, it follows from the basic rules of tensor calculus that their tensorial summed product in Equation (22) must be a pseudo-tensor of yet another kind, where the pseudo-tensor coefficient is the product of $\text{sign}\left({\Lambda }_{4}^{4}\right)$ (+1 if positive and −1 if negative) and the sign of the determinant. Thus ${\Psi }^{\alpha \beta }$ obeys the transformation rule

${\Psi }_{T}^{{\alpha }^{\prime }{\beta }^{\prime }}=\text{sign}\left({\Lambda }_{4}^{4}\right)\left(\text{det}\left[\Lambda \right]\right){\Lambda }_{\alpha }^{{\alpha }^{\prime }}{\Lambda }_{\beta }^{{\beta }^{\prime }}{\Psi }^{\alpha \beta }.$ (28)

Under time-reversals and under all Lorentz transformations of the orthochronous proper subgroup, ${\Psi }^{\alpha \beta }$ transforms as a genuine tensor, since the quantity $\text{sign}\left({\Lambda }_{4}^{4}\right)\left(\text{det}\left[\Lambda \right]\right)$ in such instances is +1. That factor is −1, however, for pure spatial-inversions. Consequently, a tensor that satisfies the rule in Equation (28) is here referred to as a pseudo-tensor of the spatial-inversion kind.
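The pseudo-tensor coefficients can be tabulated for representative transformations. The sketch below assumes the coordinate ordering (x, y, z, ct), so that ${\Lambda }_{4}^{4}$ is the last diagonal element, and assumes the usual form of a boost along x; the function names are hypothetical.

```python
import math

def det(matrix):
    """Determinant by Laplace expansion along the first row."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def sign(value):
    return 1 if value > 0 else -1

def coefficient_phi(lam):
    """Coefficient for the time-reversal kind: sign of Lambda^4_4."""
    return sign(lam[3][3])

def coefficient_psi(lam):
    """Coefficient in Equation (28). For a Lorentz matrix det is +1 or -1, so its sign equals its value."""
    return sign(lam[3][3]) * sign(det(lam))

def boost_x(beta):
    """Assumed standard boost along x with speed beta * c, acting on (x, y, z, ct)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, 0.0, 0.0, -gamma * beta],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [-gamma * beta, 0.0, 0.0, gamma]]

TIME_REVERSAL = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
SPATIAL_INVERSION = [[-1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, 1]]

for name, lam in [("time reversal", TIME_REVERSAL),
                  ("spatial inversion", SPATIAL_INVERSION),
                  ("boost", boost_x(0.6))]:
    print(name, coefficient_phi(lam), coefficient_psi(lam))
```

The second printed coefficient is −1 only for the spatial inversion, in line with the name given above to this kind of pseudo-tensor.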

Thus, the development leads to two pseudo-tensors, one which doesn’t transform properly under time-reversals, and the other which doesn’t transform properly under spatial-inversions. In principle, there is no reason why pseudo-tensors cannot be used equally as well as genuine tensors in developing a covariant formulation. One must, of course, adhere to the rules of tensor calculus for categorizing tensorial products of pseudo-tensors (or genuine tensors) of different kinds. A sum of a pseudo-tensor and a genuine tensor is disallowed, and so also a sum of two pseudo-tensors of different kinds. Equating any kind of pseudo-tensor to the corresponding null tensor would be covariant, since a null tensor can be regarded as being whatever kind of pseudo-tensor one wishes it to be.

5. Maxwell’s Equations in Free Space

One now asks what determines, within the context of a given admissible coordinate system, the time evolution of the electromagnetic fields. At a given point in space, the simplest assumption is that the momentary change in time of such fields depends only on the present values of those fields in the immediate region of the point if there are no sources nearby. Change is expected to result from imbalances, so spatial gradients are relevant. One assumes, in the absence of any other knowledge, that space has no intrinsic property, other than the speed of light c, so one seeks a covariant formulation introducing no further constants. All this suggests that one seek first-order partial differential equations, which, whatever they may be, are expressible in covariant form. The ordered set of derivative operators $\partial /\partial {X}^{\alpha }$ transforms in the same manner as does a covariant vector, so the natural candidates for a covariant formulation of a set of first-order partial differential equations are the equations:

$\frac{\partial }{\partial {X}^{\alpha }}{\Phi }^{\alpha \beta }=0;\text{ }\frac{\partial }{\partial {X}^{\alpha }}{\Psi }^{\alpha \beta }=0.$ (29)

In both cases, one can show explicitly from the transformation rules that, if the left side is identically zero in the reference coordinate system, then it is also zero in any equivalent coordinate system (related to the first by any Lorentz transformation of the full Lorentz group). That the quantities ${\Phi }^{\alpha \beta }$ and ${\Psi }^{\alpha \beta }$ are pseudo-tensors rather than genuine tensors is of no import, because the right sides can be regarded as null tensors of the same kind.

These, when written out explicitly, yield

$\nabla ×B-\frac{1}{{c}^{2}}\frac{\partial E}{\partial t}=0;\text{ }\nabla \cdot E=0;$ (30)

$\nabla ×E+\frac{\partial B}{\partial t}=0;\text{ }\nabla \cdot B=0,$ (31)

and these are recognized as Maxwell’s equations (in rationalized MKS or SI units) in free space with the absence of sources.
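As a plausibility check on Equations (30) and (31), one can test a trial solution by finite differences. The sketch below assumes a plane wave with $E$ along x and $B$ along y, travelling along +z with $|B|=|E|/c$; this trial form, the step sizes, and the function names are assumptions introduced here and are not part of the derivation.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def fields(z, t, e0=1.0, k=2.0 * math.pi):
    """Trial plane wave (an assumption): returns (E_x, B_y) at position z and time t."""
    phase = k * (z - C * t)
    return e0 * math.cos(phase), (e0 / C) * math.cos(phase)

def residuals(z, t, dz=1e-6, dt=1e-15):
    """Central-difference residuals of the nontrivial components of Equations (30) and (31)."""
    ex_zp, by_zp = fields(z + dz, t)
    ex_zm, by_zm = fields(z - dz, t)
    ex_tp, by_tp = fields(z, t + dt)
    ex_tm, by_tm = fields(z, t - dt)
    dex_dz = (ex_zp - ex_zm) / (2.0 * dz)
    dby_dz = (by_zp - by_zm) / (2.0 * dz)
    dex_dt = (ex_tp - ex_tm) / (2.0 * dt)
    dby_dt = (by_tp - by_tm) / (2.0 * dt)
    # x-component of curl B - (1/c^2) dE/dt, with only B_y(z, t) nonzero
    ampere = -dby_dz - dex_dt / C**2
    # y-component of curl E + dB/dt, with only E_x(z, t) nonzero
    faraday = dex_dz + dby_dt
    return ampere, faraday

print(residuals(0.3, 2e-9))
```

Both residuals are small compared with the individual terms, which are of order $k/c$ and $k$ respectively. The two divergence equations hold trivially for this trial field, since neither field has a z-component and both depend only on z and t.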

6. Maxwell’s Equations with Source Terms

The generalization of Equation (29) to allow for the presence of localized sources is achieved by putting terms that are possibly nonzero on the right sides, so that these become:

$\frac{\partial }{\partial {X}^{\alpha }}{\Phi }^{\alpha \beta }={G}^{\beta };\text{ }\frac{\partial }{\partial {X}^{\alpha }}{\Psi }^{\alpha \beta }={H}^{\beta }.$ (32)

Here to achieve covariance, the right sides must transform as the appropriate kinds of pseudo-vectors:

${G}_{T}^{\beta }=\text{sign}\left({\Lambda }_{4}^{4}\right){\Lambda }_{\gamma }^{\beta }{G}^{\gamma },$ (33)

${H}_{T}^{\beta }=\text{sign}\left({\Lambda }_{4}^{4}\right)\left(\text{det}\left[\Lambda \right]\right){\Lambda }_{\gamma }^{\beta }{H}^{\gamma },$ (34)

where the sign operator is +1 if the argument is positive and −1 if it is negative.

With some hindsight, the symbols depicting the components of the source four-vector ${G}^{\alpha }$ are here selected to be

${G}^{1}={\mu }_{o}{j}_{x},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{G}^{2}={\mu }_{o}{j}_{y},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{G}^{3}={\mu }_{o}{j}_{z},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{G}^{4}={\mu }_{o}c\rho$ . (35)

Tentatively, $\rho$ corresponds to charge per unit volume and the ${j}_{i}$ correspond to the Cartesian components of a charge flux vector. Note that, in regard to the transformation rule of Equation (33), the components of this pseudo-vector transform under time-reversals or spatial inversions as

$j\to -j,\text{\hspace{0.17em}}\text{\hspace{0.17em}}\rho \to \rho ,\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left(\text{time-reversals}\text{\hspace{0.17em}}\text{or}\text{\hspace{0.17em}}\text{spatial-inversions}\right).$ (36)

In contrast, were the components of the pseudo-four-vector ${H}^{\alpha }$ to be written down in a comparable form, with perhaps some constant different than ${\mu }_{o}$ , and with a “magnetic charge” flux vector ${j}_{m}$ and a “magnetic charge” density ${\rho }_{m}$ , the corresponding transformation rules, in accord with Equation (34), are

${j}_{m}\to {j}_{m},\text{\hspace{0.17em}}\text{\hspace{0.17em}}{\rho }_{m}\to -{\rho }_{m},\text{\hspace{0.17em}}\text{\hspace{0.17em}}\left(\text{time-reversals}\text{\hspace{0.17em}}\text{or}\text{\hspace{0.17em}}\text{spatial-inversions}\right).$ (37)

The latter, however, presents philosophical problems. If magnetic charge is to be an ingredient of Maxwell’s equations, then it should be an invariant, and ${\rho }_{m}$ should transform into itself under time-reversals and spatial-inversions. If the equations are not required to be covariant under time-reversals and spatial-inversions, then magnetic charge can be considered an invariant. But if they are to be covariant under either, not necessarily both, then magnetic charge has to be left out of Maxwell’s equations. [There is, however, a possibility that the equations can be invariant under these transformations, but only when they are applied simultaneously, yielding a total inversion. Such a total inversion belongs to the proper subgroup of Lorentz transformations, for which $\text{det}\left[\Lambda \right]=1$].
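The sign patterns in Equations (36) and (37) follow mechanically from the rules in Equations (33) and (34). A minimal sketch, restricted to diagonal inversion matrices (for which the determinant is the product of the diagonal elements) and using hypothetical component values:

```python
def sign(value):
    return 1 if value > 0 else -1

def diagonal(entries):
    """4x4 diagonal matrix acting on components ordered as (x, y, z, ct)."""
    return [[entries[i] if i == j else 0 for j in range(4)] for i in range(4)]

def diagonal_det(lam):
    product = 1
    for i in range(4):
        product *= lam[i][i]
    return product

def transform_g(lam, g):
    """Equation (33): pseudo-four-vector of the time-reversal kind."""
    coeff = sign(lam[3][3])
    return [coeff * sum(lam[b][c] * g[c] for c in range(4)) for b in range(4)]

def transform_h(lam, h):
    """Equation (34), evaluated here only for diagonal lam."""
    coeff = sign(lam[3][3]) * diagonal_det(lam)
    return [coeff * sum(lam[b][c] * h[c] for c in range(4)) for b in range(4)]

TIME_REVERSAL = diagonal([1, 1, 1, -1])
SPATIAL_INVERSION = diagonal([-1, -1, -1, 1])
TOTAL_INVERSION = diagonal([-1, -1, -1, -1])

# Hypothetical components: first three entries play the role of j, the last that of rho.
G = [1, 2, 3, 4]
H = [1, 2, 3, 4]

for name, lam in [("time reversal", TIME_REVERSAL),
                  ("spatial inversion", SPATIAL_INVERSION),
                  ("total inversion", TOTAL_INVERSION)]:
    print(name, transform_g(lam, G), transform_h(lam, H))
```

Under either single inversion the first three components of $G$ change sign while the fourth does not, and the opposite holds for $H$. Under the total inversion both sets of components are unchanged, which is the possibility noted in the bracketed remark above.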

The formulation in the present paper began with the assumption of a test particle that has only one scalar property other than mass, namely electric charge, which is presumed to be invariant under all Lorentz transformations. The consideration of sources other than those that are charge related would therefore appear to be outside the scope of the present paper. Insofar as experience indicates that magnetic charge is either nonexistent or extremely rare, or else doesn’t often interact with electric charge, and insofar as invariance under time-reversals and spatial-inversions has intrinsic intuitive appeal, the remainder of the paper proceeds with the assumption that ${H}^{\alpha }$ is identically zero.

In the symbol assignments of Equation (35), the quantity ${\mu }_{o}$ is a nonzero constant that one is free to select. For rationalized MKS (SI) units, ${\mu }_{o}=4\pi ×{10}^{-7}\,\text{N}/{\text{A}}^{2}$, where the stated units are newtons per ampere squared or, equivalently, henrys per meter. [The reasons for this choice are primarily historical. The resulting Maxwell’s equations, in either cgs or MKS units, in conjunction with the continuum extension of the Lorentz force law, yield the magnetostatic result

$\text{d}{F}_{12}=\frac{{\mu }_{o}}{4\pi }{I}_{1}{I}_{2}\left(\text{d}l×\int \frac{\text{d}{l}^{\prime }×\left(r-{r}^{\prime }\right)}{{|r-{r}^{\prime }|}^{3}}\right).$ (38)

Here $\text{d}{F}_{12}$ is the incremental force exerted by the electrical current ${I}_{1}$ in a thin wire on a length element $\text{d}l$ of a second wire that is carrying a current ${I}_{2}$ . The line integral passes along the closed circuit of the first wire in the direction of current flow. The element $\text{d}l$ is at the point $r$ , and ${r}^{\prime }$ denotes points on the first wire. In the original system of electromagnetic units, the unit of current, subsequently termed the abampere, was defined so that the coefficient ${\mu }_{o}/4\pi$ in the above formula was unity. Since force was in dynes, this rendered ${\mu }_{o}$ equal to $4\pi$ dynes per abampere squared. An international agreement in 1881 fixed the magnitude of the ampere to be 0.1 abampere. Since the dyne is ${10}^{-5}$ newtons, one has 1 dyne per abampere squared equal to ${10}^{-7}$ newtons per ampere squared, and hence the numerical value $4\pi ×{10}^{-7}$ results. This assignment and Equation (38) yield the standard definition of the ampere, coulombs per second, in terms of the hypothetical experiment: two parallel straight wires each carrying a current I are placed one meter apart. If I is one ampere, then the force per unit length exerted on one wire by the other is $2×{10}^{-7}$ newtons per meter.]
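The unit arithmetic in the bracketed remark can be reproduced in a few lines. The sketch below uses the conversion factors stated in the text together with the standard result that Equation (38) gives a force per unit length of ${\mu }_{o}{I}^{2}/\left(2\pi d\right)$ between two long parallel wires a distance d apart; the variable and function names are hypothetical.

```python
import math

DYNE_IN_NEWTONS = 1e-5        # 1 dyne = 10^-5 N
ABAMPERE_IN_AMPERES = 10.0    # 1 ampere = 0.1 abampere

# In electromagnetic units mu_o / (4 pi) is unity, so mu_o = 4 pi dyne per abampere squared.
mu_o_emu = 4.0 * math.pi
mu_o_si = mu_o_emu * DYNE_IN_NEWTONS / ABAMPERE_IN_AMPERES**2  # newtons per ampere squared

def force_per_length(current, separation, mu_o=mu_o_si):
    """Force per unit length (N/m) between two long parallel wires carrying equal currents."""
    return mu_o * current**2 / (2.0 * math.pi * separation)

print(mu_o_si)                     # close to 4 pi x 10^-7
print(force_per_length(1.0, 1.0))  # close to 2 x 10^-7
```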

With the additional introduction of a symbol ${ϵ}_{o}$ , defined as

${ϵ}_{o}=\frac{1}{{\mu }_{o}{c}^{2}},$ (39)

Equation (32), when written out explicitly and expressed in vector notation, yields:

$\nabla ×\left(B/{\mu }_{o}\right)-\frac{\partial \left({ϵ}_{o}E\right)}{\partial t}=j;$ (40)

$\nabla \cdot \left({ϵ}_{o}E\right)=\rho ;$ (41)

$\nabla ×E+\frac{\partial B}{\partial t}=0;$ (42)

$\nabla \cdot B=0.$ (43)

These equations, within which the speed of light c does not explicitly appear, have the form of Maxwell’s equations in rationalized MKS (SI) units that one commonly sees  in the literature, only with the substitutions in the first two of these of

$B/{\mu }_{o}=H;\text{\hspace{0.17em}}\text{\hspace{0.17em}}{ϵ}_{o}E=D.$ (44)
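For orientation, the numerical value implied by Equation (39) can be evaluated directly. The sketch assumes the conventional value of ${\mu }_{o}$ quoted above and the SI value of c; the variable names are hypothetical.

```python
import math

C = 299_792_458.0            # speed of light in m/s
MU_O = 4.0 * math.pi * 1e-7  # newtons per ampere squared, as quoted in the text

EPS_O = 1.0 / (MU_O * C**2)  # Equation (39), in farads per meter

print(EPS_O)                        # roughly 8.854e-12
print(1.0 / math.sqrt(MU_O * EPS_O))  # recovers c
```

Although c does not appear explicitly in Equations (40) to (43), it remains encoded in the product ${\mu }_{o}{ϵ}_{o}$.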

The partial differential equation for the conservation of charge,

$\nabla \cdot j+\frac{\partial \rho }{\partial t}=0,$ (45)

follows from Equations (40) and (41), because the divergence of the curl of $B$ is zero. An alternate derivation recognizes that, because ${\Phi }^{\alpha \beta }$ is antisymmetric, one has

$\frac{\partial }{\partial {X}^{\beta }}\frac{\partial }{\partial {X}^{\alpha }}{\Phi }^{\alpha \beta }=0,$ (46)

which, in conjunction with Equation (32), yields

$\frac{\partial {G}^{\beta }}{\partial {X}^{\beta }}=0,$ (47)

and this, with the symbol assignments in Equation (35), yields Equation (45).
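The vanishing in Equation (46) rests only on the contraction of an antisymmetric array with the symmetric pair of derivative operators. As an illustration, suppose (purely as an assumption) that the field has the form ${\Phi }^{\alpha \beta }={A}^{\alpha \beta }f\left({k}_{\gamma }{X}^{\gamma }\right)$ with constant ${A}^{\alpha \beta }$ and ${k}_{\gamma }$. The double divergence is then ${A}^{\alpha \beta }{k}_{\alpha }{k}_{\beta }{f}^{″}$, and the sketch below evaluates the coefficient of ${f}^{″}$ for hypothetical integer entries.

```python
def double_divergence_coefficient(amplitude, k):
    """Sum over alpha and beta of A^{alpha beta} k_alpha k_beta."""
    return sum(amplitude[a][b] * k[a] * k[b] for a in range(4) for b in range(4))

def antisymmetric(upper):
    """Build a 4x4 antisymmetric matrix from its six upper-triangle entries."""
    matrix = [[0] * 4 for _ in range(4)]
    pairs = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
    for (a, b), value in zip(pairs, upper):
        matrix[a][b] = value
        matrix[b][a] = -value
    return matrix

A = antisymmetric([1, -2, 3, 5, -7, 11])  # hypothetical amplitudes
K = [1, 2, 3, 4]                          # hypothetical wave four-vector

IDENTITY = [[1 if a == b else 0 for b in range(4)] for a in range(4)]

print(double_divergence_coefficient(A, K))         # antisymmetric case
print(double_divergence_coefficient(IDENTITY, K))  # symmetric counterexample
```

The first coefficient is zero for any choice of K, whereas a symmetric array generally gives a nonzero value, so the conservation law in Equation (47) is tied to the antisymmetry of ${\Phi }^{\alpha \beta }$.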

7. Equivalence of Types of Charge

The association of the quantities $\rho$ and $j$ in the source terms with charge per unit volume and with flux of moving charge can, at one level, be regarded as a postulate and as merely defining the units of charge. Such, however, may have little intuitive appeal, and it is consequently desirable to appeal to some principle that seems intrinsically more plausible. To this purpose, reference is made to Newton’s original enunciation  of his third law.

Law 3. To every action there is always an opposite and equal reaction; in other words, the actions of two bodies upon each other are always equal and always opposite in direction.

To apply this law to the present circumstances, it is necessary to interpret the word always as meaning “in all instances” rather than “at every moment of time”. Insofar as forces are transmitted instantaneously from one body to another or else are constant in time, the usual interpretation applies with action interpreted as the vector force. If the finite time of propagation of changes in force is to be taken into account, then the suggested replacement is that time integrals of forces be equal and opposite. This is consistent with the general idea of what is often stated as the principle of reciprocity: the effect per unit source strength of a source on a small “effect receiver” is the same when the roles of “source” and “effect receiver” are interchanged. Here the terminology is intentionally vague; a precise statement, based on the hint provided by Feynman, is:

The equations that govern the electromagnetic interaction between electrical charges must be such that, if a given moving charge is the source of an electromagnetic field that exerts a force on a second moving charge, then the same equations (Maxwell’s equations with source terms plus the Lorentz force law) apply for the determination of the force exerted on the first charge by the electromagnetic field caused by the second charge.

Implicit in this is that one can define units for charge so that all charges, whether sources or recipients of force, should have the same units.

To demonstrate that this reciprocity statement leads to the requirement of equal and oppositely directed time-integrals of forces, one begins with the hypothesis of $\rho$ and $j$ being appropriately interpreted in the manner stated above. Because one is here concerned with point particles, one expresses $\rho$ and $j$ as sums over point particles,

$\rho =\underset{Q}{\sum }\text{ }\text{ }{q}_{Q}\delta \left(x-{x}_{Q}\right);\text{ }j=\underset{Q}{\sum }\text{ }\text{ }{v}_{Q}{q}_{Q}\delta \left(x-{x}_{Q}\right)$ . (48)

Here ${q}_{Q}$ is the magnitude (possibly negative) of the Q-th charge, ${x}_{Q}$ is its position vector, and ${v}_{Q}$ is its velocity vector. The quantity $\delta \left(x-{x}_{Q}\right)$ is the three-dimensional δ-function, defined so that it is singularly concentrated at the instantaneous location of the particle and so that its volume integral is unity. The conservation of charge equation, Equation (45), holds trivially with these identifications, since

$\left({v}_{Q}\cdot \nabla +\frac{\partial }{\partial t}\right)\left(x-{x}_{Q}\right)=0.$ (49)
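The way in which a moving point charge satisfies Equation (45) can be illustrated numerically in one dimension. The sketch below replaces the δ-function by a narrow Gaussian of unit area, which is an assumption made only so that finite differences can be taken; the parameter values and function names are hypothetical.

```python
import math

def gaussian(u, sigma):
    """Unit-area Gaussian standing in for the delta function (an assumption)."""
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def rho(x, t, q=1.0, v=0.5, x0=0.0, sigma=0.1):
    """Charge density of a blob of charge q centred at x0 + v * t."""
    return q * gaussian(x - x0 - v * t, sigma)

def current(x, t, q=1.0, v=0.5, x0=0.0, sigma=0.1):
    """Charge flux: velocity times charge density, as in Equation (48)."""
    return v * rho(x, t, q, v, x0, sigma)

def continuity_residual(x, t, h=1e-5):
    """Central-difference estimate of dj/dx + d(rho)/dt."""
    dj_dx = (current(x + h, t) - current(x - h, t)) / (2.0 * h)
    drho_dt = (rho(x, t + h) - rho(x, t - h)) / (2.0 * h)
    return dj_dx + drho_dt

print(continuity_residual(0.12, 0.1))
```

The residual is small compared with either term separately, because the profile depends on x and t only through the combination $x-{x}_{Q}\left(t\right)$.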

Also, the four-vector ${G}^{\alpha }$ , with the substitution of Equation (48) into Equation (35), continues to transform as a pseudo-four-vector of the time-reversal type. To verify this, one first notes that the substitution renders

${G}^{\alpha }={\mu }_{o}\underset{Q}{\sum }\text{ }\text{ }{q}_{Q}{U}_{Q}^{\alpha }\left({\tau }_{Q}\right)\delta \left(x-{x}_{Q}\right)\frac{\text{d}{\tau }_{Q}}{\text{d}t},$ (50)

where ${U}_{Q}^{\alpha }$ is the Q-th charge’s four-vector velocity, and ${\tau }_{Q}$ is its proper time. One then notes that the quantity $\delta \left(x-{x}_{Q}\right)\text{d}{\tau }_{Q}/\text{d}t$ transforms as a scalar, because $\text{d}{\tau }_{Q}/\text{d}t$ is invariant under time-reversals, because the Jacobian for changing from one Minkowski space to another in a four-dimensional integration has unit magnitude (recall that the determinant of the Lorentz transformation is always 1 or −1), and because the time integration over a transformed time interval of $\text{d}{\tau }_{Q}/\text{d}t$ is $\Delta {\tau }_{Q}$, which is an invariant. The quantity ${U}_{Q}^{\alpha }$ is a pseudo-four-vector of the time-reversal kind, and a scalar times such a four-vector is also a pseudo-four-vector of the same kind.

Then, to derive a relation with a resemblance to Newton’s third law, one conceives of fields, ${E}_{Q}$ and ${B}_{Q}$ , caused by charge ${q}_{Q}$ , these satisfying the equations

$\nabla ×\left({B}_{Q}/{\mu }_{o}\right)-\frac{\partial \left({ϵ}_{o}{E}_{Q}\right)}{\partial t}={q}_{Q}{v}_{Q}\delta \left(x-{x}_{Q}\right);$ (51)

$\nabla \cdot \left({ϵ}_{o}{E}_{Q}\right)={q}_{Q}\delta \left(x-{x}_{Q}\right);$ (52)

$\nabla ×{E}_{Q}+\frac{\partial {B}_{Q}}{\partial t}=0;$ (53)

$\nabla \cdot {B}_{Q}=0.$ (54)

The force exerted on charge ${q}_{P}$ because of the influence of charge ${q}_{Q}$ is consequently

${f}_{PQ}={q}_{P}{E}_{Q}+{q}_{P}{v}_{P}×{B}_{Q},$ (55)

where the two fields are understood to be evaluated at the position, ${x}_{P}$ , of the charge ${q}_{P}$ . If charge ${q}_{P}$ should also be the source of a field that exerts an influence on charge ${q}_{Q}$ , then there must be analogous relations that result from the above with the interchange of the subscripts P and Q.

A derivation, analogous to what one finds often used for proving the invariance of Green’s functions  under reciprocity, proceeds from the Maxwell equations for the fields ${B}_{Q}$ , ${E}_{Q}$ , ${B}_{P}$ , and ${E}_{P}$ , and yields the result

$\begin{array}{l}\underset{ij}{\sum }\frac{\partial {M}_{PQ,ij}}{\partial {x}_{i}}{e}_{j}+\frac{\partial {N}_{PQ}}{\partial t}\\ ={q}_{P}\left({E}_{Q}+{v}_{P}×{B}_{Q}\right)\delta \left(x-{x}_{P}\right)+{q}_{Q}\left({E}_{P}+{v}_{Q}×{B}_{P}\right)\delta \left(x-{x}_{Q}\right).\end{array}$ (56)

[This is a special case of a more general relation previously derived by Goedecke]. The most important feature of the left side of this equation, from the standpoint of the present discussion, is that it is a sum of derivatives. The abbreviated differentiated quantities are

$\begin{array}{c}{M}_{PQ,ij}=\frac{1}{{\mu }_{o}}\left(-\underset{k}{\sum }\text{ }{B}_{P,k}{B}_{Q,k}{\delta }_{ij}+{B}_{P,j}{B}_{Q,i}+{B}_{P,i}{B}_{Q,j}\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{ϵ}_{o}\left(-\underset{k}{\sum }\text{ }{E}_{P,k}{E}_{Q,k}{\delta }_{ij}+{E}_{P,j}{E}_{Q,i}+{E}_{P,i}{E}_{Q,j}\right),\end{array}$ (57)

${N}_{PQ}={ϵ}_{o}\left({B}_{P}×{E}_{Q}+{B}_{Q}×{E}_{P}\right).$ (58)

Integrating both sides of Equation (56) over the volume of a large sphere surrounding the two charges, using Gauss’s theorem to convert some volume integrals to surface integrals, and then letting the radius of the sphere approach infinity, yields the result

${f}_{PQ}+{f}_{QP}=\frac{\text{d}}{\text{d}t}{\int }_{V}\text{ }{N}_{PQ}\text{d}V.$ (59)

[The vanishing of the surface integrations at infinite radius is not trivially true in all instances, but its plausibility is evident when one considers the transient case and allows that the disturbances propagate at a finite speed, so that they never reach infinity].

Equation (59) is the precise covariant statement of Newton’s third law in the present context, and its emergence justifies the physical equivalence of the charge in source terms to that of the test particle. If the particles are stationary, then the forces are equal and opposite, ${f}_{PQ}=-{f}_{QP}$ . It is beyond the scope of the present article to discuss all the circumstances when this is still a good approximation, but it is evident that time integrals over extended time intervals will tend to smooth out fluctuations, so that

$\int {f}_{PQ}\text{d}t\approx -\int {f}_{QP}\text{d}t.$ (60)
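The stationary case can be made concrete with the Lorentz force of Equation (55) and the Coulomb field that solves Equation (52) for a charge at rest. The sketch below assumes both charges are stationary, so that the magnetic terms vanish; the charge values, positions, and function names are hypothetical.

```python
import math

MU_O = 4.0 * math.pi * 1e-7
C = 299_792_458.0
EPS_O = 1.0 / (MU_O * C**2)  # Equation (39)

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def static_e_field(q_source, x_source, x):
    """Coulomb field at x of a stationary point charge at x_source."""
    d = [x[i] - x_source[i] for i in range(3)]
    r = math.sqrt(sum(c * c for c in d))
    scale = q_source / (4.0 * math.pi * EPS_O * r**3)
    return [scale * c for c in d]

def lorentz_force(q, e, v, b):
    """Equation (55): q E + q v x B, with E and B evaluated at the charge's position."""
    vxb = cross(v, b)
    return [q * (e[i] + vxb[i]) for i in range(3)]

def static_pair_forces(q_p, x_p, q_q, x_q):
    """Return (f_PQ, f_QP) for two stationary charges; B and v are zero by assumption."""
    zero = [0.0, 0.0, 0.0]
    f_pq = lorentz_force(q_p, static_e_field(q_q, x_q, x_p), zero, zero)
    f_qp = lorentz_force(q_q, static_e_field(q_p, x_p, x_q), zero, zero)
    return f_pq, f_qp

f_pq, f_qp = static_pair_forces(2e-6, [0.0, 0.0, 0.0], -5e-7, [0.3, 0.4, 0.0])
print(f_pq, f_qp)
```

The two printed forces are equal in magnitude and opposite in direction, which is the static limit of Equation (59) with a vanishing right side.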

[A comparable discussion to what appears above can be found in the 1966 paper by Tessman, who derived the electric and magnetic fields of an accelerating charge, using a number of plausible assumptions, but not explicitly invoking Maxwell’s equations. One of the assumptions, replacing Newton’s third law, was that “the total electric force which a stationary charge exerts upon a system of charges in steady state is equal in magnitude and opposite in direction to the total electric force exerted by the steady state system upon the steady charge”. This apparently was sufficient, although from the standpoint of the special theory of relativity, it is desirable to have a statement that does not require the existence of a special coordinate system in which all the relevant charges are stationary. Another assumption which Tessman makes is that the force exerted on a second charge at a given time is due only to the first charge’s dynamical state at the retarded time $t-\left(r/c\right)$, where r is the distance from the first charge’s position at that time to the second charge’s position at the current time. Such is more in keeping with the stronger use of the principle of reciprocity, but the statement can become unwieldy for general formulations when one considers that there may be more than one point on the first charge’s trajectory that meets this criterion].

There is still one further requirement to be satisfied: both ${q}_{P}$ and ${q}_{Q}$ must have the same units. This is guaranteed if the numerical values that are assigned to both charges are measured in a consistent manner. Consideration of the case of only two charges cannot resolve this, as the forces depend only on the product of the two charges. One can calibrate the charges, however, if one has a third charge ${q}_{3}$ . The value of ${q}_{3}$ need not be known, but the forces exerted by it on ${q}_{P}$ and ${q}_{Q}$ in the static limit, in conjunction with Coulomb’s law (which is derivable from the equations given above), enable one to determine the ratio of the two charges. Then the Coulomb’s law relation for the force between the two suffices to determine the magnitude of either ${q}_{P}$ or ${q}_{Q}$ . Given the choice of ${ϵ}_{o}$ represented by Equation (39), this would determine the numerical value in coulombs of either charge. One does not necessarily measure charge in this manner, but the mere fact that some measurement procedure exists ensures that one can always take the two charges to have the same units.
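The calibration procedure just described can be sketched as a short calculation. It is assumed here, for simplicity, that the third charge acts on ${q}_{P}$ and on ${q}_{Q}$ at the same distance, so that the ratio of force magnitudes equals the ratio of charge magnitudes; the function names and numerical values are hypothetical.

```python
import math

MU_O = 4.0 * math.pi * 1e-7
C = 299_792_458.0
EPS_O = 1.0 / (MU_O * C**2)  # Equation (39)

def coulomb_magnitude(q1, q2, r, eps_o=EPS_O):
    """Magnitude of the static force between two point charges a distance r apart."""
    return abs(q1 * q2) / (4.0 * math.pi * eps_o * r**2)

def calibrate(force_3p, force_3q, force_pq, separation, eps_o=EPS_O):
    """Recover |q_P| and |q_Q| from three measured force magnitudes.

    force_3p, force_3q: forces exerted by the unknown third charge on q_P and q_Q,
    each measured at the same distance (an assumption of this sketch).
    force_pq: force between q_P and q_Q at the given separation.
    """
    ratio = force_3p / force_3q                                  # |q_P| / |q_Q|
    product = 4.0 * math.pi * eps_o * separation**2 * force_pq   # |q_P| * |q_Q|
    return math.sqrt(product * ratio), math.sqrt(product / ratio)

# Synthetic "measurements" generated from hypothetical charges.
q_p, q_q, q_3 = 2e-6, 5e-7, 1.3e-6
measured = calibrate(coulomb_magnitude(q_3, q_p, 0.2),
                     coulomb_magnitude(q_3, q_q, 0.2),
                     coulomb_magnitude(q_p, q_q, 0.5),
                     0.5)
print(measured)
```

The value of the third charge drops out of the ratio, so the recovered magnitudes do not depend on it.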

8. Concluding Remarks

The present paper presents an alternate derivation of a standard result, i.e., Maxwell’s equations. Whether it provides significant new insight, a significantly new way of thinking, or a much simpler approach, is, as Krefetz  wrote many years ago in a similar context, “after all, a matter of taste”. Newton’s laws are an attractive starting point, as they seem the most intuitively appealing of all the laws of physics, even though the idea of their universal applicability has long since been abandoned. Their vestiges remain in practically all of current physics, and it is difficult to conceive of a curriculum in physics or in one of the many branches of applied physics that does not begin with Newton’s laws.

Various options were left open throughout the derivation as to what is meant by the “covariance requirement”. In the bulk of the literature on special relativity, covariance is implicitly understood to mean covariance under the orthochronous proper Lorentz group, and thereby such literature has implicitly ignored the possibility of achieving or not achieving covariance under time reversals and spatial inversions. As Dixon  points out: “[Although] attention is normally restricted to coordinate systems in which the time coordinate increases into the future and in which the spatial coordinates are right handed, this is very different from the question of whether the laws of physics themselves determine a particular orientation or time-orientation in spacetime. However, no fundamental law outside the domain of quantum physics has yet shown such an asymmetry. [Although] it is inconvenient to develop the laws of physics without ever making definitions which depend on an arbitrary orientation or time-orientation, the behavior of equations under a change of convention is important”. In this spirit, the development in the present paper has been careful to specify the manner in which quantities such as the four-vector particle velocity and the two electromagnetic field tensors transform under the full Lorentz group.

One difference between the treatment here and that in many treatments of electromagnetism is that no potentials are introduced. In retrospect, the setting of the four-vector source term ${H}^{\alpha }$ to zero in the differential equations, Equation (32), is equivalent to assuming that such potentials exist. [See, for example, Exercise 9 on page 140 of the text by Synge and Child]. The relevant question is what is intrinsically more plausible. The development here rests on a presumed symmetry in spacetime. If the equations are to be covariant under time-reversal or if they are to be covariant under spatial inversion, then potentials exist. In some literature, the competing argument is that magnetic monopoles do not exist. The argument here is that magnetic charge is precluded in source terms in equations of macroscopic physics if such equations are to be covariant under either spatial inversions or time-reversals (not necessarily both). In various places in the literature, one finds one or the other mentioned as precluding magnetic charge. In retrospect, it is clear why either type of covariance suffices, as $\nabla \cdot B$ changes sign under either time-reversal or spatial inversion.

In this respect, one may note an intriguing remark made some time ago by Schiff: “It is usually said that Newton’s laws and Maxwell’s equations are time-reversible. These are time-reversible if there are charges but no monopoles or if there are monopoles but no charges, but not if there are both”. The context does not make it clear exactly what Schiff meant, but the development here yields an interesting interpretation. The analysis here started with the hypothesis of the existence of a test particle with a scalar property. The ensuing result was that the fields that affect such a test particle are caused by the presence of other particles with the same type of scalar property. Perhaps other types of fields exist, such as are caused by some particles with a different type of scalar property. Given full covariance, our test particle cannot sense their presence. Suppose, on the other hand, one started with a test particle that had a scalar property which one chose to term “magnetic charge”. The same equations will result, but one can always choose the symbols for the elements of the field tensors so that the new Lorentz force will involve a linear combination of terms such as ${q}_{m}B$ and ${q}_{m}{v}_{P}×E$ . But in this case, time-reversal covariance or spatial inversion covariance will preclude the presence of source terms which involve electric charge. Thus, something like magnetic charge could very well exist, but at the macroscopic level its presence cannot be sensed in terms of forces on particles with electric charges, given that the equations are to be fully covariant.

Cite this paper: Pierce, A. (2019) Derivation of Maxwell’s Equations via the Covariance Requirements of the Special Theory of Relativity, Starting with Newton’s Laws. Journal of Applied Mathematics and Physics, 7, 2052-2073. doi: 10.4236/jamp.2019.79141.
References

   Einstein, A. (1905) Zur Elektrodynamik bewegter Körper. Annalen der Physik, 17, 891-921.
https://doi.org/10.1002/andp.19053221004

   Lorentz, H.A., Einstein, A., Minkowski, H. and Weyl, H. (1952) The Principle of Relativity. Dover, New York.

   Kilmister, C.W. (1970) Special Theory of Relativity. Pergamon, Oxford.

   Minkowski, H. (1908) Die Grundgleichungen für die elektromagnetischen Vorgänge in bewegten Körpern. Nachrichten der K. Gesellschaft der Wissenschaften zu Göttingen. Mathematisch-physikalische Klasse, 53-116.

   Dyson, F.J. (1990) Feynman’s Proof of the Maxwell Equations. American Journal of Physics, 58, 209-211.
https://doi.org/10.1119/1.16188

   Page, L. (1912) A Derivation of the Fundamental Relations of Electrodynamics from Those of Electrostatics. American Journal of Science, 34, 57-68.
https://doi.org/10.2475/ajs.s4-34.199.57

   Page, L. and Adams, N.I. (1940) Electrodynamics. Van Nostrand, New York, 129-154.

   Frisch, D.H. and Wilets, L. (1956) Development of the Maxwell-Lorentz Equations from Special Relativity and Gauss’s Law. American Journal of Physics, 24, 574-579.
https://doi.org/10.1119/1.1934322

   Swann, W.F.G. (1926) New Deductions of the Electromagnetic Equations. Physical Review, 28, 531-544.
https://doi.org/10.1103/PhysRev.28.531

   Elliott, R.S. (1966) Relativity and Electricity. IEEE Spectrum, 3, 140-152.
https://doi.org/10.1109/MSPEC.1966.5216743

   Tessman, J.R. (1966) Maxwell—Out of Newton, Coulomb, and Einstein. American Journal of Physics, 34, 1048-1055.
https://doi.org/10.1119/1.1972453

   Krefetz, E. (1970) A Derivation of Maxwell’s Equations. American Journal of Physics, 38, 513-516.
https://doi.org/10.1119/1.1976377

   Feynman, R.P. (1964) Lorentz Transformations of the Fields. In: Feynman, R.P., Leighton, R.B. and Sands, M., Eds., The Feynman Lectures on Physics, Mainly Electromagnetism and Matter, Addison-Wesley, Reading, 26.

   Podolsky, B. and Kunz, K.S. (1969) Fundamentals of Electrodynamics. Marcel Dekker, New York, 101-124.

   Landau, L.D. and Lifshitz, E.M. (1975) The Classical Theory of Fields. Pergamon, Oxford, 14-19, 44-46, 60-62, 66-75.

   Kobe, D.H. (1978) Derivation of Maxwell’s Equations from the Local Gauge Invariance of Quantum Mechanics. American Journal of Physics, 46, 342-348.
https://doi.org/10.1119/1.11327

   Kobe, D.H. (1980) Derivation of Maxwell’s Equations from the Gauge Invariance of Classical Mechanics. American Journal of Physics, 48, 348-353.
https://doi.org/10.1119/1.12094

   Kobe, D.H. (1984) Helmholtz Theorem for Antisymmetric Second-Rank Tensor Fields and Electromagnetism with Magnetic Monopoles. American Journal of Physics, 52, 354-358.
https://doi.org/10.1119/1.13668

   Kobe, D.H. (1986) Generalization of Coulomb’s Law to Maxwell’s Equations Using Special Relativity. American Journal of Physics, 54, 631-636.
https://doi.org/10.1119/1.14521

   Crater, H.W. (1994) General Covariance, Lorentz Covariance, the Lorentz Force, and Maxwell’s Equations. American Journal of Physics, 62, 923-931.
https://doi.org/10.1119/1.17682

   Jefimenko, O.D. (1996) Derivation of Relativistic Force Transformation Equations from Lorentz Force Law. American Journal of Physics, 64, 618-620.
https://doi.org/10.1119/1.18165

   Ton, T.-C. (1991) On the Time-Dependent, Generalized Coulomb, and Biot-Savart Laws. American Journal of Physics, 59, 520-528.
https://doi.org/10.1119/1.16812

   Griffiths, D.J. and Heald, M.A. (1991) Time-Dependent Generalizations of the Biot-Savart and Coulomb Laws. American Journal of Physics, 59, 111-117.
https://doi.org/10.1119/1.16589

   Crawford, F.S. (1992) Magnetic Monopoles, Galilean Invariance, and Maxwell’s Equations. American Journal of Physics, 60, 109-114.
https://doi.org/10.1119/1.16926

   Neuenschwander, D.E. and Turner, B.N. (1992) Generalization of the Biot-Savart Law to Maxwell’s Equations. American Journal of Physics, 60, 35-38.
https://doi.org/10.1119/1.17039

   Bork, A.M. (1963) Maxwell, Displacement Current, and Symmetry. American Journal of Physics, 31, 854-859.
https://doi.org/10.1119/1.1969140

   Goedecke, G.H. (2000) On Electromagnetic Conservation Laws. American Journal of Physics, 68, 380-384.
https://doi.org/10.1119/1.19441

   Hokkyo, N. (2004) Feynman’s Proof of Maxwell Equations and Yang’s Unification of Electromagnetic and Gravitational Aharonov-Bohm Effects. American Journal of Physics, 72, 345-347.
https://doi.org/10.1119/1.1617314

   Newton, I. (1999) The Principia: Mathematical Principles of Natural Philosophy. University of California Press, Berkeley, 416-417.

   Minkowski, H. (1909) Raum und Zeit. Physikalische Zeitschrift, 10, 104-111.

   Bergmann, P.G. (1976) Introduction to the Theory of Relativity. Dover, New York, 47-120.

   Einstein, A. (1916) Die Grundlage der allgemeinen Relativitätstheorie. Annalen der Physik, Series 4, 49, 769-822.
https://doi.org/10.1002/andp.19163540702

   Low, F.E. (1997) Classical Field Theory: Electromagnetism and Gravitation. Wiley, New York, 252-255, 259-260, 269-270.

   Cunningham, E. (1909) The Principle of Relativity in Electrodynamics and an Extension Thereof. Proceedings of the London Mathematical Society, 8, 77-98.
https://doi.org/10.1112/plms/s2-8.1.77

   Bateman, H. (1910) The Transformation of the Electrodynamical Equations. Proceedings of the London Mathematical Society, 8, 223-264.
https://doi.org/10.1112/plms/s2-8.1.223

   Poincaré, H. (1906) Sur la dynamique de l’électron. Rendiconti del Circolo Matematico di Palermo, 21, 129-176.
https://doi.org/10.1007/BF03013466

   Schwartz, H.M. (1971) Poincaré’s Rendiconti Paper on Relativity. Part I. American Journal of Physics, 39, 1277-1294. Part II, ibid., 40(6), 862-872 (June 1972). Part III, ibid., 40(9), 1282-1287 (September 1972).

   Weinberg, S. (1995) The Quantum Theory of Fields. Vol. 1, Cambridge University Press, Cambridge, 55-58.
https://doi.org/10.1017/CBO9781139644167

   Schwartz, M. (1987) Principles of Electrodynamics. Dover, New York, 127-129.

   Heaviside, O. (1889) On the Electromagnetic Effects Due to the Motion of Electrification through a Dielectric. Philosophical Magazine, 27, 324-339.
https://doi.org/10.1080/14786448908628362

   Lorentz, H.A. (1952) The Theory of Electrons. Dover, New York, 14.

   Tolman, R.C. (1911) Note on the Derivation from the Principle of Relativity of the Fifth Fundamental Equation of the Maxwell-Lorentz Theory. Philosophical Magazine, Ser. 6, 21, 296-301.
https://doi.org/10.1080/14786440308637034

   Einstein, A. (1951) Autobiographical Notes. In: Schilpp, P.A., Ed., Albert Einstein: Philosopher-Scientist, Tudor Publ., New York, 63.

   Møller, C. (1952) The Theory of Relativity. Oxford University Press, Oxford, 113-114.

   Levi-Civita, T. (1977) The Absolute Differential Calculus. Dover, New York, 158-160.

   Synge, J.L. and Schild, A. (1978) Tensor Calculus. Dover, New York, 131-135.

   Schreier, O. and Sperner, E. (1955) Modern Algebra and Matrix Theory. Chelsea, New York, 89.

   Weil, J.F. (2002) Units of Measurement. In: McGraw-Hill Encyclopedia of Science and Technology, 9th Edition, Vol. 19, McGraw Hill, New York, 64-72.

   Bailey, A.E. (2002) Electrical Units and Standards. In: McGraw-Hill Encyclopedia of Science and Technology, 9th Edition, Vol. 6, McGraw Hill, New York, 226-231.

   Taylor, B.N. (2001) The International System of Units (SI). Natl. Inst. Stand. Technol. Spec. Publ. 330. U.S. Government Printing Office, Washington DC, 1-2, 6-7, 11, 18, 32-33.

   Stratton, J.A. (1941) Electromagnetic Theory. McGraw-Hill, New York, 2-6.

   Landau, L.D. and Lifshitz, E.M. (1960) Electrodynamics of Continuous Media. Pergamon, London, 288-289.

   Morse, P.M. and Feshbach, H. (1953) Methods of Theoretical Physics. Vol. I, McGraw-Hill, New York, 804-806, 834-837.

   Dixon, W.G. (1978) Special Relativity: The Foundation of Macroscopic Physics. Cambridge University Press, Cambridge, 89.

   Gold, T. (1967) The Nature of Time. Cornell University Press, Ithaca, 216.

   Rindler, W. (1989) Relativity and Electromagnetism: The Force on a Magnetic Monopole. American Journal of Physics, 57, 993-994.
https://doi.org/10.1119/1.15782
