Sunday, May 15, 2016

Dear Dr B: If photons have a mass, would this mean special relativity is no longer valid?

Einstein and Lorentz.
[Image: Wikipedia]
“[If photons have a restmass] would that mean the whole business of the special theory of relativity being derived from the idea that light has to go at a particular velocity in order for it to exist/Maxwell’s identification of e/m waves as light because they would have to go at the appropriate velocity is no longer valid?”

(This question came up in the discussion of a recent proposal according to which photons with a tiny restmass might cause an effect similar to the cosmological constant.)

Dear Brian,

The short answer to your question is “No.” If photons had a restmass, special relativity would still be as valid as it’s always been.

The longer answer is that the invariance of the speed of light features prominently in the popular explanations of special relativity for historical reasons, not for technical reasons. Einstein was led to special relativity by contemplating what it would be like to travel with light, and then tried to find a way to reconcile an observer’s motion with the invariance of the speed of light. But the derivation of special relativity is much more general than that, and it is unnecessary to postulate that the speed of light is invariant.

Special relativity is really just physics in Minkowski space, that is, the 4-dimensional space-time you obtain after promoting time from a parameter to a coordinate. Einstein wanted the laws of physics to be the same for all inertial observers in Minkowski-space, i.e. observers moving at constant velocity. If you translate this requirement into mathematics, you are led to ask for the symmetry transformations in Minkowski-space. These transformations form a group – the Poincaré-group – from which you can read off all the odd things you have heard of: time-dilatation, length-contraction, relativistic mass, and so on.

The Poincaré-group itself has two subgroups. One contains just translations in space and time. This tells you that if you have an infinitely extended and unchanging space then it doesn’t matter where or when you do your experiment, the outcome will be the same. The remaining part of the Poincaré-group is the Lorentz-group. The Lorentz-group contains rotations – this tells you it doesn’t matter in which direction you turn, the laws of nature will still be the same. Besides the rotations, the Lorentz-group contains boosts, which are basically rotations between space and time. Invariance under boosts tells you that it doesn’t matter at which velocity you move, the laws of nature will remain the same. It’s the boosts where all the special relativistic fun goes on.

Deriving the Lorentz-group, if you know how to do it, is a three-liner, and I assure you it has absolutely nothing to do with rocket ships and lasers and so on. It is merely based on the requirement that the metric of Minkowski-space has to remain invariant. Carry through with the math and you’ll find that the boosts depend on a free constant with the dimension of a speed. You can further show that this constant is the speed of massless particles.
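For readers who want to see the shape of that calculation, here is a rough sketch in one space dimension. The symbol k for the constant with the dimension of a speed, and the restriction to 1+1 dimensions, are choices made here for illustration and are not notation from the post.

```latex
% Sketch under stated assumptions: 1+1 dimensions, k = the constant with
% the dimension of a speed that turns time into a length in the metric.
\begin{align*}
  % The quadratic form that must stay invariant:
  s^2 &= k^2 t^2 - x^2 \\
  % A linear map whose primed origin moves at velocity v in the unprimed frame:
  x' &= \gamma\,(x - v t), \qquad t' = \gamma\left(t - \frac{v x}{k^2}\right) \\
  % Insert into the quadratic form; the cross terms cancel:
  k^2 t'^2 - x'^2 &= \gamma^2\left(1 - \frac{v^2}{k^2}\right)\left(k^2 t^2 - x^2\right) \\
  % Invariance therefore fixes the prefactor:
  \gamma &= \frac{1}{\sqrt{1 - v^2/k^2}} \\
  % Velocities u = dx/dt compose as
  u' &= \frac{u - v}{1 - u v / k^2}, \qquad u = k \;\Rightarrow\; u' = k \\
  % A free particle of mass m has
  E^2 &= p^2 k^2 + m^2 k^4, \qquad
  u = \frac{\partial E}{\partial p} = \frac{p k^2}{E} \;\to\; k \quad (m \to 0)
\end{align*}
```

The speed k is the only one that every observer agrees on, and a particle moves at exactly k only if its mass vanishes. Nothing in these lines refers to light.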

Hence, if photons are massless, then the constant in the Lorentz-transformation is the speed of light. If photons are not massless, then the constant in the Lorentz-transformation is still there, but not identical to the speed of light. We already know however that these constants must be identical to very good precision, which is the same as saying the mass of photons must be very small.
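To get a feeling for how close "very good precision" is, one can compute how far below the limiting speed a photon with a tiny restmass would travel. The following is a minimal sketch, not a statement about measured values: the function name `speed_deficit` and the rest energy of 1e-18 eV are assumptions picked purely for illustration.

```python
import math


def speed_deficit(rest_energy_ev: float, energy_ev: float) -> float:
    """Return 1 - v/k for a free particle, where k is the limiting speed.

    Uses v/k = sqrt(1 - x) with x = (rest energy / total energy)^2,
    rewritten as x / (1 + sqrt(1 - x)) so that tiny deficits are not
    lost to floating-point rounding.
    """
    if rest_energy_ev < 0 or energy_ev <= 0:
        raise ValueError("energies must be positive (rest energy may be zero)")
    if energy_ev < rest_energy_ev:
        raise ValueError("total energy cannot be below the rest energy")
    x = (rest_energy_ev / energy_ev) ** 2
    return x / (1.0 + math.sqrt(1.0 - x))


if __name__ == "__main__":
    # Hypothetical photon rest energy, chosen for illustration only.
    rest_energy = 1e-18  # eV
    # Roughly: a radio-frequency photon, visible light, an X-ray photon.
    for energy in (1e-6, 2.0, 1e4):  # eV
        print(f"E = {energy:g} eV: 1 - v/k = {speed_deficit(rest_energy, energy):.3e}")
```

With these assumed numbers the deficit is far below anything a laboratory measurement of the speed of light could resolve, and it shrinks further as the photon energy grows.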

Giving a mass to photons is unappealing not because it violates special relativity – it doesn’t – but because it violates gauge-invariance, the most cherished principle underlying the standard model. But that’s a different story and shall be told another time.

Thanks for an interesting question!


  1. It is quite easy for the photon to have a gauge invariant mass term -- easier than W,Z or gluons.

  2. So does that essentially mean that both Maxwell and Einstein were inspired by a lucky coincidence? That, for instance there isn't a particular speed required for the electromagnetic wave that Maxwell predicted? And that all the experiments that have found the speed to be invariant were just lucky that they happened to find light going at that speed?

  3. ... wouldn't it also mean most cosmological/astronomical measurements dependent on a specific speed of light that hasn't changed with time could be wrong?

  4. Brian,

    I don't know what you mean by "lucky coincidence". The mass of photons is either zero or very, very tiny, which means that assuming it's zero is an excellent approximation. I don't know why one would call this a coincidence. Having said that, as I alluded to in my post, electrodynamics is a U(1) gauge theory and giving photons a mass term breaks the gauge symmetry (I don't know what the first commenter was referring to). It is still, however, Lorentz-invariant. Maxwell didn't know this though, so again, I don't know what you mean by coincidence. Best,


  5. What I meant was that if you are saying that light could travel at any speed in a particular medium then Maxwell's assumption that light was the electromagnetic wave he predicted, based on a specific speed, was based on a lucky coincidence. Similarly it seems implausible to make all the assumptions in cosmology and astronomy based on a constant speed of light.

    I was making these comments based on your remark 'If it has a restmass it can rest. I think you didn't quite get the point of the proposal. The proposal is that the photon does have a restmass. Hence, it can rest.' - which seems to imply that there is no specific speed a photon has to travel at in a particular medium, even if it is still restricted to below the 'ultimate' limit from relativity.

  6. Having never done the calculation myself, I think I had misunderstood the story of Einstein's calculation for gravitational deflection of starlight by the Sun. I understand there is a factor of 2 difference between the correct calculation and naively taking the limit of zero mass for v=c projectiles in a Newtonian theory. I had interpreted this as a large, discrete difference between massive and massless particles, but I guess that factor of 2 is actually a function of velocity, approaching 2 as v->c?

  7. Brian,

    We don't live in a universe at zero temperature. Photons have energy, and the typical energies of photons around us are always very much larger than the restmass. It's called the "ultra-relativistic limit". Take a photon that is produced in any atomic transition or some electromagnetic wave created by an antenna etc: the photon's energy is huge compared to the restmass, hence it moves with a speed that is very close to the limiting speed in Special Relativity. Maxwell would never have observed any photon at rest; where would he have gotten one? This is not a coincidence, it's a consequence of the way our universe evolved.

    Forget about the photons for a moment, it's the same for neutrinos. Neutrinos have a very tiny mass (compared to the other particles, still much larger than the mass that the photon could have), hence all the neutrinos we observe around us are ultra-relativistic, moving almost at the speed of light.

  8. Jason,

    I can't recall the calculation off the top of my head, but I don't think it's a continuous limit. The extra factor I think comes in roughly because in GR for light you have to take into account both the space and the time curvature. No matter how much you approximate this in Newtonian gravity, you'll never get the extra part. What you can do is the opposite limit: take GR and go to v/c << 1 and you should recover the Newtonian case.

  9. There are at least two ways in which the photon can have a gauge invariant mass. One is to couple the photon to an antisymmetric tensor 'gauge field' B using a B\wedge F term. This is a two-point derivative coupling, whose coefficient provides a pole in the photon propagator. Classically, one gets massive wave equations for the field strength F. The B field carries one degree of freedom, so one might think it is dual to a scalar, but that is not completely correct. The free B field is dual to a scalar, but when it couples to the photon, the duality relation involves the photon as well. Not to mention that duality means that the degrees of freedom in B and the scalar are non-locally related. This is a fully gauge invariant, unitary renormalizable theory of a massive photon which can couple to charged particles exactly as usual.

    The other way of giving a gauge invariant mass to the photon is the Higgs mechanism.

  10. Amitabha,

    You could do this, but I wouldn't call that a mass term. (It's not quadratic in A.) The Higgs mechanism leaves the photon massless. If you are speaking of some other symmetry breaking mechanism, well, that would have to break the gauge symmetry, which is what I'm saying.

  11. Sorry to keep asking questions, but it's interesting! I hadn't really thought about energy. So if a photon has mass and we know how energy varies with frequency, does that mean velocity should also vary with frequency?

  12. Sabine,

    The question was about photon mass (yes, I should have said 'gauge invariant mass' rather than 'gauge invariant mass term'). And a pole in the propagator, or a massive wave equation, are the signatures of a massive particle. So what I described produces a massive photon without breaking gauge invariance.

    The Higgs mechanism in the Standard Model leaves the photon massless, but that was not the one I meant. Consider the simple U(1) Higgs mechanism, in which the photon (and only the photon) couples to a complex scalar, which acquires a vev. This was the 'original' Higgs mechanism proposed by Higgs, which was used later in the SU(2) x U(1) context by Salam and Weinberg (and probably also Ward?). The U(1) Higgs mechanism produces a mass for the photon.

  13. Right, my misunderstanding was that the factor of 2 came from the photon's masslessness per se, when I guess in fact it shows up for any ultra-relativistic particle (otherwise the whole idea of a massive photon would have been put to bed observationally 100 years ago).

  14. "Poincaré-group" Poincaré group gauge theory maps Einstein-Cartan gravitation. Absent spacetime torsion, pseudo-Riemannian spacetime V4 is general relativity. Absent spacetime curvature, Weitzenböck spacetime A4 has a teleparallel gravitational energy-momentum pseudotensor anti-symmetric to parity transformation. Absent both, Minkowski spacetime M4 is special relativity.

    Mirror-image chiral pairs have divergent insertion energies within even trace spacetime torsion background (e.g., ECKS gravitation), fundamentally sourcing baryogenesis (6.1×10^(-10) bias, [hadrons less antihadrons] versus photons) given now weaker conservation laws. A geometric Eötvös experiment is diagnostic. Look.

  15. Sabine, nice answer to the question. I think one should clarify that the most important principle of special relativity is the principle of relativity itself, once Maxwell's theory is brought into the game (back then, only the Galilean transformations were applied to classical mechanics). The second postulate, the constancy of the speed of light, seems only to fix the constant parameter of the boosts to be equal to c. Otherwise, it is not really required. There are several papers about deriving special relativity without the second postulate.


  16. @Brian Clegg Massless boson photons detect no vacuum refraction, dispersion, dissipation, dichroism, or gyrotropy. Pulsar, gamma-ray burst, supernova, and quasar outputs across the spectrum arrive simultaneously and in register.

    Fermionic matter (quarks, hadrons) exhibits baryogenesis, parity violations, symmetry breakings, chiral anomalies, and Chern-Simons repair of Einstein-Hilbert action. Consider hadron-selective vacuum chiral anisotropy. Opposite shoes embed within chiral vacuum background (mount a left foot) with different energies. They vacuum free fall along non-identical minimum action trajectories. Theory is good, looking is better.

  17. So a non-zero photon mass would permit a photon state with zero momentum projection along its direction of travel. Would it also lead to a precession/transition in polarization, similar to neutrino oscillations? If so, aren't there some atomic or nuclear transitions that might yield a forum for placing a lower bound on the photon mass?

    Populate a spin 1, parity minus state with m=0 and measure the decay rate to a spin zero positive parity state relative to the decay rate of a spin 1, m=1 state.

  18. John,

    I don't know what your reference to neutrinos means. Neutrinos don't change their polarization; to begin with, they're not bosons. Neutrino oscillation is a mixture between different particles, you need at least two particles for that.

    As to the additional degree of freedom, you have to somehow get rid of this if you want your theory to be viable, at least approximately. There are ways to do that. But really the point of this post wasn't to advocate massive photons, just to point out that it's a common but unfortunate confusion to believe special relativity has something to do with the speed of light.

  19. Hi Bee,
    Ok. I understand that SR needs only some v(max). But if c is not v(max), then Michelson-Morley and other more refined experiments which prove frame independence of the velocity of light will be in trouble. Probably this fact restricts the deviation of c from v(max) to be very small. Do you agree? I suppose gauge invariance etc can be fixed!

  20. Always thought it was idiotic that Special Relativity is still mostly taught in the same convoluted way that Einstein originally established the theory. (Very doubtful that Einstein would still teach it that way if he was still around).

    As you pointed out, it just comes down to understanding what group is appropriate for spacetime. Other than that, SR can be derived from the same principles as Hamiltonian mechanics when requiring the necessary group properties for the allowed spatial transformations. It is not beyond high-school level math to then show that only the Galilean and the Lorentz transformations satisfy these axioms (and the latter needs to be picked if there is an upper limit to signal propagation). A great pedagogical paper demonstrating this was published as early as 1976. So why are most students still not introduced to SR in that way? Why, generally, this annoying focus on following the winding historical route when teaching physics?

  21. SH: "Einstein wanted the laws of physics to be the same for all inertial observers in Minkowski-space"

    I thought Minkowski suggested the geometric form some years after Einstein's publication and Einstein at first dismissed the idea. That seems to be the sequence given here:

  22. George,

    I didn't mean to say that Einstein knew this was what he wanted...

  23. Henning,

    I don't know, but I also find it annoying. When I was a teenager, I had great trouble following all these funky arguments with satellites and rocket ships and laser clocks and whatnot because one gets hung up on all kinds of irrelevant questions (what's the clock made of, can you actually see this, was the rocket ship always moving at this velocity, and so on). I was lucky that at some point I came across a textbook that said SO(3,1) is the symmetry-group of Minkowski space, and hence special relativity. I was like "Why didn't you tell me this earlier?"

  24. kashyap,

    Yes, as I said above, the difference depends on the mass of the photon, which has to be tiny. Actually the MM-type measurements do not give the strongest constraints on the photon mass.

  25. "Why didn't you tell me this earlier?" is a really good question. I'm glad you keep writing about the Poincaré group and c as its free parameter; I think it's useful.

    I have a question, though. Was this before you first encountered general curved spacetime? If so, what "clicked" for you when dealing with curved spacetime on which the Poincaré group doesn't behave naturally, or on spacetimes which admit no isometries (or really any preferred class of diffeomorphisms)?

  26. raattgift,

    It wasn't before I encountered the idea of curved spacetime, but before I knew how to mathematically deal with it. I don't recall having difficulties giving up on global symmetries if that is what you mean. I learned GR from Weinberg's textbook, and while in hindsight I don't like the emphasis he puts on the geodesic equation (which I believe is another one of these historical burdens), he basically starts with SR and then says, now let's do this for general spaces with general coordinate transformations and out comes GR (ok, I've oversimplified this, but basically that's the argument).

    Weinberg discusses spaces with isometries later in the book; it's a very useful introduction if you want to deal with the issue further (if you e.g. have to work with Stephani's book, you need to know already what this is all about).



  27. Can very much relate to the spaceship and synchronized clocks confusion. Fortunately, at the same time when this confused me in physics class we also learned about linear algebra and I realized that applying this to a Minkowski diagram was all it took.

    Much later I had a similar why-the-hell-did-they-not-teach-me-this-earlier moment when I came across Hestenes' Geometric Algebra.

    Tying back to your most recent post on the arxiv filter, I wonder how Hestenes' GA work would have fared if such a filter had existed back then. My guess: probably not very well, since he introduced a completely new language.

  28. Henning,

    Oh, yes, the geometric algebra. I only learned of this quite recently, but my reaction was also "why didn't I know of this earlier." As a student I spent quite some time with Clifford algebras, and this would have been dramatically helpful. Now I kind of feel too old to spend much time on it (time I don't have, and also, I don't really need it). In any case, it's arguably a problem that isn't quite as widely spread as the lasers and rocket ships.

    Yes, it is actually problems like this I have in mind, where someone introduces a new language, something that's not very tightly connected to the existing literature, and must at first glance look both unmotivated and useless. A similar concern would be e.g. Connes' non-comm "theory of everything" idea. If Connes wasn't Connes, this might have gone down the crackpot flush. And that's a huge problem: if the success of an idea depends on the person's previous history, that's no good.

  29. For purposes of deriving the Lorentz transformations, I don’t think we should be too critical of the assumption that light has no rest mass. As you said, there are good reasons (gauge invariance) for thinking that electromagnetic radiation is massless energy. The fact that people sometimes investigate the possibility of a tiny non-zero rest mass for light (which would be very unsettling for our fundamental theories) doesn’t really invalidate the practice of treating light as massless energy in the context of current theory (as Einstein did). It would be pedantic to criticize someone for using the phrase “speed of light” as a synonym for “the speed of massless energy”, except in the unusual context where that equivalence is being questioned.

    Also, it’s clearly not sufficient for a physical theory to simply assert that the invariant speed c has a finite value. We must say what that value is, in recognizable units of measure, in order to have a quantitative theory that can be compared with experiment.

    If we postulate that the speed of light (i.e., massless energy) has the invariant value c in terms of any system of inertial coordinates, this amounts to asserting the invariance of null intervals, meaning that we seek linear transformations such that the relation c^2 dt^2 = dx^2 + dy^2 + dz^2 is invariant. (See Einstein’s 1907, for example.) On the other hand, if we take the (ostensibly more enlightened) approach of postulating Minkowski spacetime (defined as the manifold with the “metric” that leaves the quadratic form c^2 dt^2 - dx^2 - dy^2 - dz^2 invariant), then it suffices to seek linear transformations such that c^2 dt^2 = dx^2 + dy^2 + dz^2 is invariant. Well, this is exactly what we did before. The only difference is that the ‘invariant light speed’ method just assumed the invariance of null intervals (which is all we need for the derivation), and deduced the invariance of all intervals, whereas the ‘Minkowski spacetime’ postulate already entails the invariance of all intervals from the start, which is more than we need to assume. This is ironic, because people sometimes think that taking Minkowski spacetime as an axiom is more economical, but it’s actually less economical (in this sense) than the bare invariant light speed axiom.

