# Naturalness is fuzzy, subjective, model-dependent, and uncertain, too

Luboš Motl, September 11, 2015

In an ordinary non-supersymmetric model of particle physics such as the Standard Model, the masses of (especially) scalar particles are «unprotected» which is why they «love» to be corrected by pretty much any corrections that offer their services.

For example, if you interpret the Standard Model as an effective theory approximating a better but non-supersymmetric theory that works up to the GUT scale or Planck scale, fifteen orders of magnitude above the Higgs mass, there will be assorted loop diagrams that contribute to the observable mass of the Higgs boson.

\[ m_h^2 = \dots + 3.5 m_{Pl}^2 - 2.7 m_{Pl}^2 + 1.9 m_{GUT}^2 - \dots \] and when you add the terms up, you had better obtain the observed value

\[ m_h^2 = [(125.1\pm 0.3)\,{\rm GeV}]^2 \] or so. It seems that we have been insanely lucky to get this small result. Note that the lightness of all other known massive elementary particles is derived from the lightness of the Higgs. Terms that were \(10^{30}\) times larger than the final observed squared Higgs mass came with both signs and (almost) cancelled one another with a huge relative accuracy.

A curious scientist simply has to ask: Why? Why does he have to ask? Because the high-energy parameters that the individual terms depended upon had to be carefully adjusted, or fine-tuned, to obtain the very tiny final result. This means that the «qualitative hypothesis», the Standard Model (or its completion relevant near the GUT scale or the Planck scale) with arbitrary values of the parameters only predicts the outcome qualitatively similar to the observed one – a world with a light Higgs – with a very low probability.

If you assume that the probability distribution on the parameter space is «reasonably quasi-uniform» in some sense, most of the points of the parameter space predict totally wrong outcomes. So the conditional probability \(P(LHC|SM)\), where LHC indicates the masses approximately observed at the LHC and SM is the Standard Model with arbitrary parameters distributed according to some sensible, quasi-uniform distribution, is tiny, perhaps of order \(10^{-30}\), because only a very tiny portion of the parameter space gives reasonable results.

By Bayes’ theorem, we may also argue that \(P(SM|LHC)\), the probability of the Standard Model given the qualitative observations at the LHC, is extremely tiny, perhaps \(10^{-30}\), as well. Because the probability of the Standard Model is so small, we may say that in some sense, the Standard Model – with the extra statistical assumptions above – has been falsified. It’s falsified just like any theory that predicts that an actually observed effect should be extremely unlikely. For example, a theory claiming that there is no Sun – the dot in the sky is just composed of photons that randomly arrive from that direction – becomes increasingly indefensible as you see additional photons coming from the same direction. 😉
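To make the Bayesian step explicit, one may compare the Standard Model with quasi-uniform parameters to some alternative hypothesis \(X\). The hypothesis \(X\) and the likelihood \(10^{-2}\) assigned to it are purely illustrative assumptions, not something derived here:

\[ \frac{P(SM|LHC)}{P(X|LHC)} = \frac{P(LHC|SM)}{P(LHC|X)} \cdot \frac{P(SM)}{P(X)} \approx \frac{10^{-30}}{10^{-2}} \cdot \frac{P(SM)}{P(X)} = 10^{-28}\cdot \frac{P(SM)}{P(X)} \]

Unless the ratio of the priors compensates for the factor of \(10^{-28}\), the posterior odds disfavor the Standard Model with those statistical assumptions. The conclusion is only as solid as the assumed likelihoods and priors.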

Before you throw the Lagrangian of the Standard Model to the trash bin, you should realize that what is actually wrong isn’t the Lagrangian of the Standard Model, an effective field theory, itself. What’s wrong are the statistical assumptions about the values of the parameters. Aside from the Standard Model Lagrangian, there exist additional laws in a more complete theory that actually guarantee that the value of the parameters is such that the terms contributing to the squared Higgs mass simply have to cancel each other almost exactly.

Supersymmetry is the main system of ideas that is able to achieve such a thing. The contributions of a particle, like the top quark, and its superpartner, the stop, to \(m_h^2\) are exactly the same in magnitude but opposite in sign, so they cancel. More precisely, this cancellation holds for unbroken supersymmetry in which the top and stop are equally heavy. We know this not to be the case. The top and the stop have different masses – or at least, we know this for other particle species than the stop.

But even when the top and the stop have different masses and supersymmetry is spontaneously broken, it makes the fine-tuning problem much less severe. You may clump the contributions from the particles with the contributions from their superpartners into «packages». And these «couples» almost exactly cancel, up to terms comparable to the «superpartner scale». This may be around \((1TeV)^2\), about 100 times higher than \(m_h^2\). So as long as the superpartner masses are close enough to the Higgs mass, the fine-tuning problem of the Standard Model becomes much less severe once supersymmetry is added.

Realistically, \(m_h^2 \approx (125GeV)^2\) is obtained from the sum of terms that are about 100 times higher than the final result. About 99% of the largest term is cancelled by the remaining terms and 1% of it survives. Such a cancellation may still be viewed as «somewhat unlikely» but should you lose sleep over it? Or should you discard the supersymmetric model? I don’t think so. You have still improved the problem with the probability from \(10^{-30}\) in the non-supersymmetric model to something like \(10^{-2}\) here. After all, supersymmetry is not the last insight about Nature that we will make and the following insights may reduce the degree of fine-tuning so that the number \(10^{-2}\) will be raised to something even closer to one. When and if the complete theory of everything is understood and all the parameters are calculated, the probability that the right theory predicts the observed values of the parameters will reach 100%, of course.

I think that the text above makes it pretty clear that the «naturalness», the absence of unexplained excessively accurate cancellations, has some logic behind it. But at the same moment, it is a heuristic rule that depends on many ill-defined and fuzzy words such as «sensible», «tolerable», and many others. When you try to quantify the degree of fine-tuning, you may write the expressions in many different ways. They will yield slightly different results.

At the end, all these measures that quantify «how unnatural» a model is may turn out to be completely wrong, just like the non-supersymmetric result \(10^{-30}\) was shown to be wrong once SUSY was added. When you add new principles or extract the effective field theory from a more constrained ultraviolet starting point, you are effectively choosing an extremely special subclass of the effective field theories that were possible before you embraced the new principle (such as SUSY, but it may also be grand unification, various integer-valued relationships that may follow from string theory and tons of related things).

So one can simply never assume that a calculation of the «degree of naturalness» is the final answer that may falsify a theory. It’s fuzzy because none of the expressions are clearly better than others. It’s subjective because people will disagree about what «feels good». And it’s model-dependent because qualitatively new models produce totally different probability distributions on the parameter spaces.

Moreover, the naturalness as a principle – even when we admit it is fuzzy, subjective, and model-dependent – is still uncertain. It may actually contradict principles such as the Weak Gravity Conjecture – which is arguably supported by a more nontrivial, non-prejudiced body of evidence. And the smallness of the Higgs boson mass or the cosmological constant may be viewed as indications that something is wrong with the naturalness assumption, if not disproofs of it.

Today, rather well-known model builders Baer, Barger, and Savoy published a preprint that totally and entirely disagrees with the basic lore I wrote above. The paper is called

Upper bounds on sparticle masses from naturalness or how to disprove weak scale supersymmetry

They say that the «principle of naturalness» isn’t fuzzy, subjective, model-dependent, and uncertain. Instead, it is objective, model-independent, demanding clear values of the bounds, and predictive. Wow. I honestly spent some time reading more or less the whole paper. Can there be some actual evidence supporting these self-evidently wrong claims?

Unfortunately, what I got from the paper was just laughter, not insights. These folks can’t possibly be serious!

They want to conclude that a measure of fine-tuning \(\Delta\) has to obey \(\Delta\lt 30\) and derive upper limits on the SUSY Higgs mixing parameter \(\mu\) (it shall be below \(350 GeV\)) or the gluino mass (they raise the limit to \(4 TeV\)). But what are the exact equations or inequalities from which they deduce such conclusions and, especially, what is their evidence in favor of these inequalities?

I was searching hard and the only thing I found was a comment that some figure is «visually striking» (meaning that it wants you to say «Someone had to fudge it»). Are you serious? Will particle physicists calculate particle masses by measuring how twisted their stomachs become when they look at a picture?

The «visually striking» pictures obviously show columns that are not of the same order. But does it mean that the model is wrong? When stocks fell almost by 5% a day, the graphs of the stock prices were surely «visually striking» and many people couldn’t believe it and many of those couldn’t sleep, either. But the drop was real. And it wasn’t the only one. Moreover, it’s obvious that different investors have different tolerance levels. In the same way, particle physicists have different tolerance levels when it comes to the degree of acceptable fine-tuning.

They want to impose \(\Delta\lt 30\) on everyone, as a «principle», but it’s silly. No finite number on the right hand side would define a good «robust law of physics» because there are no numbers that are so high that they couldn’t occur naturally. 🙂 But if you want me to become sure about the falsification of a model – with some ideas about the distribution of parameters – you would need \(\Delta\geq 10^6\) for me to feel the same «certainty» as I feel when a particle is discovered at 5 sigma, or \(10^3\) to feel the 3-sigma-like certainty.

Their proposal to view \(\Delta\gt 30\) as a nearly rigorous argument falsifying a theory is totally equivalent to using 2-sigma bumps to claim discoveries. The isomorphism is self-evident. When you add many terms of both signs, the probability that the absolute value of the result is less than 3 percent of the absolute value of the largest term is comparable to 3 percent. It’s close to 5 percent, the probability that you get a 2-sigma or greater bump by chance!
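The claim about sums of terms with both signs can be checked with a quick Monte Carlo. This is only a sketch under assumed conditions: six terms drawn independently and uniformly from \([-1,1]\), a choice made purely for illustration, and the function name `cancellation_fraction` is hypothetical.

```python
import random

def cancellation_fraction(threshold=0.03, n_terms=6, trials=100_000, seed=1):
    """Fraction of random signed sums with |sum| < threshold * max|term|.

    Assumption: the terms are independent and uniform on [-1, 1].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        terms = [rng.uniform(-1.0, 1.0) for _ in range(n_terms)]
        largest = max(abs(t) for t in terms)
        if abs(sum(terms)) < threshold * largest:
            hits += 1
    return hits / trials

frac = cancellation_fraction()
print(f"fraction of samples cancelling to better than 3%: {frac:.4f}")
```

For this toy distribution the fraction comes out at the percent level, the same order as the threshold itself. A different distribution of the terms would change the number by a factor of order one, which is part of the fuzziness being discussed.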

And be sure that because we have only measured one Higgs mass, it’s *one bump*. To say that \(m_h^2\) isn’t allowed to be more than 30 times smaller than the absolute value of the largest contribution is just like saying a 2-sigma bump is enough to settle any big Yes/No question in physics. Even more precisely, it’s like saying that there won’t ever be more than 2-sigma deviations from a correct theory. Sorry, it’s simply not true. You may *prefer* a world in which the naturalness could be used to make similarly sharp conclusions and falsify theories. But it is not our world. In our world with the real laws of mathematics, this is simply not possible. Even \(\Delta\approx 300\) is as possible as the emergence of a 3-sigma bump anywhere by chance. Such things simply may happen.
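The dictionary between \(\Delta\) and «sigmas» used above can be sketched numerically. The assumption, which is a rough heuristic rather than a theorem, is that a cancellation to one part in \(\Delta\) has a probability of about \(1/\Delta\), and that this is compared with the two-sided Gaussian tail probability. The function names are hypothetical.

```python
import math

def two_sided_p(n_sigma):
    """Probability of a Gaussian fluctuation of at least n_sigma, either sign."""
    return math.erfc(n_sigma / math.sqrt(2.0))

def equivalent_delta(n_sigma):
    """Fine-tuning Delta whose assumed probability 1/Delta matches n_sigma."""
    return 1.0 / two_sided_p(n_sigma)

for n in (2, 3, 5):
    print(f"{n} sigma: p = {two_sided_p(n):.2e}, Delta ~ {equivalent_delta(n):.3g}")
```

With these assumptions, 2 sigma corresponds to \(\Delta\) in the low twenties, 3 sigma to a few hundred, and 5 sigma to somewhat more than a million, in line with the orders of magnitude quoted in the text.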

Obviously, once we start to embrace the anthropic reasoning or multiverse bias, much higher values of \(\Delta\) may become totally tolerable. I don’t want to slide into the anthropic wars here, however. My point is that even if we reject all forms of anthropic reasoning, much higher values of \(\Delta\) than thirty may be OK.

But what I also find incredible is their degree of worshiping of random, arbitrary formulae. For example, Barbieri and Giudice introduced this naturalness measure

\[ \Delta_{BG}=\max_i \left| \frac{ \partial\log m_Z^2 }{ \partial \log p_i } \right| \]

You calculate the squared Z-boson mass out of many parameters \(p_i\) of the theory at the high scale. The mass depends on each of them, you measure the slope of the dependence in the logarithmic fashion, and pick the parameter which leads to the steepest dependence. This steepest slope is then interpreted as the degree of fine-tuning.
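As a minimal numerical sketch of this recipe, one may take a toy model in which \(m_Z^2\) is the difference of two large parameters. The toy function `toy_mz2`, the parameter values, and the finite-difference step are all illustrative assumptions and have nothing to do with a realistic spectrum calculation.

```python
import math

def toy_mz2(params):
    """Hypothetical toy model: m_Z^2 as a difference of two large parameters."""
    p1, p2 = params
    return p1 - p2

def delta_bg(mz2, params, eps=1e-6):
    """Largest |d log m_Z^2 / d log p_i|, via central differences in log p_i."""
    slopes = []
    for i, p in enumerate(params):
        up = list(params)
        down = list(params)
        up[i] = p * math.exp(eps)     # shift log p_i by +eps
        down[i] = p * math.exp(-eps)  # shift log p_i by -eps
        slope = (math.log(abs(mz2(up))) - math.log(abs(mz2(down)))) / (2 * eps)
        slopes.append(slope)
    return max(abs(s) for s in slopes), slopes

delta, slopes = delta_bg(toy_mz2, [101.0, 100.0])
print(f"slopes = {slopes}, Delta_BG = {delta:.3f}")
```

Here the two parameters cancel to one part in a hundred, and the measure returns a value close to 101, the analytic result \(p_1/(p_1-p_2)\).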

Now, this BG formula – built from previous observations by Ellis and others – is more quantitative than the «almost purely aesthetic» appraisals of naturalness and fine-tuning that dominated the literature before that paper. But while this expression may be said to be well-defined, it’s extremely arbitrary. The degree of arbitrariness almost exactly mimics the degree of vagueness that existed before. So even though we have a formula, it’s not real quantitative progress.

When I say that the formula is arbitrary, I have dozens of particular complaints in mind. And there surely exist hundreds of other complaints you could invent.

First, for example, the formula depends on a particular parameterization of the parameter space in terms of \(p_i\). Any coordinate redefinition – a diffeomorphism on the parameter space – should be allowed, shouldn’t it? But a coordinate transformation will yield a different value of \(\Delta\). Why did you parameterize the space in one way and not another way? Even if you banned general diffeomorphisms, there surely exist transformations that are more innocent, right? Like the replacement of \(p_i\) by their linear combinations. Or products and ratios, and so on.
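A two-parameter toy example, chosen only for illustration, shows how strongly the value may depend on the coordinates. Suppose that \(m_Z^2 = p_1 - p_2\) with \(p_1 = 101\) and \(p_2 = 100\) in some units. Then

\[ \frac{\partial \log m_Z^2}{\partial \log p_1} = \frac{p_1}{p_1-p_2} = 101, \quad \frac{\partial \log m_Z^2}{\partial \log p_2} = -\frac{p_2}{p_1-p_2} = -100, \quad \Delta = 101. \]

With the linear combinations \(q_1 = p_1 - p_2\) and \(q_2 = p_1 + p_2\), the same theory has \(m_Z^2 = q_1\), so that

\[ \frac{\partial \log m_Z^2}{\partial \log q_1} = 1, \quad \frac{\partial \log m_Z^2}{\partial \log q_2} = 0, \quad \Delta = 1. \]

The physics is identical but the «degree of fine-tuning» has dropped from 101 to 1.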

Second, and it is related, why are there really the logarithms? Shouldn’t the expression depend on the parameters themselves, if they are small? Shouldn’t one define a natural metric on the parameter space and use this metric to define the naturalness measure?

Third, why are we picking the maximum?

\[ \max_i |K_i| \] may be a fine way to pick a representative value out of many values \(K_i\). But we may also pick

\[ \sum_i |K_i| \] or, perhaps more naturally,

\[ \sqrt{ \sum_i |K_i|^2 }. \]

For a «rough understanding», such changes usually don’t change the picture dramatically. But if you wanted exact bounds, it’s clearly important which of those expressions is picked and why. You would need some *evidence* that favors one formula or another. There is no theoretical evidence and there is no empirical evidence, either.

The bounds defined by the three alternative expressions above may be called a «cube», a «diamond», and a «ball». The corresponding limits on the superpartner masses may have similar shapes. The authors end up claiming things like \(\mu\lt 350GeV\) out of random assumptions of the form \(\Delta\lt 30\) – even though the \(\Delta\) could be replaced by a different formula and \(30\) could be replaced by a different number. Why is their starting point better than the «product»? Why don’t they directly postulate an inequality on \(\mu\)?
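A short sketch shows that the verdict \(\Delta\lt 30\) can flip depending on which of the three expressions is used. The slopes below are invented numbers chosen for illustration, and the function name is hypothetical.

```python
import math

def fine_tuning_measures(slopes):
    """Three ways to summarize the slopes K_i: cube, diamond, and ball."""
    abs_slopes = [abs(k) for k in slopes]
    return {
        "max": max(abs_slopes),                          # «cube»
        "sum": sum(abs_slopes),                          # «diamond»
        "rss": math.sqrt(sum(k * k for k in abs_slopes)) # «ball»
    }

slopes = [25.0, -20.0, -4.0]  # illustrative values only
measures = fine_tuning_measures(slopes)
verdicts = {name: value < 30 for name, value in measures.items()}
print(measures)
print(verdicts)
```

For these numbers the maximum is 25, the sum of absolute values is 49, and the root of the sum of squares is about 32, so the same point is «natural» by the first measure and «unnatural» by the other two.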

Fourth, shouldn’t the formula for the naturalness measure get an explicit dependence on the number of parameters? If your theory has many soft parameters, you may view it as an unnatural theory regardless of the degree of fine-tuning in each parameter because it becomes easier to make things look natural if there are many moving parts that may conspire (or because there are many theories with many parameters which should make you reduce their priors). However, you could also present the arguments going exactly in the opposite direction. When there are many parameters \(p_i\) leading to many slopes \(K_i\), it’s statistically guaranteed that at least one of them will turn out to be large, by chance, right? So perhaps, for a large number of parameters, you should tolerate higher values of \(\Delta\), too.
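The second argument can be made slightly more quantitative under a crude assumption: each of the \(N\) slopes independently exceeds a given \(\Delta\) with probability about \(1/\Delta\). Then

\[ P\left(\max_i |K_i| \gt \Delta\right) = 1 - \left(1-\frac{1}{\Delta}\right)^N \approx \frac{N}{\Delta} \quad \text{for } N \ll \Delta. \]

For example, \(N=20\) and \(\Delta=30\) give \(1-(29/30)^{20}\approx 0.49\), so roughly one half of such models would violate the bound purely by chance. The independence and the \(1/\Delta\) probability are assumptions of this sketch, not established facts.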

One can invent infinitely many arguments and counter-arguments that will elevate or reduce the tolerable values of \(\Delta\) for one class of theory differently than for another class of theories, arguments and counter-arguments that may take not just the number of parameters but *any* qualitative (and quantitative) property of the class of theories into account! The uncertainty and flexibility has virtually no limits, and for a simple reason: we are basically talking about the «right priors» for all hypotheses in physics. Well, quite generally in Bayesian inference, there can’t be any universally «right priors». Priors are unavoidably subjective and fuzzy. Only the posterior or «final» answers or probabilities (close to zero or one) after a sufficient body of relevant scientific evidence is collected may be objective.

There are lots of technical complications like that. And even if you forgot about those and treated the formula \(\Delta_{BG}\) as a canonical gift from the Heaven, which it’s obviously not, what should be the upper bound that you find acceptable? Arguing whether it’s \(\Delta\lt 20\) or \(\Delta \lt 300\) is exactly equivalent to arguments whether 2-sigma or 3-sigma bumps are enough to settle a qualitative question about the validity of a theory. You know, none of them is enough. But even if you ask «which of them is a really strong hint», the answer can’t be sharp. The bound is unavoidably fuzzy.

They also discuss a similar naturalness measure

\[ m_Z^2 = \sum_i K_i m_Z^2, \quad \Delta_{EW}=\max_i |K_i| \]

You write the squared Z-boson mass as the sum of many terms. I wrote them as \(K_i\) times \(m_Z^2\) so that the sum \(\sum_i K_i = 1\), and the degree of naturalness is the greatest absolute value among the values \(|K_i|\). If there is a cancellation of large numbers, the theory is said to be highly EW-fine-tuned.

Again, when you write the squared Z-boson mass according to a particular template, the naturalness measure above becomes completely well-defined. But this well-definedness doesn’t help you at all to answer the question Why. Why is it exactly this formula and not another one? Why are you told to group the terms in one way and not another way?

The grouping of terms is an extremely subtle thing. An important fact about physics is that only the total result for the mass, or a scattering amplitude, is physically meaningful. The way you calculate it – how you organize the calculation or group the contributions – is clearly unphysical. There are many ways and none of them is «more correct» than all others.

At the beginning, I told you that the contributions of the top and the stop to the squared Higgs mass may be huge but when you combine them into the top-stop contribution, this contribution is much smaller than the GUT or Planck scale: it is comparable to the much lower superpartner scale. So the «impression about fine-tuning» clearly depends on how you write the thing which is unphysical.
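The dependence on the grouping is easy to demonstrate with invented numbers. In the sketch below, the three contributions are hypothetical values in units where the total is one, and `delta_ew` is an illustrative helper implementing \(\max_i |K_i|\).

```python
def delta_ew(contributions):
    """Largest |K_i|, where K_i is each contribution divided by the total."""
    total = sum(contributions)
    return max(abs(c / total) for c in contributions)

# Hypothetical contributions to m_Z^2; they sum to 1.
separate = [1000.0, -995.0, -4.0]

# The same physics, with the first two terms combined into one package.
grouped = [separate[0] + separate[1], separate[2]]

print(delta_ew(separate))
print(delta_ew(grouped))
```

The total is the same in both cases, yet the measure drops from 1000 to 5 once the two large terms are combined into one package, which is the same logic as the top-stop package.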

There are lots of numbers in physics that are much smaller than the most naive order-of-magnitude estimates. The cosmological constant is the top example and its tiny size continues to be largely unexplained. The small Higgs boson mass would be unexplained without SUSY etc. But there are many more mundane examples. In those cases, there is no contradiction because we can explain why the numbers are surprisingly small.

The neutron’s decay rate is very low – the lifetime is anomalously long, some 15 minutes, vastly longer than other lifetimes of things decaying by a similar beta-decay. It’s because the phase space of the final 3 products into which the neutron decays is tiny. It’s because the neutron mass is so close to the sum of the proton mass and the electron mass (and the neutrino mass, if you don’t neglect it). The suppression is by a power law.

But take an even more mundane example. The strongest spectral line emitted by the Hydrogen atom. I mean the line between \(n=1\) and \(n=2\). Its energy is \(13.6eV(1-1/4)\), some ten electronvolts. You could say that it is insanely fine-tuned because it’s obtained as the difference between two energies/masses of the Hydrogen atom in two different states. The hydrogen atom’s mass is almost \(1GeV\), mostly composed of the proton mass.

Why does the photon energy end up being \(10eV\), 100 million times lower than the rest energy of the Hydrogen atom? Well, we have lots of mundane explanations. First, we know why the two masses are almost the same because the proton is doing almost nothing during the transition. (This argument is totally analogous to the aforementioned claim in SUSY that the top and the stop may be assumed to be similar.) The complicated motion in the Hydrogen atom is only due to a part of the atom, the electron, whose rest mass is just \(511keV\), almost 2,000 times lighter than the proton. This rest mass of the electron is still 50,000 times larger than the energy of the photon. Why?

Well, it’s because the binding energy of the electron in the hydrogen atom is comparable to the kinetic energy. And the kinetic energy is much lower than the rest mass because the speed of the electron is much smaller than the speed of light. It’s basically the fine-structure constant times the speed of light, as one may derive easily.
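The arithmetic behind these statements follows from the standard Bohr formula for the levels of the hydrogen atom:

\[ E_n = -\frac{\alpha^2 m_e c^2}{2n^2}, \quad E_2 - E_1 = \frac{3}{8}\alpha^2 m_e c^2 \approx \frac{3}{8}\cdot \frac{511\,{\rm keV}}{(137.036)^2} \approx 10.2\,{\rm eV}. \]

The photon energy is suppressed relative to the electron rest energy by the factor \(3\alpha^2/8 \approx 2\times 10^{-5}\), which is the factor of 50,000 mentioned above, because the typical speed of the electron is \(v\approx \alpha c\).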

Now, why is the fine-structure constant, the dimensionless strength of electromagnetism, so small? It is \(1/137.036\) or so. Well, it is just what it is. We may find excuses as well as formulae deriving the value from constants considered more fundamental these days. First, one could argue that \(4\pi / 137\) and not \(1/137\) is the more natural constant to consider. So a part of the smallness of \(1/137\) is because it implicitly contains some factor of \(1/4\pi\), about one twelfth, you could be more careful about.

The remaining constant may be derived from the electroweak theory. The fine-structure constant ends up smaller than you could expect because 1) its smallness depends on the smallness of two electroweak coupling constants, 2) it’s mostly a \(U(1)\) coupling and such couplings are getting weaker at lower energies. So a decent value of the coupling at the GUT scale simply produces a rather small value of the fine-structure constant at low energies (below the electron mass).

We don’t say that the fine-structure constant is unnaturally small because the GUT-like theories or, which is better, stringy vacua that we have in mind that may produce electromagnetism including predictions of parameters may produce values like \(1/137\) easily. But before we knew these calculations, we could have considered the smallness of the fine-structure constant to be a fine-tuning problem.

My broader point is that there are ways to explain the surprise away. More objectively, we can derive the energy of the photon emitted by the Hydrogen atom from a more complete theory, the Standard Model or a GUT theory, and a part of the surprise about the smallness of the photon energy goes away. We would still need some explanation why the electron Yukawa coupling is so tiny and why the electron mass ends up being well below the proton mass, and lots of other things. But there will always be a part of the explanation (of the low photon energy) of the kind «bound states where objects move much slower than the speed of light produce small changes of the energy in the spectrum», and similar things. And there will be pieces of wisdom such as «it’s normal to get bound states with low speeds, relative to the speed of light, because couplings often want to be orders of magnitude lower than the maximum values».

The attempts to sell naturalness as some strict, sharp, and objective law are nothing else than the denial of similar explanations in physics – in the future physics but maybe even in the past and established physics. *Every* explanation like that – a deeper theory from which we derive the current approximate theories; but even a method to organize the concepts and find previously overlooked patterns – changes the game. It changes our perception of what is natural. To say that one already has the right and final formulae deciding on how much a theory is natural – or right – is equivalent to saying that we won’t learn anything important in physics in the future, and I think that it’s obviously ludicrous.

The naturalness reasoning is a special example of Bayesian inference applied to probability distributions on the parameter spaces. So we need to emphasize that the conclusions depend on the Bayesian probability distributions. But a defining feature of Bayesian probabilities is that they change – or should be changed, by Bayes’ theorem – whenever we get new evidence. It follows that in the future, after new papers, our perception of naturalness of one model or another will unavoidably change and attempts to codify a «current» formula for the naturalness are attempts to sell self-evidently incomplete knowledge as the complete one. More complete theories will tell us more about the values of parameters in the current approximate theories – and they will be able to say whether our probability distributions on the parameter spaces were successful bookmaker’s guesses. The answer may be Yes, No, or something in between. In some cases, the guess will be right. In others, it will be wrong but it will look like the bookmaker’s bad luck. But there will also be cases in which the bookmakers will be seen to have missed something – making it obvious that in general, the bookmakers’ odds are something else than the actual results of the matches! It’s the true results of the matches, and not some bookmakers’ guesses at some point, that define the truth that physics wants to find.

In the end, I believe that physicists such as the authors of the paper I criticized above are motivated by some kind of «falsifiability wishful thinking». They would like physics to become an Olympic discipline where you may organize a straightforward race and you may declare the winners and losers. Pay $10 billion for the LHC and it will tell you whether SUSY is relevant for the weak scale physics. But physics is not an Olympic discipline. Physics is the search for the laws of Nature. It is the search for the truth. And a part of the truth is that there are no extremely simple and universal solutions to problems or methods to answer difficult questions.

If a model can describe the observed physics – plus some bumps – with \(\Delta=10\) instead of \(\Delta=1,000\) of its similar competitor, I may prefer the former even though the value of \(\Delta\) obviously won’t be the only thing that matters.

But when you compare two extremely different theories or classes of theories – and supersymmetric vs non-supersymmetric models are a rather extreme example – it becomes virtually impossible to define a «calibration» of the naturalness measure that would be good for both different beasts. The more qualitatively the two theories or classes of theories differ, the more different their prior probabilities may be, and the larger the possible multiplicative factor that you have to apply to \(\Delta\) of one theory to make the two values of \(\Delta\) comparable.

And suggesting that people should embrace things like \(\Delta\lt 30\) with some particular definition of \(\Delta\) is just utterly ludicrous. It’s a completely arbitrary bureaucratic restriction that someone extracted out of thin air. No scientist can take it seriously because there is zero evidence that there should be something right about such a particular choice.

If the LHC doesn’t find supersymmetry during the \(13TeV\) run or the \(14TeV\) run, it won’t mean that SUSY can’t be hiding around the corner that would be accessible by a \(42TeV\) or \(100TeV\) collider. It’s spectacularly obvious that no trustworthy argument that would imply such a thing may exist. If nothing changes qualitatively in the available theoretical paradigms, I would even say that most of the sensible phenomenologists will keep on saying that SUSY is most likely around the corner, despite the fact that similar predictions will have failed.

At least the phenomenologists who tend to pay attention to naturalness will say so. By my assumptions, SUSY will remain the most natural explanation of the weak scale on the market. At the same moment, the naturalness-avoiding research – I also mean the anthropic considerations but not only anthropic considerations – will or should strengthen. But up to abrupt flukes, all these developments will be gradual. It can’t be otherwise. When experiments aren’t creating radical, game-changing new data, there can’t be any game-changing abrupt shift in the theoretical opinions, either.