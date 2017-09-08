© Tim Hayward

Using probability calculus to evaluate evidence for alternative hypotheses, including deception operations

For two alternative hypotheses, H1 and H2, the evidence favouring H1 over H2 is evaluated by comparing how well H1 would have predicted the observations with how well H2 would have predicted the observations.

We cannot evaluate the evidence for or against a single hypothesis, only the evidence favouring one hypothesis over another.

The evidence favouring one hypothesis over another can be calculated without having to specify your prior degree of belief in which of these two hypotheses is correct. Two people may have different priors, but their calculations of the strength of evidence favouring one hypothesis over another should agree if they agree on what they would expect to observe if each of these hypotheses were true.

Your prior odds encode your degree of belief favouring H1 over H2, before you have seen the observations. Priors are subjective: one person may assign prior odds of 100 to 1 favouring H1 over H2, while another may believe that both hypotheses are equally probable.

The likelihood of a hypothesis is the conditional probability of the observations given that hypothesis. To evaluate it, we have to envisage what would be expected to happen if the hypothesis were true. We can think of the likelihood as measuring how well the hypothesis can predict the observation.

Likelihoods of hypotheses measure the relative support for those hypotheses; they are not the probabilities of those hypotheses.

The ratio of the likelihood of H1 to the likelihood of H2 is called the Bayes factor or simply the likelihood ratio. In recognition of his mentor, Good called it the "Bayes-Turing factor".

It is only through the likelihood ratio that your prior odds are modified by evidence to posterior odds. All the evidence on whether the observations support H1 or H2 is contained in the likelihood ratio: this is the likelihood principle.
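The relationship between prior odds, likelihood ratio and posterior odds can be written compactly as Bayes' rule in odds form. In this sketch, E stands for the observations; that symbol is introduced here for illustration and is not used elsewhere in the text.

```latex
\underbrace{\frac{P(H_1 \mid E)}{P(H_2 \mid E)}}_{\text{posterior odds}}
  \;=\;
\underbrace{\frac{P(E \mid H_1)}{P(E \mid H_2)}}_{\text{likelihood ratio}}
  \;\times\;
\underbrace{\frac{P(H_1)}{P(H_2)}}_{\text{prior odds}}
```

Two people who disagree about the prior odds but agree on the two conditional probabilities on the right will multiply their priors by the same factor.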

Examples

You have two alternative hypotheses about a coin that is to be tossed: H1 that the coin is fair, and H2 that the coin is two-headed. In most situations your prior belief would be that H1 is far more probable than H2. Given the observation that the coin comes up heads when tossed once, the likelihood of a fair coin is 0.5 and the likelihood of a two-headed coin is 1. The likelihood ratio favouring a two-headed coin over a fair coin is 2. This won't change your prior odds much. If, after the first ten tosses, the coin has come up heads every time, the likelihood ratio is 2^10 = 1024, perhaps enough for you to suspect that someone has got hold of a two-headed coin.

Hypothesis H1 is that all crows are black (as in eastern Scotland), and hypothesis H2 is that only 1 in 8 crows are black (as in Ireland where most crows are grey). The first crow you observe is black. Given this single observation, the likelihood of H1 is 1, and the likelihood of H2 is 1/8. The likelihood ratio favouring H1 over H2 is 8. So if your prior odds were 2 to 1 in favour of H1, your posterior odds, after this first observation, will be 16 to 1. This posterior will be your prior when you next observe a crow. If this next crow is also black, the likelihood ratio contributed by this observation is again 8, and your posterior odds favouring H1 over H2 will be updated to 16 × 8 = 128 to 1.
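The coin and crow calculations can be reproduced in a few lines. This is a minimal sketch: the function name `posterior_odds` and the variable names are illustrative choices, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction


def posterior_odds(prior_odds, likelihood_h1, likelihood_h2):
    """Update the odds favouring H1 over H2 after one observation."""
    return prior_odds * Fraction(likelihood_h1) / Fraction(likelihood_h2)


# Coin: ratio favouring "two-headed" (likelihood 1) over "fair" (likelihood 1/2)
coin_ratio_one_toss = Fraction(1) / Fraction(1, 2)
# Ten independent heads: the per-toss ratios multiply
coin_ratio_ten_tosses = coin_ratio_one_toss ** 10

# Crows: prior odds 2 to 1 for H1 (all black) over H2 (1 in 8 black),
# updated after each of two black crows; the posterior becomes the next prior
odds = Fraction(2)
for _ in range(2):
    odds = posterior_odds(odds, Fraction(1), Fraction(1, 8))
    print(odds)
```

The loop makes the sequential nature of the update explicit: each observation contributes its own likelihood ratio, and the order of the observations does not matter.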

Hempel's paradox

Rockets used in the alleged chemical attack in Ghouta in 2013: evidence for or against Syrian government responsibility?

H1 states that a chemical attack was carried out by the Syrian military, under orders from President Assad. The proponents of this hypothesis include the US, UK and French governments.

H2 states that a false-flag chemical attack was carried out by the Syrian opposition, with the objective of bringing about a US-led attack on the Syrian armed forces. A leading proponent of this hypothesis was the blogger "sasa wawa", who set up a crowd-sourced investigation of the Ghouta incident. The evidence generated during this investigation was later set out in the framework of probability calculus by the Rootclaim project, founded by Saar Wilf, an Israeli entrepreneur (and noted international poker player) with a background in the signals intelligence agency Unit 8200. I think we can tentatively identify "sasa wawa", who seemed "to have unlimited time and energy and to be some sort of polymath", as Wilf.

we would have expected the opposition to use any munition available to them that would implicate the Syrian army

Comparison with Rootclaim's evaluation of the weight of evidence contributed by the rockets

Evidence contributed by the non-occurrence of an expected event

H1: a chemical attack was carried out by the Syrian military, authorized by the government

H2: a false-flag chemical attack was carried out by the Syrian opposition to implicate the government

H3: an unauthorized chemical attack was carried out by a rogue element in the Syrian military

H4: there was no chemical attack but a managed massacre of captives, with rockets and sarin used to create a trail of forensic evidence that would implicate the Syrian government in a chemical attack.

Paul McKeigue:

Fake news, false flags and the weight of evidence favouring alternative explanations of alleged chemical attacks in Syria

you cannot evaluate the evidence for or against a single hypothesis, only the weight of evidence favouring one hypothesis over an alternative

the weight of evidence favouring one hypothesis over another is based on comparing how well each of the two hypotheses would have predicted the observations

your assessment of how well a hypothesis would have predicted the observations does not in general depend on your prior degree of belief that this hypothesis is true

How well a hypothesis would have predicted the observations is quantified by a number called the likelihood. This is calculated as the probability of the observations given that hypothesis. When the observations are fixed and we are comparing different hypotheses, we reverse this dependency and describe this number as "the likelihood of the hypothesis given the observations". If you find this confusing, you're not the only one. Likelihoods are not probabilities when they are used to compare hypotheses. "Support" would be a better word than "likelihood" (which in ordinary English is synonymous with probability).

The weight of evidence favouring one hypothesis over another is the logarithm of the ratio of the likelihoods. Weights of evidence can be added over independent observations. It's convenient to use logarithms to base 2, so that the weights are expressed in bits.
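A short sketch of this definition, using a hypothetical coin example (two-headed versus fair) chosen only for illustration. The function name `weight_of_evidence_bits` is an assumption of this sketch.

```python
import math


def weight_of_evidence_bits(likelihood_a, likelihood_b):
    """Weight of evidence in bits favouring hypothesis A over hypothesis B."""
    return math.log2(likelihood_a / likelihood_b)


# One head: a two-headed coin predicts it with probability 1, a fair coin with 0.5
one_toss = weight_of_evidence_bits(1.0, 0.5)

# Ten independent heads: add the weight contributed by each toss...
ten_tosses_added = sum(weight_of_evidence_bits(1.0, 0.5) for _ in range(10))

# ...which matches the log of the ratio of the joint likelihoods
ten_tosses_direct = weight_of_evidence_bits(1.0, 0.5 ** 10)

print(one_toss, ten_tosses_added, ten_tosses_direct)
```

Working in logarithms turns the multiplication of likelihood ratios into addition, which is why weights from independent observations can simply be summed.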

If you make an assertion about the strength of evidence favouring one hypothesis over another, you are making an assertion about the conditional probabilities from which the ratio of likelihoods is calculated. These conditional probabilities ("expectations" would be a better word than "probabilities") are based on subjective judgements. You can't evaluate evidence without making these subjective judgements.

relevant observations have been widely ignored (as we shall see in this section)

observations consistent with a hypothesis have been accepted as evidence supporting that hypothesis, without considering alternative hypotheses. An example is how the observation of Volcano rockets in the Ghouta incident was accepted as supporting the hypothesis of a regime attack, though the hypothesis of a "false flag" attack would have predicted this observation at least as well.

the evaluation of evidence favouring one hypothesis over another has been confused with assertions of prior belief about the plausibility of one of those hypotheses.

Weight of evidence for alternative hypotheses about the alleged chemical attack in Ghouta in 2013

the observation that all victims were in day clothes though the alleged attack occurred at about 2 am

the obviously fraudulent videos of the "Zamalka Ghost House" in which videos of a group of adults and children apparently executed several days before the alleged chemical attack and placed in an unfinished building were presented as a family of victims found in situ.

The Kafr Batna morgue images

Evidence for alternative explanations of Khan Sheikhoun

H1: the Khan Sheikhoun incident was a chemical attack by the Syrian air force using sarin. The leading proponents of this hypothesis are the US, UK and French governments.

H2: the Khan Sheikhoun incident was a planned deception operation intended to bring about US military intervention, in which captives were killed in gas chambers, small quantities of sarin were used to generate a forensic trail and a large-scale media operation was undertaken to support the story of a chemical attack by the Syrian air force. The earliest proponents of this hypothesis were a group of contributors to the wiki A Closer Look on Syria. Under this hypothesis, Khan Sheikhoun is Ghouta version 2, and it is to be expected that a similar trail of evidence will be laid: purported eyewitnesses will describe the attack, videos will show victims purportedly being treated and bodies laid out in morgues, at least one alleged impact site will be shown with the remains of a munition, and both environmental and physiological samples will test positive for sarin.

Captives (most likely religious minorities or families of government supporters) would be held in readiness. Improvised explosive devices and possibly smoke generators could be placed at key locations in the town to panic the civilian population into believing they were under chemical attack. Low doses of sarin could be administered to volunteers so that they would test positive for exposure to sarin (the doses required to generate a positive test are far below those required to cause symptoms). Medical facilities controlled by jihadis would be ready to play their part by showing casualties, real or fake, being "treated". A few actors could be prepared to play the part of bereaved parents, and provided with photos of children who were to be killed. Captives would be killed in improvised gas chambers, but the preferred agent would be an easily-available gas that leaves no residue, rather than sarin which would endanger those removing the bodies. A well-staffed video editing operation would be ready to edit the raw footage into clips and stills badged with the logos of various opposition media organizations. To make the video images so horrific that those viewing them would be shocked into supporting immediate retaliation against the Syrian government, the planners might choose that some children would not be killed outright by the gas but instead filmed struggling to breathe, before they were finished off by other methods.

if we were presented with convincing and hard-to-fake evidence that the victims seen dead in the images had lived in the locality from which they were supposedly rescued

if interviews with bereaved survivors included convincing and hard-to-fake evidence that the dead victims were their relatives, including family photos showing them with these victims. These family photos should include adult victims, who unlike young children cannot easily be induced to pose in a familiar setting with their captors.

if videos showed the search and rescue operations in which these victims' bodies were recovered: these operations would be hard to stage on a large scale without the cooperation of civilians.

if a chemical signature match between the environmental sarin samples and Syrian military stocks were reported by scientists prepared to put their names on a report that was detailed enough to be subjected to independent peer review.

if blood tests on purported survivors of the chemical attack showed exposure to sarin at levels high enough to have caused severe and life-threatening poisoning. Modern tests for sarin exposure can detect exposure at levels far lower than those required to cause symptoms. It would be easy for actors to expose themselves to low doses of sarin, but not so easy for them to expose themselves at levels high enough to cause severe symptoms.

if the locations of victims and alleged air strikes were not consistent with records of flight tracks or with wind directions. Under H2, locations of improvised explosive devices would have to be planned in advance, without knowing where a jet would fly or which way the wind would be blowing.

if the uploaded videos contained evidence that scenes were staged or that the victims were captives. Under H2, a weak point in the operation is that dozens of video clips and still images that are meant to show rescue workers dealing with large numbers of victims have to be recorded, edited and uploaded in a few hours, and the editing may fail to remove incriminating material. When all available images are arranged in temporal sequence, using sun angles and other clues to time the images, and the identities of victims are matched in different clips, a different story may be revealed, as in Kafr Batna.

Weights of evidence contributed by observations

| Observation | Prob (obs given H1) | Prob (obs given H2) | Likelihood ratio H2 / H1 | Weight of evidence (bits) favouring H2 over H1 |
|---|---|---|---|---|
| An individual claiming to be a bereaved survivor was made available for interview, with photos showing him with two children later seen as victims. The lack of photos of his wife was attributed to loss of the family photo album in an airstrike on the family home. | 0.002 | 0.04 | 20 | 4.3 |
| There are no videos of victims being rescued in their homes, or bodies being recovered | 0.05 | 0.8 | 16 | 4 |
| The flight track of the Syrian jet shown by the Pentagon (single east-west pass just south of the town) is incompatible with the track of the three explosions (north-south axis over the northern part of town) and the alleged impact site of the chemical munition | 0.01 | 0.8 | 80 | 6.3 |
| The alleged impact site of the chemical munition is upwind of where the casualties were reported (by the rebels) to have occurred. | 0.02 | 0.5 | 25 | 4.6 |
| In the images released by the rebels, several of the children who are seen dead have head and neck injuries. Reconstruction of sequences and matching of identities shows that in two of these children the head injuries were received after they had been supposedly rescued by the White Helmets | 0.01 | 0.2 | 20 | 4.3 |
| Total | | | | 23.5 |
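The likelihood ratios and weights in the table can be recomputed from the two probability columns. The sketch below assumes the five observations are independent, as required for weights to be added; the short row labels are shorthand introduced here. Because the table rounds each weight to one decimal place, a total computed from unrounded values can differ slightly from the sum of the rounded entries.

```python
import math

# (shorthand label, P(obs | H1), P(obs | H2)) taken from the table
rows = [
    ("bereaved survivor, photo album lost", 0.002, 0.04),
    ("no search and rescue videos", 0.05, 0.8),
    ("flight track vs explosions", 0.01, 0.8),
    ("impact site upwind of casualties", 0.02, 0.5),
    ("head injuries after rescue", 0.01, 0.2),
]

total_bits = 0.0
for label, p_h1, p_h2 in rows:
    ratio = p_h2 / p_h1          # likelihood ratio favouring H2 over H1
    bits = math.log2(ratio)      # weight of evidence in bits
    total_bits += bits
    print(f"{label}: ratio {ratio:.0f}, weight {bits:.1f} bits")

print(f"total: {total_bits:.1f} bits")
```

Anyone who disagrees with an assigned probability can change the corresponding entry in `rows` and see how much the total weight of evidence moves.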

Notes on assignment of likelihoods

In Khan Sheikhoun at least two individuals claiming to be bereaved survivors were interviewed. Most of the interviews were given by Abdelhamid al-Yousef (AHY), who appears to have been serving in the opposition forces as a sniper. AHY reported that his wife and nine-month-old twins had been killed in the chemical attack, and produced photos showing him with two children about this age who were among the dead victims. No photos showing AHY with the mother of these children were produced: an interviewer reported that "he does not even have any photos of his beloved wife of two years left to console him, as they were all destroyed in the attack that ripped through his hometown." and quoted him as saying "In my house all the photos I had of my wife and everything I owned was burnt."

Under H1, it is expected that at least one bereaved survivor would be available for interview. However, the probability is rather low that the witness's home would have been destroyed in an air strike at the same time as the alleged chemical attack, given that only three explosions were documented as occurring in Khan Sheikhoun at this time. These explosions were geolocated by smoke plumes, satellite images and ground-based images. The explosions appear to have been relatively small, each destroying only a single house. If, as alleged, these explosions were caused by bombs dropped by an aircraft in a single pass over the northern half of town, we can estimate the area at risk as about 30 hectares, and that about 1500 homes were at risk (based on a typical urban density of 50 homes/hectare). The probability that the witness's home would have been one of the buildings destroyed by these three explosions is therefore about 1 in 500.

Under hypothesis H2 that Khan Sheikhoun was version two of Ghouta, there is a moderate probability that at least one actor would have been prepared to play the part of a bereaved survivor, and would have posed for photographs with captive children. I'll assign a probability of 0.2 to this. The problem for such an actor would be to explain the lack of photographs showing him with the adult victims from the same family. It is much easier to get young children to play happily with an adult who befriends them than it is to induce adults to pose for a family photograph with their captors. Of the possible explanations that such an actor might choose to give, one of the most likely (to emphasize the brutality of the regime) is that the family home was destroyed in an airstrike. I'll assign a probability of 0.2 that this explanation would be produced. Multiplying the conditional probability under H2 that an actor with photos showing him with the children would be made available for interview by the probability that this actor would invoke destruction of the family photo album in an airstrike to explain the lack of photos showing him with the mother, we get a likelihood of 0.04.

The likelihood ratio favouring H2 over H1 is 20. Note that this assessment of likelihoods does not make any assessment of whether AHY is telling the truth or lying. We have shown that under H1, it is a rather improbable coincidence that one of the few homes destroyed by three apparently untargeted bombs dropped on a town of at least 20,000 people would be that of the sole survivor of a large extended family killed in a chemical attack at the same time. We also assess that under H2, it is quite probable that an actor playing the part of a bereaved survivor would report the destruction of his home in an airstrike as an explanation for why no family photos showing him with adult victims were available. Computing the ratio of these two likelihoods allows us to make a statement about the strength of the evidence contributed by this observation.

In all the videos and images released by the White Helmets and other opposition media organizations from Khan Sheikhoun, there are no images of urban search and rescue operations. Under H1, we'd expect to see videos of the White Helmets carrying out a search and rescue operation covering the neighbourhood allegedly affected by the chemical attack. The White Helmets are trained in urban search and rescue procedures and are famous for documenting their operations on video. The absence of such videos has low probability (conservatively assessed at 0.05) under H1, but high probability (0.8) under H2 as it would be difficult to stage such scenes without involving large numbers of civilians.

The flight track of the Syrian jet shown at the Pentagon's press conference shows only a single east-west pass just south of the town, passing no closer than 2 km from the crater that was the alleged impact site of the chemical munition. The three high explosive detonations, mapped by OPCW based on witness reports, and by others based on geolocation of smoke plumes and images (satellite and ground-based) of explosive damage, are in the northern half of town in a north-south line. From the scatter of the points that were plotted on the Pentagon's map, we can estimate the accuracy of the flight track (presumably based on airborne radar). By inspection of other east-west passes on this map, I estimate that the standard deviation of the errors in a north-south direction is less than 1 km. For the jet to have passed over the alleged impact site, at least two data points would have had to have been plotted too far south by at least two standard deviations: the probability of this is about 1 in 1000. Even more unexpected under H1 is that the flight track does not show the north-south pass that would have been required to drop three bombs corresponding to the three documented high-explosive detonations. As the Pentagon's map appears to include at least one false-positive data point (an outlying data point southwest of Homs city that does not appear to be part of a flight track), it is reasonable to allocate a small but nonzero probability to false-negative results: specifically a failure to detect a north-south pass. To be conservative, I'll assign a value of 0.01 to the probability under H1 that the Pentagon's map of the track of the Syrian jet would match neither the position nor the alignment of the reported impact sites. Under H2, the explosions were generated by IEDs, and the arrival of the jet was the cue to set off these explosions. The probability that the pre-planned line-up of IEDs would not match the flight path of the jet is high: I assign a probability of 0.8 to this.

The videos of the smoke plumes from the three high explosive detonations, recorded by opposition cameramen and said to have occurred just before the alleged chemical attack, show that the wind was blowing steadily from southwest to northeast. The OPCW's map of the area in which casualties allegedly occurred, based on reports from eyewitnesses, shows that this area is southwest, i.e. upwind, of the alleged impact crater. Under H1, this is difficult to explain: we have to postulate some unusual local reversal of wind direction at ground level. I assign a probability of 0.02 to this. Under H2, in which the locations from which casualties were to be reported and the location of the impact crater were planned in advance, the probabilities that the specified casualty location would be upwind or downwind of the impact crater are about equal, so the probability of an upwind location is about 0.5.

The images of the victims are so horrific that most of us find it difficult to look at them further. Detailed frame-by-frame analyses of the many video clips and still images can take many months. A few citizen journalists in different countries, sharing their work for peer review, have made some progress with this. Careful examination of the videos and still images, using sun angles to time them, has allowed them to be ordered in temporal sequence and the identities of the same individuals to be matched in different videos. Several of the children seen dead in improvised morgues have obvious and recent head injuries. In at least two of these children, it is possible to establish that these head injuries were received after they had been "rescued" by the White Helmets.

Under H1, the probability that at least two victims would receive traumatic injuries after they had been rescued is very low. The most plausible explanation under H1 is a traffic accident while they were being transported in an ambulance or a pickup truck. A rough estimate for the rate of serious injuries from road traffic accidents in a low-income country like Syria in wartime is 1 per million vehicle-kilometres. Allowing for a tenfold higher rate per vehicle-km in vehicles used as emergency ambulances, and a total distance of 200 vehicle-km travelled by vehicles transporting casualties in the Khan Sheikhoun incident, the probability of an accident causing injuries to some of these casualties is about 0.002. Note that this is the risk of a single accident that is assumed to account for all injuries received after rescue; if the injured children did not travel in the same ambulance, we have to postulate multiple accidents, for which the probability is far lower. Again to be conservative I'll assign a conditional probability of 0.01 to these injuries occurring by accident under H1.

Under H2, it is probable that some victims would survive the gas, either by accident or by design (if the plan was to film some children while still alive for maximal emotional impact). These victims would have to be finished off with physical violence, and the probability is high that this would include blows to the head or neck. The probability that editing of the videos would fail to remove the incriminating sequence of images is also moderately high, given the large number of videos that had to be edited and uploaded over a few hours. I assign a probability of 0.2 to this observation given H2.
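The arithmetic behind three of the figures in these notes can be checked directly. This is a sketch only: the variable names are illustrative, and every input is one of the subjective estimates stated above rather than a measured quantity.

```python
# Witness's home destroyed under H1: three explosions, each destroying one house,
# in an area of about 30 hectares at about 50 homes per hectare
homes_at_risk = 30 * 50
p_home_destroyed = 3 / homes_at_risk

# Actor scenario under H2: probability that an actor with photos is presented,
# times probability that he cites an airstrike to explain the missing photos
p_actor_with_photos = 0.2
p_airstrike_explanation = 0.2
p_obs_given_h2 = p_actor_with_photos * p_airstrike_explanation

# Likelihood ratio favouring H2 over H1 for this observation
ratio_survivor = p_obs_given_h2 / p_home_destroyed

# Traffic accident under H1: 1 serious injury per million vehicle-km,
# tenfold higher for emergency vehicles, over 200 vehicle-km
p_accident = 1e-6 * 10 * 200

print(p_home_destroyed, p_obs_given_h2, ratio_survivor, p_accident)
```

Laying the inputs out this way makes it easy to see which assumption drives each likelihood, and to substitute a different estimate where a reader disagrees.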

This might be described as a mountain of evidence

By fitting smoothed curves to the points shown on the Pentagon's map of the flight track, it should be possible to make a better estimate of the probability distribution of the errors in the data points that make up the flight track.

Someone with meteorological expertise may be able to assign a more realistic probability of a local reversal of wind direction at ground level.

Further analysis of the videos may establish whether a single traffic accident to an ambulance can account for all children who were injured after they had been rescued.

Conclusion