The More Democratic, And The More Human Rights, The Less Terrorism

January 9, 2009

[First published March 15, 2006] So far, there is considerable empirical support for the argument that promoting global freedom, if successful, will make the world generally more peaceful, and possibly end war and democide. However, there has been little empirical work that bears specifically on terrorism in the context of the democratic peace. So, I will do that here.

A relevant scale for doing this is the Purdue Political Terror Scale (PTS) shown below. It attempts to measure the degree to which governments terrorize their citizens and deprive them of human rights.

Mark Gibney and Mathew Dalton developed the Political Terror Scale. An article on it, plus “Political Terror Scale Notes” and the actual yearly scores for all nations, 1980-2004, are available on Gibney’s personal website. He is Belk Distinguished Professor and Professor of Political Science at the University of North Carolina Asheville.

I know, I know, this is not the terrorism that is focused on today, which is that of small groups of terrorists, their murder and genocide bombing, and their insurrections. But, behind it all are level 4 or 5 PTS states, such as Iran, Syria, and North Korea, as were Fatah’s Palestinian Authority, Saddam’s Iraq, and the Taliban’s Afghanistan. Democratize these states and individual and group terrorism will dry up for want of resources and bases, at least as implied by the Forward Strategy of Freedom. But, we’ll see.

The PTS scale values for all nations were coded from the annual Amnesty International (AI) and United States State Department (State) Country Reports on Human Rights. Because of these sources, Oona A. Hathaway and Daniel E. Ho used the PTS scale for “Characterizing Measurement Error in Human Rights” (CME, in PDF). They say:

We illustrate a method for accounting for measurement error in human rights studies — an area of research plagued by difficulties of measuring concepts that cannot be directly observed. We focus on the widely used Purdue Political Terror Scales (PTS), which quantify political terror experienced in a country based on independent qualitative narrative reports compiled by the United States Department of State and Amnesty International. A simple Bayesian measurement model systematically incorporates these two independent codings and directly models the uncertainty of a latent measure of political terror. This reveals that attenuation bias due to lagged PTS estimates can be severe, leading conventional estimates to be conservatively biased by an absolute order of roughly two. Substantively, this means that explanatory variables such as democracy may have roughly twice the impact on human rights as currently believed. We conclude that measurement methods illustrated here hold much promise for addressing concerns about measurement error in empirical scholarship. [Bold italics added]

As to the two politically antagonistic sources, State and AI, the above CME report finds that the correlation between them across all the countries in their report is .83, which means that 31 percent of the variation in their human rights reporting is not shared between the two sources (1 minus the correlation squared, or 1 - .69 = .31).

CME shows the variation of these two sources in the chart below:

Now, my empirical question is this: How well does the degree of liberal democracy of a nation predict its scale level on the PTS, which is to say, terrorism and lack of human rights? I took the PTS values for 2004 and the Freedom House freedom ratings for the same year on both civil liberties and political rights, where the lower the average rating on both, the more liberal democratic a nation. Then, I did a bivariate regression, and found that the degree of freedom predicted 32% of the variation in terror/human rights (R squared = .32, a very conservative finding, given the CME conclusion about democracy and human rights given above). That is, the more liberal democratic a nation, the less its government terror and the more its government respects human rights [PTS = 1.51-27(Freedom rating); signs on both scales reversed].

Since the relationship may not be linear, I should note that the analysis of variance is very good (F-stat = 81.6, p < .0001). In my next post, I’m going to explain what this sometimes mysterious “p” that appears in so many quantitative studies means, and its pitfalls. Just take my word today that “p” here is not a sampling probability (it cannot be since I am dealing with all countries and not a sample in any meaningful statistical sense), but a combinatorial one.
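As a rough arithmetic check on the figures above: in a regression with a single predictor, the F-statistic and R squared are tied together by F = R²(N - 2)/(1 - R²). The Python sketch below solves this for N using the reported R squared = .32 and F = 81.6. The function name is illustrative only, and because both inputs are rounded the result is approximate, not a count taken from the data.

```python
def implied_sample_size(r_squared: float, f_stat: float) -> float:
    """Solve F = R^2 * (N - 2) / (1 - R^2) for N (one-predictor regression)."""
    return f_stat * (1.0 - r_squared) / r_squared + 2.0


# Reported values from the post; both are rounded, so N is only approximate.
n = implied_sample_size(r_squared=0.32, f_stat=81.6)
print(f"Implied number of nations in the regression: about {n:.0f}")
```

The answer of roughly 175 is in line with an analysis of nearly all nations rather than a small sample.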

Anyway, my plot of the two is shown below, where -HR is the reversed PTS, and the X-axis is the reversed freedom ratings. For -HR, 1 marks the most terroristic nations with the least human rights, and 5 reverses this; for the -FREE average ratings, 1 is the least free, 7 the most. Thus, as one moves to the right on the X-axis and up on the -HR (PTS) Y-axis, the greater the freedom, and the less the terror and the more respect for human rights.

Obviously, there is considerable variation around a trend of decreasing terror/increasing human rights as freedom increases. To see this, I averaged the PTS scores for each freedom rating. I show the result in the plot below, where the axes are the same as above (sorry, the X-axis label is cut off).

The bottom line should be clear. To eliminate the terrorism of governments against their people and guarantee their human rights, foster democratic freedom. And this is now the American foreign policy, which, judging by all the empirical analyses that support it, is one of realistic idealism.

On That Mysterious “p” In Quantitative Reports

January 9, 2009

[First published March 15, 2006] Everyone who does a lot of reading of reports, studies, and the like is bound to run across “p < .05” or some other fraction, such as “p <.0005”. Or, instead, they will read that, “the results are significant,” or “not significant.” What is going on here? I will try to explain this, nontechnically, without all the details beloved of the statistician (no Type I and Type II error, no two-sided test versus one-sided, no normal distribution, no equations, etc.), and even oversimplify for the purpose of clarity.

To begin, p stands for “probability.” Thus, p < .05 means that the probability is less than .05. For example, if there are 100 balls in a basket and 4 of them are red, the probability of blindly selecting a red ball is .04, so p < .05: less than 1 chance in 20 tries, or fewer than 5 in 100 tries. But, this understanding by itself can be misleading if a sample of some sort was analyzed.

For example, assume that a randomly selected sample of some sort has been analyzed, say of 100 American college students, in order to determine for the population (universe) of all American college students the correlation between getting drunk at least once a month and grades. Let us say the correlation between such drunkenness and grades is .17, p < .05. How to interpret this? Not as a straightforward probability of getting .17.

Rather, the idea is that one has implicitly tested what is called the null hypothesis that the true correlation for all American students is r = 0, which means that hypothetically there is no correlation between getting drunk at least once a month and grades. The p < .05 then means that if the null hypothesis were true, a sample correlation as large as .17 would turn up by chance less than 1 time in 20. So, if one rejects the null hypothesis and accepts that there is a correlation in the population of students, the risk of having been misled by chance alone is small. Although in research on samples, the null hypothesis is usually not stated, it is there nonetheless (some classes in statistics require students to always state the null hypothesis). Regardless of whether the statistic being applied is a t-test, F-ratio, chi-square, or some other, the implicit assumption usually is that for the population the sample represents, the true statistic is zero. Then, the p indicates how often chance alone would produce a sample statistic at least as large as the one actually found, if this hypothetical value for the population were true.
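One way to make the null hypothesis concrete is to simulate it. The Python sketch below draws many samples of 100 from a made-up population in which the two variables are truly uncorrelated, and counts how often the sample correlation comes out at .17 or higher. The normal distributions, the one-sided count, the number of samples, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def share_at_least(r_observed, sample_size, n_samples=2000, seed=1):
    """Share of null-hypothesis samples whose r is >= r_observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Independent draws: the true correlation in this population is 0.
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        if pearson_r(xs, ys) >= r_observed:
            hits += 1
    return hits / n_samples


share = share_at_least(r_observed=0.17, sample_size=100)
print(f"Share of chance samples with r >= .17: {share:.3f}")
```

The printed share should land a little under .05, which is what p < .05 is reporting for this example.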

As another example, in a regression analysis on a sample, the resulting regression coefficients may be given with associated t-tests and p-values. Assume, for example, a regression coefficient is 3.4 with a t-test of 2.0 and p < .03. The assumed null hypothesis is that the regression coefficient for the universe the sample represents really is 0, and p < .03 says that if this were so, a sample coefficient as large as 3.4 would arise by chance less than 3 times in 100. That is, if the true coefficient were 0 and this study were replicated 100 times, fewer than 3 of the replications would be expected to yield a coefficient this large.

I have made the null hypotheses equal to 0, which is generally the case. But, it can equal any number. Regardless, p still answers the question of how often chance alone would produce a result like the one found by the research if the number given in the null hypothesis were true for the population.

When a null hypothesis is rejected with little chance of error, the result is called significant. The acceptable probability of error (the significance level) among scientists is a matter of tradition, which is that if the chance of error is p equal to or less than .05, the result is significant. This is a convention, however, and a researcher may be conservative about error and define significance in his research as p < .01. Or, if the researcher believes there is much random error in his data, he may then raise the significance level to something like p < .1. In other words, when a study says its correlation is significant, it is saying in effect that its correlation is such that there is little chance of error in rejecting the possibility that it is zero (or some other number). If a study says it has conducted a significance test, it is saying that it calculated the p-value; and if it says the result was nonsignificant, it means that the chance of error in rejecting the null hypothesis was too great. But, without knowing the p values, there is no way of knowing what chance of error the researcher found acceptable or unacceptable.


The danger in significance tests is that the p-value depends heavily on the sample size (N). See the change in significance (p-value) for the very low correlation of .15 at different sample sizes:

N = 10, p = .34
N = 50, p = .15
N = 100, p = .07
N = 500, p = .0004
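The figures above can be approximated with a few lines of Python. This sketch uses the Fisher z-transformation and a one-sided normal tail; that is a standard approximation, not necessarily the calculation behind the table, but it agrees with the listed values to the precision shown. The function name is illustrative.

```python
import math


def one_sided_p(r: float, n: int) -> float:
    """Approximate one-sided p-value for a correlation r from a sample of n.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is treated as
    a standard normal deviate when the true correlation is 0.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))


for n in (10, 50, 100, 500):
    print(f"N = {n:3d}, p = {one_sided_p(0.15, n):.4f}")
```

The correlation never changes in this loop; only N does, and that alone carries the p-value from far above .05 to far below it.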

All one needs to do, it seems, is to increase the sample size to get very significant results, although totally meaningless ones. What does this mean? To understand what the correlation coefficient means for the relationship between two variables, for example, square it and multiply by 100. The result will be the percent of variance (variation) in common, or shared between the two variables. So, if one does this for r = .50 (to make this easy), the result is 25%. To say that two variables have 25% of their variation in common is a lot more meaningful than saying that their correlation is .5. Thus, an r = .80 means 64% of the variation is in common; r = .90 means 81% in common, and so on. This is a way of getting at the true empirical meaning of a correlation, and one that is not dependent on sample size and the significance test.
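The square-and-multiply rule is simple enough to put in a short Python sketch. The function name is illustrative; the correlations are the ones used in this post, plus the .15 from the sample-size table.

```python
def shared_variance_percent(r: float) -> float:
    """Percent of variation two variables have in common, given their r."""
    return r ** 2 * 100


for r in (0.15, 0.50, 0.80, 0.90):
    pct = shared_variance_percent(r)
    print(f"r = {r:.2f}: {pct:.2f}% of variation in common")
```

Note that sample size appears nowhere in the calculation, which is the point: the .15 correlation shares only a little over 2% of the variation whether N is 10 or 500.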

And this displays a major problem with significance tests. Consider a correlation of .01. In a sample of 500 people it is nowhere near significant, but in a sample of about 30,000 it reaches roughly p = .04 (one-sided). By convention, this itty-bitty correlation is then significant, and the unwary researcher might so report it. But, it is meaningless. For the variation in common between the two variables is an incredibly low .01%, or virtually a zero relationship. And yet, it is significant! Always consider the variance in common along with the significance test.

Another problem is that the null hypothesis and its significance test assume that the analysis is carried out on a sample selected in some appropriate way to reflect a population. But, the analysis may be of all nations, all American senators, all students at Yale, and so on. There is no sample. One might say, however, as some researchers have tried to do, that this is a sample of all nations, senators, or Yale students that have existed, or will exist. But, then the problem is that the sample is in no way a randomly selected representation of this population, which violates an assumption of the significance test.

So, the usual significance tests are inappropriate when analyzing a whole population. If, for example, r = .60 for the relationship between development and literacy for all 192 nations in 2005, then there can be no null hypothesis, since this correlation is truly .60 for all nations. Yet, as some readers may have noticed, I have p-values scattered throughout my research even though I am analyzing all nations.

There is another way to look at probability than for samples. One can, for example, calculate the probability of tossing a coin and getting five heads in a row; of not getting a seven in ten tosses of the dice; and of none of the 122 democracies among 192 nations having had any of the conflicts in that year among themselves. Such probabilities can also be given as p-values. The p of getting five heads in a row is the probability of getting one head in one toss (= .5) to the fifth power, which is p = .03125, or p < .05. This would then be significant.
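The coin and dice cases mentioned above can be worked out directly, with no sample involved. This Python sketch assumes a fair coin and, for the dice, that a "seven" means the sum of two fair six-sided dice; the variable names are illustrative.

```python
# Five heads in a row with a fair coin: .5 multiplied by itself five times.
p_five_heads = 0.5 ** 5

# Two fair dice: 6 of the 36 equally likely outcomes sum to seven.
p_seven_one_roll = 6 / 36

# No seven on any of ten independent rolls.
p_no_seven_in_ten = (1 - p_seven_one_roll) ** 10

print(f"Five heads in a row:     p = {p_five_heads:.5f}")
print(f"No seven in ten rolls:   p = {p_no_seven_in_ten:.4f}")
```

By the .05 convention the run of five heads would count as significant, while going ten rolls without a seven would not.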

So, in my post yesterday, my analysis of variance of the relationship between the terrorism/human rights scale and freedom yielded an F-statistic of 81.6, p < .0001. This is saying that for all the data on the two variables, the chance that they would line up across all nations so as to yield this F-statistic is less than .0001, almost a 0 probability, and thus very significant. There must be something causing such a near impossible pairing, and I say that it is the democratic nature of a regime.

Thus, in the case of analyzing the whole population, instead of its representative sample, the p now is the chance of getting any statistic, such as a correlation, multiple correlation, regression coefficient, chi-square, and so on, just by chance.
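One way to read this kind of whole-population p is as a permutation test: keep one variable fixed, shuffle the other across the cases many times, and see how often the shuffled pairing gives a correlation as strong as the real one. The Python sketch below is an assumption about what such a computation might look like, not the method behind the F-statistic reported above, and the freedom and terror scores in it are made up for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p(xs, ys, n_perm=2000, seed=0):
    """Chance that a random re-pairing of ys with xs gives |r| >= observed."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical scores for 14 imaginary nations (not real PTS or Freedom House data).
freedom = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7]
rights = [1, 2, 1, 2, 2, 3, 2, 3, 3, 4, 4, 5, 4, 5]

p = permutation_p(freedom, rights)
print(f"r = {pearson_r(freedom, rights):.2f}, permutation p = {p:.4f}")
```

Nothing here treats the nations as a sample from a larger population; the probability comes entirely from the number of ways the existing scores could have been paired.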
