SOC2069 Quantitative Methods

Codebook

Wilkinson & Pickett (2009)

| Variable | Description | Type | Missing, N (%) | Levels, N (range) |
|---|---|---|---|---|
| Country | Country | character | 0 (0.0%) | 23 (Australia⋯USA) |
| Income inequality | Income inequality | numeric | 0 (0.0%) | 23 (3.4 : 9.7) |
| Trust | Trust | numeric | 0 (0.0%) | 23 (10 : 66.5) |
| Life expectancy | Life expectancy | numeric | 0 (0.0%) | 18 (76.2 : 81.6) |
| Infant mortality | Infant mortality | numeric | 0 (0.0%) | 16 (2.9 : 6.9) |
| Homicides | Homicides | numeric | 0 (0.0%) | 22 (5.2 : 64) |
| Imprisonment (log) | Imprisonment (log) | numeric | 0 (0.0%) | 21 (3.3 : 6.4) |

Pickett et al. (2024)

| Variable | Description | Type | Missing, N (%) | Levels, N (range) |
|---|---|---|---|---|
| Country | Country | character | 0 (0.0%) | 43 (Australia⋯United States) |
| Trust | % people who agree that “most people can be trusted” | numeric | 0 (0.0%) | 43 (6.7 : 77.4) |
| Income_inequality_Gini | Gini coefficient (disposable income) | numeric | 0 (0.0%) | 43 (0.2 : 0.5) |
| Income_inequality_S80S20 | Quintile share ratio (disposable income) | numeric | 0 (0.0%) | 41 (3.4 : 28.3) |

Data sources and definitions

Income inequality

Data on income inequality in various countries come from the OECD Income Distribution Database (IDD). The IDD offers data on levels and trends in income inequality and poverty, and is updated on a rolling basis. The latest update at the time of writing was in June 2025. You can download a summary table with key indicators such as Gini coefficients, income shares, quintile share ratios and poverty rates for selected years from here. An interactive charting tool of income inequality indicators is also available here. Detailed definitions and technical descriptions of how the indicators are operationalised are provided here. You can read a summary of the database and core inequality indicators in the drop-down box below.

OECD data sources and definitions

Equivalised household disposable income

The OECD Income Distribution database (IDD) benchmarks and monitors countries’ performance in the field of income inequality and poverty. It contains a number of standardised indicators based on the central concept of “equivalised household disposable income”, i.e. the total income received by the households less the current taxes and transfers they pay, adjusted for household size with an equivalence scale. While household income is only one of the factors shaping people’s economic well-being, it is also the one for which comparable data for all OECD countries are most common. Income distribution has a long-standing tradition among household-level statistics, with regular data collections going back to the 1980s (and sometimes earlier) in many OECD countries.

Achieving comparability in this field is a challenge, as national practices differ widely in terms of concepts, measures, and statistical sources. In order to maximise international comparability as well as inter-temporal consistency of data, the IDD data collection and compilation process is based on a common set of statistical conventions (e.g. on income concepts and components). The information obtained by the OECD through a network of national data providers, via a standardised questionnaire, is based on national sources that are deemed to be most representative for each country. The original data sources for each country and year are listed here.
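To make the idea of "equivalised household disposable income" concrete, here is a minimal sketch. It assumes the square-root equivalence scale (household income divided by the square root of household size); the text above does not name the scale, so treat this choice, the function name and the example figures as illustrative assumptions rather than the OECD's exact procedure.

```python
import math


def equivalised_income(household_disposable_income, household_size):
    """Adjust household income for household size.

    Assumption: square-root equivalence scale, i.e. income / sqrt(size).
    """
    if household_size < 1:
        raise ValueError("household_size must be at least 1")
    return household_disposable_income / math.sqrt(household_size)


# Hypothetical households with the same disposable income but different sizes.
single = equivalised_income(60_000, 1)  # one-person household
family = equivalised_income(60_000, 4)  # four-person household

print(single, family)
```

Under this assumed scale, the four-person household is treated as needing twice (not four times) the income of a single person to reach the same standard of living, which reflects economies of scale in shared housing and bills.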

Quintile share ratio (disposable income)

The quintile share ratio - or S80/S20 ratio - is the share of all income received by the top quintile (i.e. the richest 20%) divided by the share received by the bottom quintile (i.e. the poorest 20%). It is a relatively simple measure to calculate once we have accurate data on the income distribution in a country. For example, in Table 1, we find that in Australia, in the latest available year, earners in the bottom 20% of the income distribution held 7.2% of the total income distributed in that country, while those in the top 20% of the distribution received 40% of that total. If we divide the top share by the bottom share (\(\frac{40}{7.2}\)) we obtain approximately 5.6, which is the value given for the S80/S20 income share ratio indicator for Australia in the latest available year (2020).
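The Australian calculation can be reproduced in a few lines. This is a sketch only: the function name is hypothetical, and the two shares are the rounded values reported in Table 1, so recomputing the ratio from them for other countries may differ slightly from the published S80/S20 figure.

```python
def quintile_share_ratio(top20_share, bottom20_share):
    """S80/S20: income share of the top 20% divided by that of the bottom 20%."""
    if bottom20_share <= 0:
        raise ValueError("bottom20_share must be positive")
    return top20_share / bottom20_share


# Australia, using the shares reported in Table 1 (in % of total income).
australia = quintile_share_ratio(top20_share=40.0, bottom20_share=7.2)
print(round(australia, 1))
```

The same function works with any pair of shares, as long as both are expressed in the same unit (percentages or proportions).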

The quintile share ratio, while simple to calculate and understand, also has some statistical limitations. The minimum value that it can take is 1, which describes a perfectly equal society where the richest 20% of the population receives the same share of total income as the poorest 20% (each quintile then receives 20% of total income, and \(\frac{20}{20} = 1\)). The maximum value, however, is theoretically infinite: because the measure is a ratio, \(\frac{90}{1} = 90\), but \(\frac{99.9}{0.03} = 3330\) and \(\frac{99.999}{0.00005} = 1.99998 \times 10^{6}\). The higher the value, the more unequal a society is, and as the income share of the poorest 20% tends to 0 the ratio grows without bound.

However, the above scenarios are empirically unrealistic. In fact, if we look at the actual data in Table 1 below, the lowest quintile share ratio value is 3.5 (shared by Czechia, Iceland, the Slovak Republic and Slovenia) and the highest is 32.4 (South Africa). On this more restricted empirical scale (running from around 3 to around 33), the values are far less extreme and the gaps between them are much smaller.

Yet, since the value is a ratio, the change between values is not “linear”. This means that the impact of a one-unit change in the ratio is not consistent across the entire scale: the relative (percentage) increase in inequality associated with a one-unit change in the ratio decreases as the ratio gets larger. For mathematical and statistical purposes, particularly when researchers need to compare changes in inequality across different populations or over time, a non-linear scale like the S80/S20 can be problematic. To handle this non-linearity, researchers often use a logarithmic transformation of the ratio in their models, which effectively linearises the scale and makes the interpretation of coefficients consistent across the entire range of values.
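A small numerical sketch may help here. It uses the natural logarithm and hypothetical helper names (both are illustrative choices, not something the course materials prescribe) to show that a one-unit step is a large relative change at low ratios and a small one at high ratios, while equal proportional changes give equal gaps on the log scale.

```python
import math


def percent_increase(ratio, step=1.0):
    """Percentage increase in the ratio produced by adding `step` to it."""
    return 100 * step / ratio


def log_gap(low, high):
    """Difference between two ratio values on the natural-log scale."""
    return math.log(high) - math.log(low)


# A one-unit step at different points of the empirical S80/S20 range.
for ratio in (3.0, 10.0, 32.0):
    print(
        ratio,
        round(percent_increase(ratio), 1),      # relative size of a +1 step
        round(log_gap(ratio, ratio + 1), 3),    # the same step on the log scale
    )

# Doubling the ratio is the same distance on the log scale wherever it happens.
print(math.isclose(log_gap(4, 8), log_gap(16, 32)))
```

On the log scale a coefficient can therefore be read in terms of proportional change in the ratio, which is the consistency of interpretation that the transformation is meant to provide.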

Gini (disposable income)

The Gini coefficient is a measure that compares cumulative proportions of the population against the cumulative proportions of income they receive, condensing the entire income distribution for a country into a single number between 0 and 1: the higher the number, the greater the degree of income inequality. Mathematically, there are a few different equations that economists commonly use for calculating it, with those based on the so-called Lorenz curve being the most popular. You can watch a very short video explanation of the Gini coefficient here and there is a nice and simple online Gini Coefficient Calculator available here. For example, look at the two sets of numbers below. We can imagine each set as the incomes of the ten individuals who make up the entire population of a different country (and assume some standardised hypothetical currency unit that equalises purchasing power differences between the two countries) [1]:

  • Country A: 8000 10000 9000 10000 10000 8000 7000 8000 390000 540000
  • Country B: 80000 100000 90000 120000 140000 70000 70000 90000 120000 120000

Eyeballing the numbers, which “country” do you think is the more equal one, and which is the more unequal one? Which one will have the higher Gini coefficient? You can copy/paste each set of numbers into the online calculator to get the precise coefficient. But the real reason why we may care about these artificial metrics is that they allow us to ask some further questions, such as: which individual would you most like to be from among the twenty income holders listed? Which country would you rather live in? If you were to be randomly assigned at birth to a country in a world consisting of several countries such as these two, would you be more comfortable if that world consisted predominantly of countries of type “A” or “B”? Philosophers have been asking these questions - sometimes very explicitly - for a long time, economists have been working on designing more detailed and accurate measurements and modelling techniques, and sociologists are always interested in understanding how these questions shape the actual lives of people.
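If you prefer to check the answer in code rather than with the online calculator, the sketch below computes the Gini coefficient for the two hypothetical countries. It uses one common formulation, the mean absolute difference between all pairs of incomes divided by twice the mean income; this is an illustrative choice, and some calculators apply a small-sample correction, so their results may differ slightly from what this function returns.

```python
def gini(incomes):
    """Gini coefficient via the mean absolute difference between all pairs.

    G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean)
    """
    n = len(incomes)
    mean = sum(incomes) / n
    total_abs_diff = sum(abs(x - y) for x in incomes for y in incomes)
    return total_abs_diff / (2 * n * n * mean)


# The two hypothetical ten-person "countries" from the text above.
country_a = [8000, 10000, 9000, 10000, 10000, 8000, 7000, 8000, 390000, 540000]
country_b = [80000, 100000, 90000, 120000, 140000, 70000, 70000, 90000, 120000, 120000]

print("Mean income A:", sum(country_a) / len(country_a))
print("Mean income B:", sum(country_b) / len(country_b))
print("Gini A:", round(gini(country_a), 3))
print("Gini B:", round(gini(country_b), 3))
```

Printing the mean income alongside the Gini is deliberate: comparing the two makes clear that the coefficient describes how income is distributed, not how much income there is in total.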

[1] Many internationally comparative economic indicators rely on such standardised units as Purchasing Power Parity (PPP) rates, international dollars, Purchasing Power Standards (PPS) or the ‘Big Mac Index’.

Table 1: Comparative summary of key inequality indicators
(OECD Income Distribution database (IDD), 2022 or most recent year)
Source: OECD-IDD
| Country | Gini coefficient | S80/S20 income share ratio | Income share: Bottom 20% (% of total income) | Income share: Top 20% (% of total income) |
|---|---|---|---|---|
| Australia | 0.32 | 5.6 | 7.2 | 40.0 |
| Austria | 0.29 | 4.4 | 8.4 | 37.0 |
| Belgium | 0.25 | 3.6 | 9.7 | 34.5 |
| Canada | 0.31 | 5.0 | 7.6 | 38.2 |
| Chile | 0.45 | 10.1 | 5.0 | 50.9 |
| Costa Rica | 0.47 | 12.3 | 4.2 | 52.0 |
| Czechia | 0.25 | 3.5 | 9.8 | 34.7 |
| Denmark | 0.28 | 4.0 | 9.3 | 36.8 |
| Estonia | 0.32 | 5.6 | 7.0 | 38.8 |
| Finland | 0.27 | 3.9 | 9.2 | 36.1 |
| France | 0.30 | 4.5 | 8.6 | 38.5 |
| Germany | 0.31 | 5.1 | 7.7 | 39.2 |
| Greece | 0.32 | 5.2 | 7.6 | 39.3 |
| Hungary | 0.29 | 4.7 | 8.0 | 37.5 |
| Iceland | 0.25 | 3.5 | 10.0 | 35.0 |
| Ireland | 0.28 | 4.2 | 9.0 | 37.5 |
| Israel | 0.34 | 6.3 | 6.4 | 40.3 |
| Italy | 0.32 | 5.4 | 7.3 | 39.3 |
| Japan | 0.34 | 6.4 | 6.3 | 40.2 |
| Korea | 0.32 | 5.8 | 6.8 | 39.2 |
| Latvia | 0.34 | 6.2 | 6.5 | 40.6 |
| Lithuania | 0.36 | 6.5 | 6.5 | 42.5 |
| Luxembourg | 0.30 | 4.5 | 8.4 | 37.8 |
| Mexico | 0.40 | 7.8 | 6.0 | 46.5 |
| Netherlands | 0.29 | 4.3 | 8.7 | 37.8 |
| New Zealand | 0.33 | 5.5 | 7.2 | 39.7 |
| Norway | 0.26 | 4.0 | 8.9 | 35.3 |
| Poland | 0.27 | 4.1 | 8.8 | 35.9 |
| Portugal | 0.33 | 5.5 | 7.5 | 40.9 |
| Slovak Republic | 0.23 | 3.5 | 9.2 | 32.0 |
| Slovenia | 0.24 | 3.5 | 9.6 | 33.9 |
| Spain | 0.32 | 5.5 | 7.1 | 38.7 |
| Sweden | 0.29 | 4.3 | 8.8 | 37.6 |
| Switzerland | 0.32 | 5.0 | 7.9 | 39.7 |
| Türkiye | 0.43 | 8.0 | 6.1 | 49.3 |
| United Kingdom | 0.37 | 6.7 | 6.5 | 43.5 |
| United States | 0.39 | 8.5 | 5.3 | 45.0 |
| Brazil | 0.45 | 11.2 | 4.5 | 50.6 |
| Bulgaria | 0.37 | 6.7 | 6.6 | 44.0 |
| China | 0.51 | 28.3 | 1.9 | 53.5 |
| Croatia | 0.30 | 5.0 | 7.4 | 37.2 |
| India | 0.49 | 13.4 | 4.1 | 54.6 |
| Romania | 0.31 | 5.8 | 6.5 | 37.4 |
| Russian Federation | 0.32 | 5.1 | 7.7 | 39.5 |
| South Africa | 0.62 | 32.4 | 2.0 | 65.8 |