Updated Circular Plots for Directional Bilateral Migration Data

I have had a few emails recently regarding plots from my new working paper on global migration flows, which has received some media coverage here, here and here. The plots were created using Zuguang Gu’s excellent circlize package and are a modified version of those discussed in an earlier blog post. In particular, I have made four changes:

  1. I have added arrow heads to better indicate the direction of flows, following the example in Scientific American.
  2. I have reorganized the sectors on the outside of the circle so that in each the outflows are plotted first (largest to smallest) followed by the inflows (again, in size order). I prefer this new layout (previously the inflows were plotted first) as it allows the time sequencing of migration events (a migrant has to leave before they can arrive) to match up with the natural tendency for most to read from left to right.
  3. I have cut out the white spaces that detached the chords from the outer sector. To my eye, this alteration helps indicate the direction of the flow and gives a cleaner look.
  4. I have kept the smallest flows in the plot, but plotted their chords last, so that the focus is maintained on the largest flows. Previously smaller flows were dropped according to an arbitrary cut off, which meant that the sector pieces on the outside of the circle no longer represented the total of the inflows and outflows.

Combined, these four modifications have helped me when presenting the results at recent conferences, reducing the time I need to spend explaining the plots and avoiding some of the confusion that occasionally occurred with the direction of the migration flows.

If you would like to replicate one of these plots, you can do so using estimates of the minimum migrant transition flows for the 2010-15 period and the demo R script in my migest package:

# install.packages("migest")
# install.packages("circlize")
library("migest")
demo(cfplot_reg2, package = "migest", ask = FALSE)

which will give the following output:

Estimated Global Migration Flows 2010-15

The code in the demo script uses the chordDiagram function, based on a recent update to the circlize package (0.3.7). Most likely you will need to either update or install the package (uncomment the install.packages lines in the code above).
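As a rough illustration of how the four changes listed earlier might map onto chordDiagram arguments, here is a minimal sketch on a made-up 3 x 3 flow matrix. The region names, flow values and colours are hypothetical, and the argument values are assumptions rather than a copy of the demo script, so check the demo file for the settings actually used. It assumes a version of circlize recent enough to support these arguments.

```r
# Hypothetical flows (rows = origin, columns = destination); not real estimates
regions <- c("North", "South", "East")
flows <- matrix(c(0, 5, 2,
                  3, 0, 1,
                  4, 6, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(orig = regions, dest = regions))

# Only draw the plot if circlize is available
if (requireNamespace("circlize", quietly = TRUE)) {
  library("circlize")
  circos.clear()
  chordDiagram(
    x = flows,
    grid.col = c(North = "steelblue", South = "tomato", East = "seagreen"),
    transparency = 0.25,
    directional = 1,                             # flows run from rows to columns
    direction.type = c("arrows", "diffHeight"),  # show direction with arrows and root heights
    link.arr.type = "big.arrow",                 # draw each chord itself as an arrow
    diffHeight = -0.04,                          # negative value lengthens the origin end of each chord
    link.sort = TRUE,                            # sort chords by width within each sector
    link.largest.ontop = TRUE                    # draw chords in size order, largest on top
  )
  circos.clear()
}
```

Swapping the toy matrix for a real origin-destination table (or a three-column data frame of origin, destination and flow) should be enough to try the same settings on other data.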

If you want to view the R script in detail to see which arguments I used, then take a look at the demo file on GitHub here. I provide some comments (in the script, below the function) to explain each of the argument values.

Save and view a PDF version of the plot (which looks much better than what comes up in my non-square RStudio plot pane) using:

dev.copy2pdf(file = "cfplot_reg2.pdf", height = 10, width = 10)
file.show("cfplot_reg2.pdf")

Circular Migration Flow Plots in R

Please see this blog post on an updated version of circular plots for migration flows, based on global estimates for 2010-15.

An article of mine was published in Science today. It introduces estimates of bilateral global migration flows between all countries. The underlying methodology is based on the conditional maximisation routine in my Demographic Research paper. However, I tweaked the demographic accounting so that the net migration in the estimated migration flow tables matches the net migration figures from the United Nations very closely.

My co-author, Nikola Sander, developed some circular plots for the paper based on Circos in Perl. A couple of months back, after the paper was already in the submission process, I figured out how to replicate these plots in R using the circlize package. Zuguang Gu, the circlize package developer, was very helpful, responding quickly (and with examples) to my emails.

To demonstrate, I have put two demo files in my migest R package. For the estimates of flows by regions, users can hopefully replicate the plots (so long as the circlize and plyr packages are installed) using:

library("migest")
demo(cfplot_reg, package = "migest", ask = FALSE)

It should result in the following plot:
cfplot_reg
The basic idea of the plot is to show simultaneously the relative size of estimated flows between regions. The origins and destinations of migrants are represented by the circle’s segments, where nearby regions are positioned close to each other. The size of the estimated flow is indicated by the width of the link at its bases and can be read using the tick marks (in millions) on the outside of the circle’s segments. The direction of the flow is encoded both by the origin colour and by the gap between link and circle segment at the destination.

You can save the PDF version of the plot (which looks much better than what comes up in my R graphics device) using:

dev.copy2pdf(file = "cfplot_reg.pdf", height=10, width=10)

If you want to view the R script:

file.show(system.file("demo/cfplot_reg.R", package = "migest"))

In Section 5 of our Vienna Institute of Demography Working Paper I provide a more detailed breakdown of the R code in the demo files.

A similar demo with slight alterations to the labelling is also available for a plot of the largest country to country flows:

demo(cfplot_nat, package = "migest", ask = FALSE)

cfplot_nat

If you are interested in the estimates, you can explore them fully in the interactive website (made using d3.js) at http://global-migration.info/. There is also a link on the website to download all the data. Ramon Bauer has a nice blog post explaining the d3 version.

Publication Details:

Abel, G.J. and Sander, N. (2014). Quantifying Global International Migration Flows. Science. 343 (6178), 1520–1522.

Widely available data on the number of people living outside of their country of birth do not adequately capture contemporary intensities and patterns of global migration flows. We present data on bilateral flows between 196 countries from 1990 through 2010 that provide a comprehensive view of international migration flows. Our data suggest a stable intensity of global 5-year migration flows at ~0.6% of world population since 1995. In addition, the results aid the interpretation of trends and patterns of migration flows to and from individual countries by placing them in a regional or global context. We estimate the largest movements to occur between South and West Asia, from Latin to North America, and within Africa.

Forecasting Environmental Immigration to the UK

A couple of months ago, a paper I worked on with co-authors from the Centre of Population Change was published in Population and Environment. It summarised work we did as part of the UK Government Office for Science Foresight project on Migration and Global Environmental Change. Our aim was to build expert based forecasts of environmental immigrants to the UK. We conducted a Delphi survey of nearly 30 migration experts from academia, the civil service and non-governmental organisations to obtain estimates on the future levels of immigration to the UK in 2030 and 2060 with uncertainty. We also asked them what proportion of current and future immigration are/will be environmental migrants. The results were incorporated into a set of model averaged Bayesian time series models through prior distributions on the mean and variance terms.

The plots in the journal article got somewhat butchered during the publication process. Below is the non-butchered version for future immigration to the UK alongside the past immigration data from the Office for National Statistics.
imm2
At first, I was a bit taken aback by this plot. A few experts thought there were going to be some very high levels of future immigration, which causes the rather striking large upper tail. However, at a second glance, the central percentiles show a gentle decrease, where there is only (approximately) a 30% chance of an increase in future migration from the 2010 level throughout the forecast period.

The expert based forecast for total immigration was combined with the responses to questions on the proportion of environmental migrants, to obtain an estimate on both the current level of environmental migration (which is not currently measured) and future levels:
env4

As is the way with these things, we came across some problems in our project. The first was with the definition of an environmental migrant, which is not completely nailed down in the migration literature. As a result, part of the uncertainty in the expert based forecasts reflects not only the future level but also the measure itself. The second was with the elicitation of uncertainty. We used a Likert type scale, which caused some difficulties even during the later round of the Delphi survey. If I were to do it over, I reckon this problem could be much better addressed by getting experts to visualise their forecast fans in an interactive website, perhaps by creating a shiny app with the fanplot package. Such an approach would result in smoother fans than those in the plots above, which were based on interpolations from expert answers at only two points of time in the future (2030 and 2060).

Publication Details:

Abel, G.J., Bijak, J., Findlay, A.M., McCollum, D. and Wiśniowski, A. (2013). Forecasting environmental migration to the United Kingdom: An exploration using Bayesian models. Population and Environment. 35 (2), 183–203.

Over the next 50 years, the potential impact of environmental change on human livelihoods could be considerable, with one possible consequence being increased levels of human mobility. This paper explores how uncertainty about the level of immigration to the United Kingdom as a consequence of environmental factors elsewhere may be forecast using a methodology involving Bayesian models. The conceptual understanding of forecasting is advanced in three ways. First, the analysis is believed to be the first time that the Bayesian modelling approach has been attempted in relation to environmental mobility. Second, the paper considers the expediency of this approach by comparing the responses to a Delphi survey with conventional expectations about environmental mobility in the research literature. Finally, the values and assumptions of the expert evidence provided in the Delphi survey are interrogated to illustrate the limited set of conditions under which forecasts of environmental mobility, as set out in this paper, are likely to hold.

Global Bilateral International Migration Flows

A few months ago, Demographic Research published my paper on estimating global migration flow tables. In the paper I developed a method to estimate international migrant flows, for which there is limited comparable data, that match changes in migrant stock data, which are more widely available. The result was bilateral tables of estimated international migrant transitions between 191 countries for four decades, which I believe are a first of a kind. The estimates are available in an Excel spreadsheet as an additional file on the journal website. The abstract and citation details are at the bottom of this post.

My migest R package contains the ffs function for the flows-from-stock method used in the paper. To demonstrate, consider two hypothetical migrant stock tables I use in the paper, where rows represent place of birth and columns represent place of residence. The first stock table represents the distributions of migrant stocks at the start of the period. The second represents the distributions at the end of the period.

> # create P1 and P2 stock tables
> dn <- LETTERS[1:4]
> P1 <- matrix(c(1000, 100, 10, 0,
+                55, 555, 50, 5, 
+                80, 40, 800, 40, 
+                20, 25, 20, 200), 
+              nrow=4, ncol=4, byrow = TRUE,
+              dimnames = list(pob = dn, por = dn))
> P2 <- matrix(c(950, 100, 60, 0, 
+                80, 505, 75, 5, 
+                90, 30, 800, 40,
+                40, 45, 0, 180),
+              nrow=4, ncol=4, byrow = TRUE,
+              dimnames = list(pob = dn, por = dn))
> # display with row and col totals
> addmargins(P1)
     por
pob      A   B   C   D  Sum
  A   1000 100  10   0 1110
  B     55 555  50   5  665
  C     80  40 800  40  960
  D     20  25  20 200  265
  Sum 1155 720 880 245 3000
> addmargins(P2)
     por
pob      A   B   C   D  Sum
  A    950 100  60   0 1110
  B     80 505  75   5  665
  C     90  30 800  40  960
  D     40  45   0 180  265
  Sum 1160 680 935 225 3000

When estimating flows from stock data, a good demographer should worry about births and deaths over the period, as these can have substantial impacts on changes in populations over time. In the simplest case, using the hypothetical stock tables above, I set births and deaths to zero (implied by the equal row totals, the sum of populations by their place of birth, in each stock table). In any case, I need to create some vectors to pass this information to the ffs function.

> # no births and deaths
> b <- rep(0, 4)
> d <- rep(0, 4)
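The claim that equal row totals imply no births or deaths can be checked directly in base R. The sketch below re-enters the two stock tables from above, compares the totals by place of birth, and shows the cell-by-cell change in stocks that the estimated flows will have to account for. The object names same_totals and stock_change are only illustrative.

```r
# Stock tables from the example above (rows = place of birth, columns = place of residence)
dn <- LETTERS[1:4]
P1 <- matrix(c(1000, 100,  10,   0,
                 55, 555,  50,   5,
                 80,  40, 800,  40,
                 20,  25,  20, 200),
             nrow = 4, byrow = TRUE, dimnames = list(pob = dn, por = dn))
P2 <- matrix(c( 950, 100,  60,   0,
                 80, 505,  75,   5,
                 90,  30, 800,  40,
                 40,  45,   0, 180),
             nrow = 4, byrow = TRUE, dimnames = list(pob = dn, por = dn))

# Totals by place of birth are unchanged, so no births or deaths are
# needed to balance the two tables
same_totals <- all(rowSums(P1) == rowSums(P2))
print(same_totals)

# Change in each stock cell over the period; every row sums to zero
stock_change <- P2 - P1
print(addmargins(stock_change))
```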

We can then pass the stock tables, births and deaths to the ffs function to estimate flows by birth place, contained in the mu element of the returned list.

> # run flow from stock estimation
> library("migest")
> y <- ffs(P1=P1, P2=P2, d=d, b=b)
1 46 
2 0 
> # display with row, col and table totals
> addmargins(y$mu)
, , pob = A

     dest
orig    A   B  C D  Sum
  A   950   0 50 0 1000
  B     0 100  0 0  100
  C     0   0 10 0   10
  D     0   0  0 0    0
  Sum 950 100 60 0 1110

, , pob = B

     dest
orig   A   B  C D Sum
  A   55   0  0 0  55
  B   25 505 25 0 555
  C    0   0 50 0  50
  D    0   0  0 5   5
  Sum 80 505 75 5 665

, , pob = C

     dest
orig   A  B   C  D Sum
  A   80  0   0  0  80
  B   10 30   0  0  40
  C    0  0 800  0 800
  D    0  0   0 40  40
  Sum 90 30 800 40 960

, , pob = D

     dest
orig   A  B C   D Sum
  A   20  0 0   0  20
  B    0 25 0   0  25
  C   10 10 0   0  20
  D   10 10 0 180 200
  Sum 40 45 0 180 265

, , pob = Sum

     dest
orig     A   B   C   D  Sum
  A   1105   0  50   0 1155
  B     35 660  25   0  720
  C     10  10 860   0  880
  D     10  10   0 225  245
  Sum 1160 680 935 225 3000

The fm function returns the flow matrix aggregated over the place of birth dimension in the mu array.

> # display aggregate flows
> f <- fm(y$mu)
> addmargins(f)
     dest
orig   A  B  C D Sum
  A    0  0 50 0  50
  B   35  0 25 0  60
  C   10 10  0 0  20
  D   10 10  0 0  20
  Sum 55 20 75 0 150
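To see what this aggregation represents without needing migest, the sketch below rebuilds the mu array from the output printed above and works through it in base R. It first checks that, for each birthplace, the origin totals recover the matching row of P1 and the destination totals recover the matching row of P2, then sums over birthplaces and blanks the diagonal (the stayers). This is an illustration of the result, not the internals of fm; the helper slice and the names P1_implied and P2_implied are hypothetical.

```r
dn <- LETTERS[1:4]

# Helper to enter each birthplace table row by row (origin in rows, destination in columns)
slice <- function(v) matrix(v, nrow = 4, byrow = TRUE)

# Estimated flows by place of birth, copied from the printed output of y$mu
mu <- array(c(slice(c(950,   0,  50,   0,    0, 100,   0,   0,    0,   0,  10,   0,    0,   0,   0,   0)),
              slice(c( 55,   0,   0,   0,   25, 505,  25,   0,    0,   0,  50,   0,    0,   0,   0,   5)),
              slice(c( 80,   0,   0,   0,   10,  30,   0,   0,    0,   0, 800,   0,    0,   0,   0,  40)),
              slice(c( 20,   0,   0,   0,    0,  25,   0,   0,   10,  10,   0,   0,   10,  10,   0, 180))),
            dim = c(4, 4, 4), dimnames = list(orig = dn, dest = dn, pob = dn))

# Summing over destinations gives the start-of-period stocks (P1),
# summing over origins gives the end-of-period stocks (P2)
P1_implied <- apply(mu, c(3, 1), sum)
P2_implied <- apply(mu, c(3, 2), sum)
print(P1_implied)
print(P2_implied)

# Aggregate over place of birth, then drop the stayers on the diagonal
f <- apply(mu, c(1, 2), sum)
diag(f) <- 0
print(addmargins(f))
```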

...and there you have it, an estimated flow matrix that matches the changes in the stock tables whilst controlling for births and deaths. In the paper I run the code on real migrant stock data provided by the World Bank, to estimate global migrant flow tables.

The ffs function has some different methods to control for deaths in the estimation procedure. The estimation is based on a three way iterative proportional fitting scheme to estimate parameters in a log-linear model, not too dissimilar to that used in a paper based on my Southampton M.Sc. dissertation.
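For readers unfamiliar with iterative proportional fitting, here is a minimal two-way sketch in base R. It is a deliberately simpler relative of the three way scheme mentioned above, not the algorithm inside ffs: it rescales a positive seed matrix, alternating between rows and columns, until both sets of margins match their targets. The function name ipf2, the seed and the target margins are all hypothetical.

```r
# Two-way iterative proportional fitting (illustrative only)
ipf2 <- function(seed, row_target, col_target, tol = 1e-8, max_iter = 1000) {
  # The two sets of margins must describe the same total
  stopifnot(isTRUE(all.equal(sum(row_target), sum(col_target))))
  fit <- seed
  for (i in seq_len(max_iter)) {
    # Scale each row so the row sums match the row targets
    fit <- sweep(fit, 1, row_target / rowSums(fit), "*")
    # Scale each column so the column sums match the column targets
    fit <- sweep(fit, 2, col_target / colSums(fit), "*")
    # Columns are now exact; stop once the rows are close enough too
    if (max(abs(rowSums(fit) - row_target)) < tol) break
  }
  fit
}

# Made-up seed with strictly positive cells, and made-up margins
seed <- matrix(c(1, 2, 1,
                 3, 1, 2,
                 1, 1, 4),
               nrow = 3, byrow = TRUE)
row_target <- c(10, 20, 30)
col_target <- c(15, 25, 20)

fit <- ipf2(seed, row_target, col_target)
print(round(addmargins(fit), 3))
```

The fitted table keeps the interaction structure of the seed while matching the margins, which is the same idea the three way version applies to origin, destination and place of birth totals.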

Publication Details:

Abel, G. J. (2013). Estimating global migration flow tables using place of birth data. Demographic Research, 28, 505–546. doi:10.4054/DemRes.2013.28.18

International migration flow data often lack adequate measurements of volume, direction and completeness. These pitfalls limit empirical comparative studies of migration and cross national population projections to use net migration measures or inadequate data. This paper aims to address these issues at a global level, presenting estimates of bilateral flow tables between 191 countries. A methodology to estimate flow tables of migration transitions for the globe is illustrated in two parts. First, a methodology to derive flows from sequential stock tables is developed. Second, the methodology is applied to recently released World Bank migration stock tables between 1960 and 2000 (Özden et al. 2011) to estimate a set of four decadal global migration flow tables. The results of the applied methodology are discussed with reference to comparable estimates of global net migration flows of the United Nations and models for international migration flows. The proposed methodology adds to the limited existing literature on linking migration flows to stocks. The estimated flow tables represent a first-of-a-kind set of comparable global origin destination flow data.

Estimation of international migration flow tables in Europe

A paper based on my Ph.D. has been published in the Journal of the Royal Statistical Society: Series A (Statistics in Society). It is essentially a boiled down version of my Ph.D. thesis without some of the earlier chapters. The idea was to come up with some comparable estimates of bilateral migration flows, which currently do not exist. I used some modern optimisation methods to harmonise existing migration flow data, and then the EM algorithm to derive some model based imputations where there is no existing flow data. Below are the results I got for the EU15, 2002-2006 (use the tabs at the bottom to view different years).


If you want to download the data, go to the Google spreadsheet here.

Publication Details:

Abel, G. J. (2010). Estimation of international migration flow tables in Europe. Journal of the Royal Statistical Society: Series A (Statistics in Society). 173 (4), 797–825.

A methodology is developed to estimate comparable international migration flows between a set of countries. International migration flow data may be missing, reported by the sending country, reported by the receiving country or reported by both the sending and the receiving countries. For the last situation, reported counts rarely match owing to differences in definitions and data collection systems. We report counts harmonized by using correction factors estimated from a constrained optimization procedure. Factors are applied to scale data that are known to be of a reliable standard, creating an incomplete migration flow table of harmonized values. Cells for which no reliable reported flows exist are then estimated from a negative binomial regression model fitted by using an expectation–maximization (EM) type of algorithm. Covariate information for this model is drawn from international migration theory. Finally, measures of precision for all missing cell estimates are derived by using the supplemented EM algorithm. Recent data on international migration between countries in Europe are used to illustrate the methodology. The results represent a complete table of comparable flows which can be used by regional policy makers and social scientists to understand population behaviour and change better.

International Migration Flow Table Estimation

International migration flow data is a messy topic. No single pair of countries defines migration in the same way. Even if they did, they would most likely measure it differently. This causes some big headaches for anyone who wants to draw any inference about migration levels, directions, policy implications or the causes and consequences of people’s movements at a cross national level. During my Ph.D. I worked on methods for estimating comparable international migration flows across multiple European countries.

I identified two fundamental data problems: inconsistency (countries with conflicting reports on the number of people moving between them) and incompleteness (countries not providing any data). I applied both mathematical and statistical methods to create a comparable set of international migration flow estimates. For more details see my Ph.D. dissertation (which is online, see the link below). It contains most of the R/S-Plus code to conduct the estimation in the Appendix. Note, there is also a published paper based on my Ph.D. (abstract and links here). I created a TeX template for the University of Southampton School of Social Sciences here.

Publication Details:

Abel, G. J. (2009). International Migration Flow Table Estimation. University of Southampton, Division of Social Statistics, Doctoral Thesis.