The file PSS2012_misinvoicing.csv contains misinvoicing estimates for every country and year for which we were able to make the calculation. It contains 1,359 records, one per country-year. Each record shows a country name, a year, and a series of misinvoicing measures.

Variables with 'W' in the name show misinvoicing with respect to the whole world, while variables with 'IC' show misinvoicing with respect to industrial countries only. The IC measures are likely to be more trustworthy, because industrial countries produce high-quality statistical data. On the other hand, a would-be misinvoicer might do more egregious things when dealing with a port in a country with weak governance, so W may be the measure to focus on.

The main data is 8 columns:

    mi.X.IC          misinvoicing on exports to industrial countries (billion USD)
    mi.X.IC.pc       the same, expressed as a percent of exports
    mi.M.IC          misinvoicing on imports from industrial countries (billion USD)
    mi.M.IC.pc       the same, expressed as a percent of imports
    mi.IC.net        the net capital flow (billion USD)
    mi.IC.net.pc     the net capital flow as a percent of M+X
    mi.IC.gross      the gross capital flow (billion USD)
    mi.IC.gross.pc   the gross capital flow as a percent of M+X

After these come 8 more columns that repeat the same measures for the whole world instead of industrial countries (mi.X.W, mi.X.W.pc, and so on).

There is one special case: China. We report a country China, a country Hong Kong, and a fictional country "CHK", which is the hypothetical merger of China and Hong Kong: we remove China-HK trade and consider the trade of the merged entity with the rest of the world.
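As a starting point for working with the file, here is a minimal R sketch for loading it and checking that the 16 measure columns are present. It assumes the file is comma-separated with a header row, which is not stated above. The helper name load_misinvoicing is made up for illustration, and the names of the country and year columns are not documented here, so the sketch does not touch them.

```r
# Hypothetical helper (not part of the dataset's own code).
# Assumption: the file is comma-separated and has a header row.
load_misinvoicing <- function(path = "PSS2012_misinvoicing.csv") {
  D <- read.csv(path, stringsAsFactors = FALSE)

  # The 8 measure names, with a placeholder for the partner group.
  stems <- c("mi.X.%s", "mi.X.%s.pc", "mi.M.%s", "mi.M.%s.pc",
             "mi.%s.net", "mi.%s.net.pc", "mi.%s.gross", "mi.%s.gross.pc")
  expected <- c(sprintf(stems, "IC"), sprintf(stems, "W"))

  # Warn rather than stop, since the exact header is an assumption.
  missing <- setdiff(expected, names(D))
  if (length(missing) > 0) {
    warning("columns not found: ", paste(missing, collapse = ", "))
  }
  D
}
```

If the file is laid out as assumed, nrow(load_misinvoicing()) should give the 1,359 country-year records mentioned above.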
Here are the key formulas used (in R code):

    D$mi.X.IC        <- -((D$co.X.to.IC*D$cif.fob.ratio) - D$IC.M.from.co)
    D$mi.X.IC.pc     <- 100*D$mi.X.IC / D$co.X.to.IC
    D$mi.M.IC        <- -((D$cif.fob.ratio*D$IC.X.to.co) - D$co.M.from.IC)
    D$mi.M.IC.pc     <- 100*D$mi.M.IC / D$co.M.from.IC
    D$mi.IC.net      <- D$mi.X.IC + D$mi.M.IC
    D$mi.IC.net.pc   <- 100*(D$mi.IC.net/(D$co.X.to.IC + D$co.M.from.IC))
    D$mi.IC.gross    <- abs(D$mi.X.IC) + abs(D$mi.M.IC)
    D$mi.IC.gross.pc <- 100*(D$mi.IC.gross/(D$co.X.to.IC + D$co.M.from.IC))

    D$mi.X.W         <- -((D$co.X.to.W*D$cif.fob.ratio) - D$W.M.from.co)
    D$mi.X.W.pc      <- 100*D$mi.X.W / D$co.X.to.W
    D$mi.M.W         <- -((D$cif.fob.ratio*D$W.X.to.co) - D$co.M.from.W)
    D$mi.M.W.pc      <- 100*D$mi.M.W / D$co.M.from.W
    D$mi.W.net       <- D$mi.X.W + D$mi.M.W
    D$mi.W.net.pc    <- 100*(D$mi.W.net/(D$co.X.to.W + D$co.M.from.W))
    D$mi.W.gross     <- abs(D$mi.X.W) + abs(D$mi.M.W)
    D$mi.W.gross.pc  <- 100*(D$mi.W.gross/(D$co.X.to.W + D$co.M.from.W))
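To make the arithmetic concrete, here is a small self-contained sketch that applies the eight formulas above to one made-up country-year. The helper add_misinvoicing is hypothetical (it only wraps the formulas so the same code serves both IC and W), the trade figures are invented, and the cif.fob.ratio of 1.1 is an assumption chosen for illustration, not a value taken from the dataset.

```r
# Hypothetical helper: applies the eight formulas for one partner group.
# `p` is "IC" or "W"; D must hold co.X.to.<p>, <p>.M.from.co,
# <p>.X.to.co, co.M.from.<p> and cif.fob.ratio.
add_misinvoicing <- function(D, p) {
  x.co <- D[[paste0("co.X.to.", p)]]     # country's reported exports to p
  m.p  <- D[[paste0(p, ".M.from.co")]]   # p's reported imports from country
  x.p  <- D[[paste0(p, ".X.to.co")]]     # p's reported exports to country
  m.co <- D[[paste0("co.M.from.", p)]]   # country's reported imports from p
  r    <- D$cif.fob.ratio

  mi.X  <- -((x.co * r) - m.p)
  mi.M  <- -((r * x.p) - m.co)
  net   <- mi.X + mi.M
  gross <- abs(mi.X) + abs(mi.M)

  D[[paste0("mi.X.", p)]]            <- mi.X
  D[[paste0("mi.X.", p, ".pc")]]     <- 100 * mi.X / x.co
  D[[paste0("mi.M.", p)]]            <- mi.M
  D[[paste0("mi.M.", p, ".pc")]]     <- 100 * mi.M / m.co
  D[[paste0("mi.", p, ".net")]]      <- net
  D[[paste0("mi.", p, ".net.pc")]]   <- 100 * net / (x.co + m.co)
  D[[paste0("mi.", p, ".gross")]]    <- gross
  D[[paste0("mi.", p, ".gross.pc")]] <- 100 * gross / (x.co + m.co)
  D
}

# Invented single country-year (billion USD); the ratio 1.1 is an assumption.
D <- data.frame(co.X.to.IC = 100, IC.M.from.co = 120,
                IC.X.to.co = 80,  co.M.from.IC = 85,
                cif.fob.ratio = 1.1)
D <- add_misinvoicing(D, "IC")

# Show only the computed mi.* columns.
print(round(D[, grep("^mi\\.", names(D))], 2))
```

With these invented inputs, mi.X.IC works out to about 10 (120 minus 100*1.1) and mi.M.IC to about -3 (85 minus 1.1*80), so the net is about 7 and the gross about 13, both measured against M+X = 185.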