Data update of World Bank, IADB, and EuropeAid datasets on development aid funded contracts and projects – November 2019

In November 2019 we released an update to the datasets we collect on development projects, public tenders, and contracts for three major donor agencies: the World Bank, the Inter-American Development Bank (IADB), and EuropeAid. The datasets republish structured data gathered from the agencies' official source websites and also contain corruption risk red flags developed by the research team.

About the project

The project entitled “Curbing Corruption in Government Contracting” analyses how procurement can be manipulated for corrupt ends using a prize-winning ‘red flags’ methodology developed by Mihály Fazekas. We collect datasets of procurement tenders and contracts, with a range of variables that indicate corruption risk, and analyse the data to identify suspicious patterns and trends, by procuring entity, supplier, and over time.

Pressure has been growing to ensure accountability and transparency in procurement funded by development aid donors. Yet donors have only blunt tools for monitoring whether recipient governments use aid for the agreed purposes. To address this problem, we developed an innovative methodology for analysing big data from major aid agencies to calculate more accurate and targeted indicators of corruption in aid-funded procurement. We used these indicators to explore how the risks of corruption in aid allocation are affected by (1) different institutional control mechanisms and (2) the socio-political context in recipient countries. We hope our findings will help guide donor agencies in developing more efficient delivery and monitoring mechanisms, and that our data analysis tools can be incorporated into donors’ evaluation frameworks in real time.

Data and documentation

Find the first iteration of these datasets and accompanying source data and documentation here, and the second iteration here.

  • Data mirroring source data: flat csvs for key variables
  • Data (analysis data files with red flags):
    • WB dta (Stata 14, full size: 3.3 GB, compressed size: 125 MB)
    • IADB dta (Stata 14, full size: 524 MB, compressed size: 29 MB)
    • EuropeAid dta (Stata 14, full size: 65 MB)
  • Description of data collection and red flags calculations: PDF
  • Red flags variable list: xlsx
  • Data scraping, parsing, and cleaning codes
  • Combined project and procurement data structure (describing the structure of the structured json database): xlsx
  • Data validation report: PDF
  • Data quality tables: xlsx
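
For readers who want a first look at the flat CSV files before opening the larger Stata files, the following is a minimal sketch of streaming a CSV row by row and counting records per value of one column. It uses only the Python standard library. The column names (`contract_id`, `country`, `award_year`), the sample rows, and the function name `count_rows_by_column` are hypothetical placeholders, not the published schema; the actual variable names are given in the documentation listed above.

```python
import csv
import io
from collections import Counter


def count_rows_by_column(lines, column):
    """Stream CSV rows and count how often each value of `column` occurs.

    `lines` is any iterable of text lines (an open file or io.StringIO).
    Rows where the column is empty or absent are counted as "(missing)".
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        value = (row.get(column) or "").strip()
        counts[value or "(missing)"] += 1
    return counts


# Made-up placeholder rows standing in for one of the flat CSV files.
# Column names and values are hypothetical, not taken from the datasets.
SAMPLE_CSV = (
    "contract_id,country,award_year\n"
    "c1,AA,2015\n"
    "c2,BB,2016\n"
    "c3,AA,2016\n"
    "c4,,2017\n"
)

if __name__ == "__main__":
    counts = count_rows_by_column(io.StringIO(SAMPLE_CSV), "country")
    for value, n in counts.most_common():
        print(value, n)
```

With a downloaded file, the same function could be called on an open file handle, for example `open(path, newline="", encoding="utf-8")`, where the path and the encoding are assumptions to check against the files themselves. Because rows are read one at a time, memory use stays small even for large files.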