Rapidly Gauging Poverty and Inequality with Big Data

September 20, 2021

Written by Uyanga Gankhuyag, formerly UNDP Bangkok Regional Hub Economist; Delgernaran Tumurtogoo, National Project Coordinator; and Yasin Janjua, UNDP Mongolia Economist

A pilot study conducted in Mongolia sheds light on household consumption, poverty and inequality during the pandemic year of 2020.

While health authorities around the world were scrambling to put in place public health measures to control the spread of COVID-19, and medical workers were fighting to save lives, statistics officials were scratching their heads over how to go about their usual work of gathering data when movement became severely restricted.

UNDP Mongolia teamed up with the National Statistical Office, the National University of Mongolia, the Ministry of Finance and the Tax Data Centre[1] to put Value-Added Tax (VAT) data to use in tracking changes in consumption, poverty and inequality in Mongolia.

VAT records as big data and a valuable resource for decision making

VAT is the largest source of tax revenue in Mongolia. The VAT reform in 2015 introduced incentives for final consumers, who ultimately pay the VAT, in the form of a lottery and a refund of 20 per cent of the VAT paid. It also required all businesses with incomes above a certain threshold to use Point of Sale (POS) machines to record all their sales in a single e-system. The government established the VAT e-system, in which consumers use a mobile app or a website to register their purchases, claim their refunds and enter the VAT lottery, while businesses are required to register their sales.

As more consumers demanded machine-generated receipts for their purchases, more businesses in the country issued them, reducing informal and unrecorded transactions. The number of POS machines in Mongolia more than quadrupled, from 12 thousand to over 54 thousand by 2020.

VAT collection increased. Between 2015 and 2019, VAT revenues increased by 136 per cent in nominal terms, and the share of VAT in total tax revenues increased from 20.5 per cent to 25.5 per cent.

In 2020 alone, 931 million e-receipts were printed. The records in the VAT e-system, generated every minute by consumers and businesses making transactions, are a real trove of big data.[2]

So why not use this data?

The solution to the problem of not being able to collect data on household incomes during the pandemic movement restrictions came in the form of the VAT e-system. By 2020, the majority of adults in Mongolia had an account in the VAT e-system, and a large share of them were registering their purchases (that is, their expenditures). So why not use the data on their expenditures to analyse how poverty and inequality are changing?

The team embarked on a study to do just that. It found that spending[3] in Mongolia was generally higher in 2020 than in the previous year. Poverty and inequality of spending declined slightly: the poverty headcount rate fell by 4.8 percentage points and the Gini coefficient by 0.026 points in 2020 compared with 2019.
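For readers unfamiliar with the two indicators, here is a minimal sketch of how a poverty headcount rate and a Gini coefficient can be computed from per-person spending. The numbers and the poverty line are made up for illustration, the function names are hypothetical, and this is the standard textbook calculation rather than the study's actual code, which also involved adjustments not shown here.

```python
def poverty_headcount(spending, poverty_line):
    """Share of people whose spending falls below the poverty line."""
    poor = sum(1 for s in spending if s < poverty_line)
    return poor / len(spending)


def gini(spending):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(spending)
    n = len(xs)
    total = sum(xs)
    # Rank-based formula: G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Made-up spending per person, for illustration only.
sample = [50, 80, 120, 200, 550]
headcount = poverty_headcount(sample, poverty_line=100)  # 2 of 5 people are poor
inequality = gini(sample)
```

A fall in the headcount rate means fewer people below the line; a fall in the Gini coefficient means spending is spread more evenly.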

Moreover, VAT data enabled tracking of changes in spending, poverty and inequality on a monthly basis. Figure 1 shows that spending in Mongolia dips in March and then takes a deep plunge from November 2020. This is consistent with the public health measures and policies taken: lockdowns early in the year, the government's fiscal stimulus, including sizeable spending on social assistance, kicking in from April, and then the start of local transmission of COVID-19 in November 2020, which prompted a spate of new, stricter lockdown measures. The data shows that lockdowns reduced spending, while social protection had an immediate and sizeable effect on increasing spending, especially for the poorest.

Can VAT data substitute for household surveys in the future?

Coverage of the VAT system in Mongolia is high and growing. However, VAT data can hide systematic underreporting of expenditure, especially among poorer people and rural residents, resulting in distorted measures of poverty and inequality when VAT data is used alone.

Therefore, the team employed statistical techniques to adjust for such underreporting and improve the representativeness of the VAT data set analyzed. Household surveys conducted by statistical offices are a vital basis for adjusting VAT data, or any other big data, to address the problem of underrepresented populations. Thus, VAT data in Mongolia can complement household surveys but cannot substitute for them.
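The article does not say which techniques the team used. As one common illustration of the representativeness part of the problem, the sketch below applies post-stratification weighting: each group's records are weighted so that the sample matches population shares taken from a household survey. The groups, shares and spending values are hypothetical, and the sketch does not correct for underreported amounts within a group.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group so the weighted sample matches population shares."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (sample_counts[group] / n)
        for group in sample_counts
    }


def weighted_mean(records, weights):
    """Mean spending over (group, spending) records using group weights."""
    numerator = sum(weights[group] * spending for group, spending in records)
    denominator = sum(weights[group] for group, _ in records)
    return numerator / denominator


# Hypothetical records in which rural residents are under-represented.
records = [("urban", 300), ("urban", 320), ("urban", 280), ("rural", 100)]
sample_counts = {"urban": 3, "rural": 1}
population_shares = {"urban": 0.5, "rural": 0.5}  # assumed, survey-based shares

weights = poststratification_weights(sample_counts, population_shares)
adjusted = weighted_mean(records, weights)
unadjusted = sum(spending for _, spending in records) / len(records)
```

In this toy case the unadjusted mean overstates typical spending because urban records dominate the sample; the weighted mean gives each group its population share. This is why a household survey remains necessary: it supplies the population shares.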

Another issue is data privacy and confidentiality. The confidentiality of VAT data and the privacy of records are protected by law in Mongolia, and only a small number of government officials and contractors, bound by a confidentiality agreement, can access this data, as it should be. To analyze the data while respecting the confidentiality requirements, the research team implemented a data confidentiality protocol under which only an anonymized sample data set drawn from the VAT records was used for the analysis. In the future, anonymized VAT data sets can be used in a similar way for research and analysis. Policy makers can benefit from enabling the use of anonymized data for analysis like this.
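The details of the team's protocol are not described here. Purely as an illustration of the general idea, the sketch below draws a random sample of records and replaces each identifier with a keyed hash before the data leaves the data owner. The field names, the key and the record layout are all hypothetical, and replacing identifiers in this way is pseudonymization, which on its own may not be enough to make a data set fully anonymous.

```python
import hashlib
import hmac
import random


def pseudonymize(taxpayer_id, secret_key):
    """Replace an identifier with a keyed SHA-256 hash (HMAC)."""
    return hmac.new(secret_key, taxpayer_id.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymized_sample(records, secret_key, sample_size, seed=0):
    """Draw a random sample of records and swap identifiers for pseudonyms."""
    rng = random.Random(seed)
    chosen = rng.sample(records, sample_size)
    return [
        {"person": pseudonymize(r["taxpayer_id"], secret_key), "amount": r["amount"]}
        for r in chosen
    ]


# Hypothetical records; the real VAT data schema is not public.
records = [{"taxpayer_id": f"ID{i:04d}", "amount": 100 + i} for i in range(10)]
sample = anonymized_sample(records, secret_key=b"held-by-data-owner", sample_size=4)
```

Because the same identifier always maps to the same pseudonym under a given key, a person's purchases can still be linked across months without researchers ever seeing the original identifier.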

Rapid tracking of poverty and inequality

Significantly, with the use of the VAT data set in Mongolia, poverty and inequality are no longer slow-moving indicators.

In developing countries, household living standards surveys, used to estimate consumption, income, poverty and inequality, are usually conducted annually at best, or once every 5 or even 10 years at worst. As a result, the indicators of the Sustainable Development Goals (SDGs) on reducing poverty and inequality are not responsive to policy changes. Several years need to pass before the effects of any policy on poverty and inequality can be known reliably.

This study showed that, using VAT data, changes in poverty and inequality can be measured as frequently as monthly. Such data can show how poverty and inequality respond to policy changes, such as public spending on social protection or tax-related measures. This type of analysis, when implemented rigorously, can become an important tool to support public policy-making, during and beyond the pandemic.

Access the full study here (in English and Mongolian):

https://bit.ly/3wqe0GC

Full video of the Virtual Launch of the study (in Mongolian): https://fb.watch/v/19pbA86iJ/

Figure 1. Mean expenditure (per person, at 2018 prices) estimated in the study from VAT data.

[1] The full name of the agency is the Information Technology Center for Custom, Taxation, and Finance.

[2] Big data is commonly defined as data that is big in terms of volume (a large quantity of data), variety (multiple types of data and unstructured data), velocity (the speed at which data is created), and high frequency (the data is generated at high time frequency on an ongoing basis).

[3] Spending is roughly equivalent to consumption, which in turn is roughly equivalent to income.