Senior statistician explains China's GDP accounting
Xu Xianchun talks about the improvement and limitations of Chinese government statistics and how to read them.
There is no shortage of suspicions and questions directed toward China’s official statistics and the government organ behind them, the National Bureau of Statistics (NBS). Recently, Xu Xianchun, currently a Research Fellow at the National School of Development at Peking University and formerly a senior statistician, published the book 透视中国政府统计数据:理解与应用 (A Probe into China’s Official Statistics: Comprehension and Utilization). Between his government experience and Peking University, Xu was a Professor at the Department of Economics, School of Economics and Management, Tsinghua University.
The following is an interview Xu gave to the Economic Observer newspaper, in which he explains why government statistics are prone to skepticism. His main ideas are summarized in the section headings for easier reading.
Diverging views on growth figures are a natural phenomenon
Q: There have been skeptical voices about the growth figures of China in the first half of 2023. What is your opinion on that?
The disparity between people's intuitive feelings and government statistical data is not uncommon. Take disposable income per capita as an example. The official figures, according to the NBS, were based on a sample of 160,000 households across the country, spanning high-income, upper-middle-income, middle-income, lower-middle-income and low-income groups. Different income groups will feel differently about their disposable income.
High-income groups may find the reported national average too low, as their income likely exceeds that of multiple middle-income households; for the large number of households that earn below the reported average, the official figures can seem too high, because the average likely falls above the median.
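The gap between the average and the median can be sketched with a few lines of Python. The ten household incomes below are invented for illustration and are not NBS data; the point is only that a few very high incomes pull the mean well above what the typical household earns.

```python
from statistics import mean, median

# Hypothetical annual disposable incomes (thousand yuan) for ten households.
# These figures are invented for illustration; they are not NBS data.
incomes = [12, 15, 18, 20, 24, 28, 35, 45, 80, 223]

avg = mean(incomes)      # pulled upward by the two highest incomes
mid = median(incomes)    # the income of the "middle" household

# Count how many households earn less than the reported average.
below_avg = sum(1 for x in incomes if x < avg)

print(f"mean = {avg}, median = {mid}")
print(f"{below_avg} of {len(incomes)} households earn less than the mean")
```

In this made-up sample the mean is roughly twice the median, and eight of the ten households earn less than the mean, so most of them would find a reported average higher than their own experience.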
Another example is the CPI increase, which is felt more keenly among lower-middle-income groups. People are usually more sensitive to price changes of daily consumer goods like food, vegetables, fruits, and pork, than to price changes of durable consumer goods like color TVs, refrigerators, washing machines, computers and cars. When the CPI is being calculated, however, daily consumer goods and durable consumer goods are put into the same basket. If the CPI increase slows down, it is probable that the prices of durables have decreased while the prices of nondurables have remained steady or even risen. Lower-middle-income families, who buy very few durables, may feel that the headline figure does not match their perception of rising nondurable prices.
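A simplified sketch of this basket effect follows. It treats the headline price change as a weighted average of category price changes, which is a simplification of how a CPI is actually compiled; the categories, weights, and price changes are all hypothetical and are not NBS figures.

```python
# Hypothetical basket: category -> (weight, year-on-year price change in percent).
# Neither the weights nor the price changes are actual NBS figures.
basket = {
    "food": (0.30, 4.0),
    "other nondurables": (0.40, 2.0),
    "durables": (0.30, -3.0),
}

def weighted_change(items):
    """Weighted average price change, with weights renormalized to sum to 1."""
    total_weight = sum(w for w, _ in items.values())
    return sum(w * change for w, change in items.values()) / total_weight

# Headline change over the full basket.
headline = weighted_change(basket)

# A household that buys almost no durables effectively faces
# the same basket with the durables category removed.
no_durables = {k: v for k, v in basket.items() if k != "durables"}
felt = weighted_change(no_durables)

print(f"headline change: {headline:.2f}%")
print(f"change felt without durables: {felt:.2f}%")
```

With these assumed numbers, falling durable prices hold the headline figure down, while a household that buys no durables faces a noticeably larger price increase.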
Disparity in perception is common
Q: Are there similar phenomena in foreign countries?
Different groups having different feelings about statistical data is an issue that can be found in many countries. In 2003, the then Director of the National Bureau of Statistics of China (NBS) and I were on a visit to the Italian National Institute of Statistics (ISTAT). The Director of ISTAT asked us bluntly, "Do you know why people are protesting on the streets?" We had no idea. Then he explained that the demonstrators felt the reported economic growth mismatched their state of joblessness. They believed the statistics were fake.
The other side of the story, said the director, is that the Italian economy had long resumed its growth, but the data lagged the true recovery. This goes to show that even in a developed economy like Italy, a divide persists between public sentiment and official statistics.
How did contradictory data come about?
Q: Many people are skeptical of the official figures because they find distinct inconsistencies among data from different sources. “The data is at war with each other,” so they say.
There are various circumstances where data inconsistency may occur. The aggregate of regional GDP, for instance, has long been significantly higher than the national GDP calculated by the NBS. This has led many to ask, "How so? Isn't the national GDP figure aggregated from regional GDP?"
The truth is, the national GDP is not an aggregation of regional GDP, but rather the sum of value added across all industries. Before the 4th National Economic Census in 2018, China followed a hierarchical accounting system. That is, the national and regional GDP were calculated by the NBS and regional statistics bureaus respectively. Moreover, it was not until 2012 that data uploading and data syncing were mandated for all companies (企业一套表联网直报统计调查制度). Some less developed regions, therefore, might have been incentivized to interfere with official figures.
Things improved after the 4th National Economic Census:
The NBS, while calculating the national GDP, now oversees the calculation of provincial-level GDP; provincial-level bureaus of statistics oversee the calculation of city-level GDP; city-level bureaus of statistics oversee the calculation of county-level GDP. There should be no significant difference between the national GDP and the aggregated regional GDP nowadays.
China has established a uniform platform of online data reporting for enterprises. Large industrial enterprises, certified construction enterprises, wholesale and retail enterprises above designated size, accommodation and catering enterprises above designated size, and real estate enterprises now directly report statistics to an online platform; erroneous data must be traced back to and amended only by corresponding enterprises, thus preventing intermediate manipulation.
Unlike in the past, small enterprises are also sampled for investigation.
For data collected between two economic censuses, the GDP is generally benchmarked on the data of the latter census and revised accordingly. This ensures consistency and comparability between regular annual GDP and census GDP.
Every quarter, the NBS invites the leaders and professionals of various departments for a GDP data review and evaluation meeting. Any inconsistencies between GDP and specialized statistics, or among specialized statistics, must be investigated and problematic data corrected. Professional statisticians are also dispatched on a regular basis to audit local data quality and root out any deviation from reality. In 2017, the NBS set up a law enforcement and supervision bureau to enforce laws against statistical manipulation, punishing violations to ensure data integrity.
Key statistical indicators are misunderstood
Q: Is it possible that key statistical indicators are misinterpreted by the skeptics?
Statistics requires professional and technical expertise. For key indicators like disposable income, public understanding often differs from statistical definitions. For many people, the calculation of disposable income may seem like a simple process. But disposable income in government statistics is based on two different sources: 1) flow of funds accounts; 2) household surveys. The two statistical indicators differ in basic uses, scope, data sources, and calculation methods. Naturally, the data will differ. For example, from 2018 to 2020, the disposable income based on the flow of funds accounts was about 1.3 times that based on household surveys. In some years, it was as high as 1.4 times.
Another example is the GDP. Many people talk about GDP very often, but few really understand its basic theory and accounting methods. In the first quarter of 2020, according to the NBS, China's economy declined by 6.9% year-on-year, while retail sales of consumer goods declined by 19%, fixed asset investment decreased by 16.1%, and the trade surplus dropped by 80.6%. How was it that the GDP fell by just 6.9%, some asked, when consumption, investment, and trade all saw double-digit drops? They concluded that the drop in GDP must have been underestimated.
In reality, what these critics failed to grasp are the nuances. They simply equate changes in retail sales with consumption, investment figures with investment demand, and the trade surplus with net exports. Let's take consumer demand as an example. It includes household consumption expenditure and government consumption expenditure. Household consumption expenditure includes monetary expenditure and non-monetary expenditure, such as the grains, vegetables, and fruits that farmers produce and consume themselves. Non-monetary expenditure, also known as non-cash expenses, is part of household consumption, but because it does not enter the market and is not transacted with money, it is omitted from the retail sales figures. It is the same case with imputed rent, i.e., the estimated rent owner-occupiers would pay for the houses they own. These houses do not count as commodities, so imputed rent is excluded from the total retail sales of consumer goods. Non-monetary expenditure was mostly unaffected by the pandemic and saw an increase in Q1 2020, while at the same time monetary expenditure fell.
Government consumption expenditure includes public service expenditure and expenditure of individual consumption. In the first quarter of 2020, public service expenditure declined slightly. The resilience of non-monetary expenditure and public service expenditure in the first quarter of 2020 moderated, to some extent, the broader decline in household and government demand as a result of COVID-19. Therefore, consumer demand did not decline as sharply as the total retail sales of consumer goods did.
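The arithmetic behind this argument can be sketched as a share-weighted sum: the growth of total consumer demand equals each component's base-period share multiplied by its growth rate, summed over components. The shares and growth rates below are hypothetical and chosen only to illustrate the mechanism; they are not the actual Q1 2020 figures.

```python
# Hypothetical components of final consumption:
# name -> (base-period share of the total, year-on-year growth in percent).
# These numbers are illustrative assumptions, not NBS data.
components = {
    "household monetary expenditure": (0.55, -15.0),
    "household non-monetary expenditure": (0.15, 2.0),
    "government consumption expenditure": (0.30, -1.0),
}

def aggregate_growth(parts):
    """Growth of the total as the share-weighted sum of component growth rates."""
    return sum(share * growth for share, growth in parts.values())

# Contribution of each component to total growth, in percentage points.
contributions = {
    name: share * growth for name, (share, growth) in components.items()
}
total = aggregate_growth(components)

for name, points in contributions.items():
    print(f"{name}: {points:+.2f} points")
print(f"total consumer demand: {total:+.2f}%")
```

Under these assumptions, monetary expenditure falls by 15% but total consumer demand falls by only about 8%, because the components that retail sales do not capture held up much better.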
Harnessing the power of statistics
Q: Can you give us some suggestions on using government statistics?
Government statistics are crucial economic and social resources. They are invaluable for assessing the economy, guiding policy, and conducting research. To use them correctly, four key principles must be followed:
First, choose the right government statistics according to the issue being studied. Keep the following three principles in mind: representativeness, quality, and consistency.
Second, understand the classification criteria, survey scope, survey methods and collection methods of government statistics. They evolve with the economy, so track changes to ensure proper use.
Third, understand the scope and calculation methods of government statistical indicators, some of which have seen significant changes over time. For instance, the compensation of employees, one of the indicators in the income approach to calculating GDP, has undergone two major adjustments. Failure to take into account these changes will directly undermine research rigor.
Fourth, grasp the scope of application and interrelationships of indicators. This enables sound analysis when ideal data is unavailable.
Keeping track of data revisions
Q: Any suggestions on using historical statistics?
The most important consideration when using historical data is that the statistical system may have been different. With statistical methods evolving with the development of the economy, the definition, scope, data source, calculation method, application scope, etc. of statistical indicators can all change. This is especially true for GDP accounting.
To ensure continuity and comparability, the NBS systematically revises affected historical data. One must be careful to always use the revised data. The NBS has established a national database (国家数据库) which contains all of the revised data. But data retrieved from early statistical yearbooks is likely unrevised, so extra attention is needed when using it. When adjusting figures yourself, thoroughly research any scope, source, or methodology changes over time. I suggest referring to professional statistical yearbooks, as they provide more specific details and are therefore more suitable for data revision.
Future of GDP accounting
Q: Can you give us an example where the meaning and scope of statistical indicators have been altered due to the development of the economy and society?
Gross fixed capital formation (GFCF) in the expenditure approach to calculating GDP is a typical example of scope changes over time. Before 2008, the System of National Accounts (SNA) categorized R&D spending as intermediate input, which was excluded from the GDP. From 2008, though, R&D was reclassified as GFCF, which was included in the GDP.
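The effect of this reclassification on measured value added can be illustrated with a deliberately simplified, single-firm sketch. It assumes the R&D is purchased from outside the firm and uses invented figures; the `value_added` helper and all variable names are hypothetical.

```python
def value_added(gross_output, intermediate_inputs):
    """Value added = gross output minus intermediate inputs."""
    return gross_output - intermediate_inputs

# Hypothetical figures for one firm, in arbitrary currency units.
gross_output = 100.0
other_inputs = 50.0   # materials, energy, services, etc.
rd_spending = 10.0    # R&D spending (assumed purchased)

# Earlier treatment: R&D counts as an intermediate input and is netted out.
va_old = value_added(gross_output, other_inputs + rd_spending)

# Treatment since the 2008 SNA: R&D counts as capital formation,
# so it is no longer deducted as an intermediate input.
va_new = value_added(gross_output, other_inputs)

# On the expenditure side, GFCF rises by the same amount.
gfcf_change = rd_spending

print(f"value added, R&D as intermediate input: {va_old}")
print(f"value added, R&D as capital formation: {va_new}")
print(f"increase in GFCF: {gfcf_change}")
```

In this sketch the firm's measured value added rises by exactly the amount of its R&D spending, which is matched by the increase in GFCF on the expenditure side.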
The current SNA still excludes data capital expenditure (CapEx) from the GFCF, but data assets are clearly growing in importance, having already become the most valuable holdings of platforms like Didi. Such platforms' inputs for data collection, storage, development, and maintenance have surpassed those for equipment, office buildings, etc. While some enterprises undergoing digital transformation are seeing major cuts in traditional fixed asset investment, their data CapEx is growing rapidly. This trend in digital transformation was especially pronounced in a survey that my students and I conducted of more than 80 enterprises in 11 provinces, mostly in affluent provinces like Guangdong, Shanghai, Beijing, Jiangsu, and Zhejiang. Although the question of how data CapEx should be treated in GDP accounting is still debated, the rising economic role of data in the digital era is undisputed. Statistical theory must keep pace to capture the impact of data assets on economic development.
The exponential growth of data and rapid development of the digital economy pose a series of statistical challenges to the government: How do we measure the value of data assets and their contribution to economic development? How can we comprehensively and objectively measure the value added of the digital economy?
The value created by free or low-cost internet services, such as search, navigation, messaging, online booking, etc., is underestimated or even omitted in the current approach to GDP calculation. Similarly, jobs generated by China's gig economy, including ride-hailing drivers and food delivery riders, do not come with labor contracts with the platform companies. So the value added by the more flexible forms of employment also falls outside GDP measurement.