Very large or complex data sets that cannot be managed or analyzed by conventional methods are called big data. The term "3V", formed from the initials of the words volume, velocity and variety, summarizes the characteristics of big data. A big data set may have one or several of these characteristics at the same time. The data must be reliable if it is to be used to produce statistics, to apply machine learning algorithms, or to make predictions about the future.
With the development of technology, electronic devices such as mobile phones, computers and sensors have come into widespread use. As the Internet has become faster and more common, technology has become an inseparable part of life for most of the world's population. Searches made through search engines, photos and thoughts shared on social media, and records created through e-commerce sites have significantly increased the volume of data held electronically.
This growing data is mostly owned by private companies. Companies that act on the "knowledge is power" principle have started to become stronger. For example, one search engine has successfully predicted, on the basis of the searches made, which countries would experience a flu epidemic and when.
The use of big data has also started to increase in Turkey, particularly in applications such as recommendation systems in e-commerce and fraud detection in banking.
The Importance of Big Data in Official Statistics Production
Big data sets yield important strategic information when they are interpreted with appropriate analysis methods. Their use has become even more important in an environment made increasingly competitive by globalization.
By exploiting the advantages of big data, statistics can be produced faster, more effectively and in greater number. In economic terms, using big data also offers the possibility of reducing the cost of statistical production in the long term. For these reasons, the use of big data in official statistics has begun to attract the world's leading statistical offices.
Research and development (R&D) work on big data at TURKSTAT has accelerated since November 2013. To make big data analysis possible, infrastructure preparations and learning studies on various data sets are being carried out.
The use of big data as auxiliary data in official statistics will also improve the quality of the existing traditional statistics. In addition, using big data as auxiliary data can reduce the number of questions in a survey or the number of surveys, and potential new information can be discovered from big data sources.
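One common way to combine survey results with auxiliary data is a ratio estimator. The short Python sketch below is only an illustration of that general idea, not a method described by TURKSTAT. The function name, the variable names and all figures are hypothetical, and the sketch assumes that the auxiliary variable is known for every sampled unit and that its total is known for the whole population.

```python
def ratio_estimate_total(sample_y, sample_x, population_x_total):
    """Estimate the population total of y with a ratio estimator.

    sample_y: survey values observed for the sampled units
    sample_x: auxiliary values for the same units (assumed here to come
              from a big data source)
    population_x_total: known total of the auxiliary variable for the
                        whole population
    """
    if len(sample_y) != len(sample_x):
        raise ValueError("sample_y and sample_x must have the same length")
    sum_x = sum(sample_x)
    if sum_x == 0:
        raise ValueError("auxiliary values must not sum to zero")
    # Ratio of the survey variable to the auxiliary variable in the sample
    ratio = sum(sample_y) / sum_x
    # Scale the ratio up by the known auxiliary total
    return ratio * population_x_total


# Hypothetical figures, for illustration only
survey_spending = [120.0, 80.0, 100.0]    # reported in a survey
card_transactions = [60.0, 40.0, 50.0]    # auxiliary values, same units
total_card_transactions = 1000.0          # known for the whole population

estimate = ratio_estimate_total(
    survey_spending, card_transactions, total_card_transactions
)
print(estimate)  # 2000.0
```

When the auxiliary variable is strongly correlated with the survey variable, an estimate of this kind tends to be more precise than one based on the survey alone, which is one sense in which auxiliary data can improve quality.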
Internationally, statistical institutions are continuing to investigate whether big data can be used in data production.