Big Data

Big data is a field concerned with ways to analyze, systematically extract information from, or otherwise handle data sets that are too large or complex for traditional data-processing application software.

Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data source.


Big data can be described by the following characteristics:

  • Volume: The quantity of generated and stored data. The size of the data determines its value and potential insight, and whether it can be considered big data at all.
  • Variety: The type and nature of the data. Knowing this helps analysts make effective use of the resulting insight. Big data draws from text, images, audio, and video, and it completes missing pieces through data fusion.
  • Velocity: The speed at which the data is generated and processed to meet the demands and challenges that lie in the path of growth and development. Big data is often available in real time and, compared to small data, is produced more continually. Two kinds of velocity apply to big data: the frequency of generation and the frequency of handling, recording, and publishing.
  • Veracity: An extension of the original definition of big data that refers to data quality and data value. The quality of captured data can vary greatly, which affects the accuracy of analysis.
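
Two of the characteristics above can be made concrete with a small calculation. The following is a minimal sketch, not a standard measurement method: it assumes a hypothetical list of timestamped records (the field names "ts", "sensor", and "value" are invented for illustration), estimates velocity as the frequency of generation in records per second, and uses the share of records with no missing field as a crude proxy for veracity.

```python
from datetime import datetime

# Hypothetical sample of captured records; field names are illustrative only.
records = [
    {"ts": "2024-01-01T00:00:00", "sensor": "a", "value": 1.5},
    {"ts": "2024-01-01T00:00:01", "sensor": "b", "value": None},
    {"ts": "2024-01-01T00:00:02", "sensor": "a", "value": 2.0},
    {"ts": "2024-01-01T00:00:04", "sensor": None, "value": 3.1},
    {"ts": "2024-01-01T00:00:04", "sensor": "b", "value": 0.7},
]


def generation_rate(rows):
    """Records per second over the observed time span (frequency of generation)."""
    if not rows:
        return 0.0
    times = [datetime.fromisoformat(r["ts"]) for r in rows]
    span = (max(times) - min(times)).total_seconds()
    # All records share one timestamp: the rate is unbounded at this resolution.
    return len(rows) / span if span > 0 else float("inf")


def completeness(rows):
    """Fraction of records with no missing field (a rough data-quality proxy)."""
    if not rows:
        return 0.0
    complete = sum(1 for r in rows if all(v is not None for v in r.values()))
    return complete / len(rows)


print(generation_rate(records))  # records per second
print(completeness(records))     # share of fully populated records
```

In practice both figures would be computed over a stream or a distributed store rather than an in-memory list; the sketch only shows what is being measured.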

Other important characteristics of big data are:

  • Exhaustive: Whether the entire system (i.e., n = all) is captured or recorded.
  • Fine-grained and uniquely lexical: Respectively, the proportion of specific data collected for each element, and whether each element and its characteristics are properly indexed or identified.
  • Relational: Whether the collected data contains common fields that would enable a conjoining, or meta-analysis, of different data sets.
  • Extensional: Whether new fields in each element of the collected data can be added or changed easily.
  • Scalability: Whether the size of the data can expand rapidly.
  • Value: The utility that can be extracted from the data.
  • Variability: Data whose value or other characteristics shift in relation to the context in which they are generated.
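
The relational and extensional properties in this list can be illustrated with a short sketch. It assumes two hypothetical data sets that share a common field, here called "user_id"; the data, the field names, and the helper join_on are invented for illustration and do not come from any particular library. The first step conjoins the data sets on the shared field, and the second adds a new field to every element after the fact.

```python
# Hypothetical data sets that share the common field "user_id".
purchases = [
    {"user_id": 1, "amount": 20.0},
    {"user_id": 2, "amount": 5.5},
    {"user_id": 1, "amount": 7.25},
]
profiles = [
    {"user_id": 1, "country": "DE"},
    {"user_id": 2, "country": "BR"},
]


def join_on(left, right, key):
    """Inner-join two lists of dicts on a shared key (illustrative helper)."""
    # Index the right-hand rows by key so each left row is matched in O(1).
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Relational: the common field lets the two data sets be conjoined.
joined = join_on(purchases, profiles, "user_id")

# Extensional: a new field is added to each element without redefining the data set.
for row in joined:
    row["amount_cents"] = round(row["amount"] * 100)

for row in joined:
    print(row)
```

A data set without such a common field could not be joined this way, which is what the relational property is meant to capture.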