Today, the economy is driven by data as the digitalisation of daily life advances steadily. However, what enterprises and governments are looking for is not data but knowledge: answers to their daily challenges. Thanks to the wide availability of all kinds of data in digital format, different industries have emerged around new technologies for processing that data. The first of these, the Big Data market, emerged in the early 2000s, but it has not brought disruptive success to its players because of deficiencies in technology readiness and business models. The main deficiency of the Big Data market is the cost of processing the data and the poor ratio between effective data transformation and cost.
More recently, other markets based on digital data have been emerging, such as Open Data, the software industry for data treatment and analysis, Blockchain and Artificial Intelligence. What all data industries have in common is their aim to improve the decision-making process through objectivity that is based on real data, while minimizing uncertainty.
Another common point is the organization of the data in databases according to some classification criteria. The differences between them come from the use that is made of the classified data. The Big Data market offers enormous volumes of data for prediction. The Open Data market aims to create new business opportunities for individuals. Thick Data is oriented towards identifying trends in consumer behaviour. Blockchain gives value to data ownership and to the process of sharing it. Artificial Intelligence aims to discover new knowledge beyond the established data analysis methods.
In short, every data-related industry needs to organise its data in some way and transform it for use in applications.
Data Transformational Process
In this paper, we define the process of transforming data into knowledge as follows:
- Through classification the data is converted to information.
- Through crossing different sets of classified data (information), the information is converted to knowledge.
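The two conversions above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sales and complaints records, the choice of "region" as the classification criterion, and the helper names `classify` and `cross` are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def classify(records, criterion):
    """Data -> information: group raw records under the label `criterion` returns."""
    groups = defaultdict(list)
    for record in records:
        groups[criterion(record)].append(record)
    return dict(groups)


def cross(info_a, info_b):
    """Information -> knowledge candidates: pair the classes both sets share."""
    shared_labels = info_a.keys() & info_b.keys()
    return {label: (info_a[label], info_b[label]) for label in shared_labels}


# Hypothetical raw data, invented for this example.
sales = [
    {"region": "north", "units": 120},
    {"region": "south", "units": 80},
    {"region": "north", "units": 60},
]
complaints = [
    {"region": "north", "count": 3},
    {"region": "south", "count": 9},
]

# Classification: the same criterion (region) is applied to both raw sets.
sales_by_region = classify(sales, lambda r: r["region"])
complaints_by_region = classify(complaints, lambda r: r["region"])

# Crossing: combine the two classified sets and derive a new figure
# that neither set contains on its own.
crossed = cross(sales_by_region, complaints_by_region)
units_per_complaint = {
    region: sum(s["units"] for s in sold) / sum(c["count"] for c in complained)
    for region, (sold, complained) in crossed.items()
}
print(units_per_complaint)
```

In this sketch the criterion is passed in as a function, which keeps the choice of criterion separate from the mechanics of grouping and pairing.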
Key Factors For Successful Data Transformation
The success of the process of transforming data into knowledge depends greatly on the choice of methodology for converting the raw data into information. This choice provides the framework for defining the criteria used to classify and organize the raw data. It also lays the foundation for the next transformational step, from the obtained information to knowledge.
The first key factor for the transformation of information into knowledge lies in the criteria for crossing sets of information. We come to understand something when we act on it. We define the actions taken on the data as crossing them: we force an action on the data in order to extract its meaning, for example by comparing the studied data set with other ones.
The second key factor is the definition of the criteria that structure the action taken on the data sets. For example, if we want to compare two data sets, we need to make this comparison in relation to a reference. A reference can be a characteristic the data sets have in common, or an external factor and its impact on them.
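As a rough illustration of comparing two data sets in relation to a reference, the sketch below expresses each set relative to a characteristic both share before comparing them. The towns, the energy figures, the population values and the function name `compare_by_reference` are all hypothetical, chosen only to show the idea.

```python
def compare_by_reference(set_a, set_b, reference_a, reference_b):
    """Compare two data sets after expressing each relative to its reference value."""
    rate_a = sum(set_a) / reference_a
    rate_b = sum(set_b) / reference_b
    return {"rate_a": rate_a, "rate_b": rate_b, "ratio": rate_a / rate_b}


# Hypothetical monthly energy use (MWh) of two towns, invented for this example.
town_a_energy = [400, 420, 380]
town_b_energy = [900, 950, 850]

# The reference is a characteristic both data sets share: number of inhabitants.
town_a_population = 10_000
town_b_population = 30_000

# Without a reference, town B simply looks like the bigger consumer.
raw_difference = sum(town_b_energy) - sum(town_a_energy)

# With the reference, the comparison is made per inhabitant.
result = compare_by_reference(
    town_a_energy, town_b_energy, town_a_population, town_b_population
)
print(raw_difference, result)
```

With these invented figures the raw totals and the per-inhabitant rates point in opposite directions, which is why the choice of reference matters for the conclusion drawn.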
Digitalisation Of The Data Transformation
Further in our paper, we discuss the digitalization of the data transformational process. We define it with the following steps:
1. Generation/Collection of raw data.
2. Criteria definition for the classification of the raw data.
3. Classification and organization of the raw data in data sets.
4. Criteria definition for data treatment and data analysis.
5. Basic data analysis: classification of the data sets for further applications.
6. Historical data analysis: identification of interrelationships and processes.
7. Extraction of knowledge: validation of hypotheses, visualisation of evidence about ongoing phenomena.
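The steps above can be sketched as a small Python pipeline. This is only an illustrative outline under stated assumptions: the function `run_pipeline`, the yearly readings and the hypothesis "the value grows every year" are hypothetical, and the two criteria and the hypothesis are passed in as parameters because the code does not attempt to choose them itself.

```python
from collections import defaultdict
from statistics import mean


def run_pipeline(raw_data, classification_criterion, analysis_criterion, hypothesis):
    """Walk through the seven steps; the criteria and hypothesis come from the caller."""
    # Step 1 (generation/collection) is done by the caller, who passes raw_data.
    # Step 2 (classification criteria) is the classification_criterion argument.

    # Step 3: classify and organize the raw data into data sets.
    data_sets = defaultdict(list)
    for record in raw_data:
        data_sets[classification_criterion(record)].append(record)

    # Step 4 (criteria for treatment and analysis) is the analysis_criterion argument.

    # Step 5: basic analysis, here one summary value per data set.
    summaries = {
        label: mean(analysis_criterion(r) for r in records)
        for label, records in data_sets.items()
    }

    # Step 6: historical analysis, here the change between consecutive periods.
    periods = sorted(summaries)
    changes = {
        later: summaries[later] - summaries[earlier]
        for earlier, later in zip(periods, periods[1:])
    }

    # Step 7: extraction of knowledge, here checking a hypothesis against the changes.
    return summaries, changes, hypothesis(changes)


# Hypothetical readings, invented for this example.
readings = [
    {"year": 2019, "value": 10},
    {"year": 2019, "value": 14},
    {"year": 2020, "value": 15},
    {"year": 2020, "value": 17},
    {"year": 2021, "value": 20},
]

summaries, changes, supported = run_pipeline(
    readings,
    classification_criterion=lambda r: r["year"],
    analysis_criterion=lambda r: r["value"],
    hypothesis=lambda ch: all(delta > 0 for delta in ch.values()),
)
print(summaries, changes, supported)
```

Steps 3, 5 and 6 are mechanical once the criteria are fixed, whereas the arguments supplied by the caller stand for the decisions that people have to make.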
Our conclusion is that existing technology cannot provide solutions for the key points of the data transformation (steps 4 and 7), which are closely tied to human creativity and common sense. We suggest that the team's skills in defining criteria for classifying and crossing data are of high importance to ensure high-quality knowledge extraction.
To learn more about the Data Scientist's skills, I recommend the post by Sean McKenna, where he gives a very detailed description of all the classification needed for a successful data transformation process and explains why it will be hard to automate.