Data Economy Actors
Data Use Cases
Data users may be end-users, who exploit their own or third-party data for decision making, or other users, who exploit data to provide value-added data, services or products. If a data user exploits its own data, it is also a data holder. Data users often rely on technologies and services from solution providers that support data processing, aggregation, analytics and visualisation, but they may also carry out these activities without third-party support.
Concerning content, three different types of data were identified. Actor data includes person data (e.g., address, medical details, transactions, communications) and organisation data (e.g., address, business facts, position, transactions, communications). Device/service data contains, among others, usage, configuration and status data. Environment data refers, for instance, to nature (e.g., trees in a forest), public infrastructure (e.g., streets in a city) or economic data.
With respect to the format of data, a very general differentiation is made between structured and unstructured data. However, a more thorough understanding of the format is essential to allow the exploitation of data. The interpretation of data is only possible for an entity if it is known how the data was encoded for storage or transmission. XML, CSV, JSON and RDF are data formats often used to exchange data.
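As a minimal sketch of why the encoding must be known, the following Python snippet serialises one record as JSON and as CSV and decodes each with the matching standard-library parser. The record and its field names (sensor_id, status, temperature) are hypothetical and serve only as illustration.

```python
import csv
import io
import json

# Hypothetical device/service record; the field names are illustrative only.
# All values are strings because CSV carries no type information.
record = {"sensor_id": "s-17", "status": "ok", "temperature": "21.5"}


def to_json(rec):
    """Encode the record as a JSON object."""
    return json.dumps(rec)


def to_csv(rec):
    """Encode the record as a header line plus one data line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rec))
    writer.writeheader()
    writer.writerow(rec)
    return buf.getvalue()


def from_json(text):
    return json.loads(text)


def from_csv(text):
    # Read the first (and only) data row back into a dict.
    return dict(next(csv.DictReader(io.StringIO(text))))


def decodes_as_json(text):
    """Return True if the text can be parsed as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


json_text = to_json(record)
csv_text = to_csv(record)

# Each encoding is recoverable with the matching decoder ...
assert from_json(json_text) == record
assert from_csv(csv_text) == record
# ... but the CSV text cannot be interpreted by a JSON parser.
assert not decodes_as_json(csv_text)
```

The same content survives both round trips, yet a receiver that assumes the wrong format cannot interpret it, which is the point made above.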
Purpose refers to the main goal for which data is intended to be used. Audience data is an example of data described along the purpose dimension. In terms of content, audience data is actor or device/service data. Its purpose is to provide insight into the audience of some activity.
The position in the value chain of data indicates what has happened to the data so far and which steps are likely to follow in order to exploit it. Raw data, for instance, is a term that describes data along this dimension.
Technologies are key assets of data economy actors, as they are the resources from which future economic benefit is expected. Technologies contribute to economic value generation by exploiting the potential of the data available. Tools are considered compositions of technologies that serve a specific purpose.
The purpose identifies the main goal of a technology. For example, data storage solutions usually help to preserve and manage data and its lifecycle. Other purposes include providing security (e.g., AES, TLS), enabling data exchange (e.g., JSON, XML) or delivering notifications (push, pull solutions). Data economy actors usually combine multiple technologies.
Openness describes the possibility to access and extend technologies. Technologies may be fully proprietary, commercial and closed, or follow an open-source approach or be positioned somewhere in between those extremes.
Usage reflects the distribution of technologies, which is often represented by the number of users. Technologies may be standardised but still barely used in practice, while others may be de-facto standards. Examples are XML and JSON, which are used by most data economy actors these days and are therefore widely used in the market.
Although simplicity is hard to measure and partially subjective, it has been found that technologies that are easy to use are typically used more often. For example, a major success factor of REST services has been their simplicity in contrast with other approaches.
Trends and new approaches play a key role in the data economy. Modern and trendy technologies are often taken up quickly, in contrast to older and sometimes more mature technologies. In data storage, for instance, NoSQL solutions such as MongoDB have gained popularity over traditional storage solutions over the last couple of years. However, trends can also be identified in newer fields such as virtualisation.
Strategies describe how agents react to their surroundings and pursue goals. They include deliberate choice but also patterns of response that pursue goals with little or no deliberation. An actor's assessment of the success of its own actions, and of the actions and success of other actors, influences how strategies change. Processes of reproduction and copying play an important role in the context of strategies. In the context of the data economy, strategies of actors are closely related to their business models.
Actors focusing on data acquisition typically address very specific data in terms of content and their value propositions emphasise the access to this data. They publish raw data or interpretations of data, provide better access to or search engines for data or run platforms for data exchange. Demand-oriented pricing strategies are typically used for revenue generation, where the fee is sometimes linked to specific indicators. It also happens that data holders must pay actors to make their data available.
Actors using data manipulation as the core of their business model typically provide technologies or services for generating, analysing, visualising, managing or enriching data. In terms of value proposition, they predominantly stress the performance, design or usability of their offers. With respect to revenue generation, subscription fees are typically charged. Implementing a premium model is quite common, while some actors implement a freemium model.
Actors focusing on data exploitation do not only manipulate third-party data or provide technologies to do so but also exploit data themselves. They typically use data to create new products and services, improve existing ones, add data to non-data products or produce market analyses, surveys, plans and reports. Most actors charge their customers subscription fees. Actors usually highlight the newness of their offers, their experience in the field or performance aspects in their value proposition.
Actors using technology provision as the basis of their business model typically provide technology-based services (e.g., a cloud-based analytics service) or technologies (e.g., a NoSQL storage solution). For technology-based services, subscription fees are typically charged, while in the case of technologies the products themselves are often made available free of charge and fee-based services are offered around them. Nevertheless, there are also quite a few technologies or tools that need to be licensed or bought.
Actors focusing on consultation typically advise on how to benefit from data, how to build a successful data-based business model or how to use data technology. They also offer data-related training or courses and data skill management. The value propositions usually focus on the experience of the respective actor. Consulting companies typically charge a fee for their services, the amount of which is usually negotiated individually.
An important quality criterion of data is its timeliness (in the big data context: volatility). It describes the length of time until the data is available to users. If data is outdated, it can still be used for historical analysis but not for decision making. Timeliness is affected by how quickly the ICT system updates the data after an event has happened and at which interval it makes updates. Attaching a “best before date” or a best before condition to a data set would help data users to assess the quality of the data.
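The “best before date” idea can be sketched in a few lines of Python. The class name DataSet, its fields and the one-hour validity period are assumptions made for illustration; the text does not prescribe how such a date would be represented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DataSet:
    """Hypothetical data set carrying a 'best before' annotation."""
    name: str
    last_updated: datetime   # when the ICT system last refreshed the data
    shelf_life: timedelta    # assumed validity period after an update

    @property
    def best_before(self) -> datetime:
        return self.last_updated + self.shelf_life

    def is_fit_for_decisions(self, now: datetime) -> bool:
        # Outdated data may still serve historical analysis,
        # but should no longer feed decision making.
        return now <= self.best_before


# Illustrative values only.
traffic = DataSet(
    name="traffic-counts",
    last_updated=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    shelf_life=timedelta(hours=1),
)

half_hour_later = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
two_hours_later = datetime(2024, 1, 1, 14, 0, tzinfo=timezone.utc)

assert traffic.is_fit_for_decisions(half_hour_later)
assert not traffic.is_fit_for_decisions(two_hours_later)
```

A best before condition, as opposed to a date, could be modelled by replacing the time comparison with any other predicate over the data set's metadata.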
The use and reuse of data could be increased if certifications gave data consumers guarantees regarding the quality of the data, which could lead to trust in the sources and platforms, the data suppliers and the quality of the data itself. Research organisations need to meet several ISO standards regarding data management and security, but other organisations do not. A certified product label that establishes trust in a company or its data-based product could supplement such standards.
The lack of trust in big data technologies and data reuse is often linked to data protection and privacy related fears. Making all citizens aware of privacy and security rules would fill this gap but is not realistic. However, making companies that earn money with data-based services responsible for security and privacy might be a working solution. A seal of quality could offer a guarantee for privacy and protection of data. This may, however, increase the hesitation of companies to share data.
As data may be combined from different sources and may be the basis for decisions, it is important to provide detailed auditing functionality that allows companies to log where the data originally came from and when, how and by whom it has been altered. Otherwise, data cannot be seen as a trustworthy source of information. Electronic signatures or blockchains may be promising approaches to achieve this.
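One simple way to make such an audit log tamper-evident is a hash chain, in which every entry stores the hash of its predecessor. The sketch below is a deliberate simplification and neither an electronic signature scheme nor a full blockchain; the function names and entry fields are assumptions chosen for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_entry(log: list, source: str, actor: str, action: str, timestamp: str) -> None:
    """Append a provenance entry that is linked to the previous one."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {
        "source": source,        # where the data originally came from
        "actor": actor,          # by whom it was altered
        "action": action,        # how it was altered
        "timestamp": timestamp,  # when it was altered
        "prev_hash": prev_hash,
    }
    log.append({**body, "hash": _digest(body)})


def verify(log: list) -> bool:
    """Recompute every hash and check the links between entries."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True


# Illustrative entries only.
audit_log = []
append_entry(audit_log, "city-open-data", "alice", "imported", "2024-01-01T12:00:00Z")
append_entry(audit_log, "city-open-data", "bob", "aggregated by district", "2024-01-02T09:30:00Z")
assert verify(audit_log)
```

Changing any field of an earlier entry makes verify return False. To detect the replacement of the whole log, the latest hash would additionally have to be signed or stored with an independent party.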
There are different types of ICT infrastructures, databases and interfaces that have been developed over time. Standards regarding technologies and processes of big data have not been fixed yet, which has resulted in partially competing standards. Concrete standards need to be promoted more clearly at EU level to ensure the free flow of data (the end of roaming fees was the first step). In addition, making data available in unfavourable formats, e.g. PDF, carries the risk that the data cannot be combined and reused by others.
Cloud storage and cloud computing services guarantee a certain service quality when service-level agreements (SLAs) are signed, which identify responsibilities, rights and obligations between the service provider and the user. In recent years, service-based approaches such as infrastructure as a service or software as a service have become more and more popular. This allows companies to consume external ICT infrastructure and reduces the need to create and maintain their own ICT infrastructure.
Using third-party services might lead to dependency. Dependency in terms of availability can be partly solved with SLAs, which describe a set of guarantees related to a service. SLAs for data-driven services are not yet common and need to be developed. Switching vendors usually causes high switching costs for a customer, which might result in vendor lock-in. This might be critical in the data economy, as very specific data might be offered by only a few holders or even a single one.
Agents tend to benefit from equivalency of transmission speed and Internet access structure all over Europe.
The high-speed broadband Internet network is a fundamental technical requirement of the data economy. The available average speed and guaranteed broadband connection in Europe depend on national efforts, which are influenced by regulations and straightforward guidelines of the EC. Although this issue seems to be losing its importance, the new fibre generation might make it relevant again.
Agents tend to benefit from a reduction in the technological gap between SMEs and large enterprises.
The data economy is heterogeneous. Big companies distinguish themselves by having large financial, human and material capacities, which give them a strategic advantage. SMEs, by contrast, often lack those resources and the means to keep up with the latest technology trends. Besides governmental financial subsidies, the spread of cloud services might compensate for this, as they allow SMEs to use the latest technologies without hiring expensive consultants or expert staff.
The costs and risks for data users associated with using third-party data (e.g., due to lack of data provenance) as well as lack of data reuse culture and difficulties with respect to determining a fair price for data are considered as major hindering factors to data reuse. Data users tend to use data only if it is available for free. The amount of freely available data changes the roles and forces in the market and affects the development of data-based applications and the data economy itself.
Value propositions of data economy actors, particularly of data processors, data users and data holders, frequently include terms such as faster, better or more. Without doubt, performance is important in the context of big data. However, it is only one of several important factors and does not seem to be among the key challenges to be overcome. Additionally, it does not seem to be a good idea to build a lasting impression on the same key element that most competitors use.
Without doubt, there are hundreds of areas affected by data. In Europe, among for-profit organisations, there is a significant concentration of applications focusing on marketing-related data reuse. Marketing seems to be an area in which data reuse is not only highly relevant in theory but is also already widespread in practice, among data users as well as data processors and holders. Multi-channel marketing, customer targeting and market analysis are typical examples.
Purchase decisions are typically made through an assessment of a myriad of factors balancing perceptions of value components against price in a subtle, complex, and often sub-conscious decision matrix. This applies equally to data processors, users, holders and distributors. In the data economy, customer-centric pricing leads, for instance, to the implementation of the premium model as well as to dual pricing, both of which are used because of the difficulty of determining a fair price for data.
Trust should not be underestimated as a social factor affecting data reuse. Certifications, stamps or seals might promote the trustworthiness of data. In Europe, the French DPA Commission has established a procedure for certifying industrial processes to make it easier to access and use data without impinging on civil liberties. Such a certification system is supposed to help data processors, users and distributors to achieve a common understanding of the data and thereby foster trust.
Finding the desired data is often a challenging task, particularly if a data user needs very special data. Data distributors reduce the effort for data users and holders by building a bridge between them and increasing the likelihood that they come together and exchange data. Data distribution platforms, which may be chargeable, provide access to different sets of data. Most data distributors are based outside of Europe and focus on very specific types of data in terms of content and format.
The focus of actors on national markets is the result of the lacking integration of the European market, despite all the serious efforts made by EU policy makers. There are many differences between EU Member States that affect the general opportunities and dangers relevant for the day-to-day operations of data economy actors. Legal, socio-economic and technological differences increase the costs and complexity for an actor that is active in more than one European national market.