Data Capability Maturity Model

The Capability Maturity Model (CMM), applied to data, helps an organisation determine its capability for making data assets FAIR.

  • The method defines five levels of maturity for how an organisation makes and maintains FAIR data.


The phenomenal growth of Life Science data makes it more important than ever to invest in support for the data management life cycle, which is enhanced by making the data FAIR: Findable, Accessible, Interoperable and Reusable [1]. This method applies the Capability Maturity Model (CMM) to the FAIR transformation of data, treating maturity levels as key process steps. It helps an organisation determine the optimal level for FAIR transformation of existing data, or for making data FAIR by design.

The CMM method recognises five levels of maturity for making data FAIR. CMM was originally used to develop and refine the software development process, and was first described in IEEE Software [2] by Watts Humphrey. It has since been evolved and commercialised by Carnegie Mellon University through the CMMI Institute to better integrate with the strategic goals of an organisation [3]. CMM can aid business processes generally and has been used extensively worldwide in government offices, commerce, and industry [4]. The CMMI Institute has applied it to provide the Data Management Maturity (DMM) program under commercial license [5]. Existing maturity assessment models applied to data stewardship have been reviewed as three types of maturity matrix for geographic and climatology data [6]. Most recently, CMM has been adapted by the FAIRplus IMI consortium [7] to improve an organisation’s life science data management process, which is the basis for the method described here.

The FAIR data CMM method identifies 1) important organisational aspects of FAIR data transformation and management, 2) a sequence of levels that form a desired path from an initial state to maturity and 3) a set of maturity indicators for measuring the maturation levels. Implementation of the CMM method is likely to involve major organisational change, which requires commitment from senior leaders. The method enables prioritisation of the different stages of investment towards generation of FAIR data, as shown in Figure 1 below.

Figure 1: The data Capability Maturity Model (Source: Adapted from FAIRplus [7])

How To

CMM applied to FAIR data management determines the maturation steps on a pathway to support Findability, Accessibility, Interoperability and Reusability at an optimal level of granularity for machines and humans.

The maturation steps or stages of FAIR data CMM are as follows:

  1. Initial: Data reuse is not possible outside the project or department that produced the data sets. There are no long-term solutions for data sustainability and access.
  2. Repeatable: Data usage is limited; it requires manual effort and is only possible with the help of experts involved in the project. Domain experts' help is required to interpret the data. Data access is governed mostly by the project owners.
  3. Defined: FAIR data sets can be used by other parties with minimal effort. Organisational and community standards are used, and variations are documented. Data sets can be linked with some mapping effort. Access and sustainability processes are well defined.
  4. Managed (Capable): Data is linked. Metadata follows (canonical) community standards; where it does not, the variations are explicitly documented and mapped. Organisation-wide services are available for searching and accessing FAIR data sets. Machine accessibility and data linking are fully achieved.
  5. Sustainable (or Optimising for efficiency): The data can be reused across organisations and communities with a minimum of effort. FAIR data can be kept FAIR over time. Usage of FAIR data is monitored.
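Because the five stages form an ordered pathway, they can be represented as an ordered enumeration when recording assessments. The following is a minimal sketch only; the names `FairMaturityLevel` and `next_level` are hypothetical and are not part of the FAIRplus model.

```python
from enum import IntEnum


class FairMaturityLevel(IntEnum):
    """The five FAIR data CMM stages, ordered from least to most mature.

    The class and member names are illustrative assumptions.
    """
    INITIAL = 1
    REPEATABLE = 2
    DEFINED = 3
    MANAGED = 4
    SUSTAINABLE = 5


def next_level(current: FairMaturityLevel) -> FairMaturityLevel:
    """Return the next stage on the pathway, or the same stage if already at the top."""
    if current is FairMaturityLevel.SUSTAINABLE:
        return current
    return FairMaturityLevel(current + 1)


if __name__ == "__main__":
    # Stages compare by their numeric order, so "at least Defined" is a simple check.
    level = FairMaturityLevel.REPEATABLE
    print(level >= FairMaturityLevel.DEFINED)  # False
    print(next_level(level).name)              # DEFINED
```

Using an ordered type makes it straightforward to express a statement such as "this data set must reach at least Defined" as a comparison.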

The FAIR Maturity Indicators are first applied to assess the current level of FAIR maturity. Following this, a feasibility analysis is undertaken to take account of the different levels of data granularity: 1) data catalogue, 2) data collection, 3) data set, 4) data content and 5) context for the data. See the Toolkit method on granularity and context for more detail about feasibility analysis. When selecting data for FAIR transformation, the feasibility analysis should consider the scientific and business requirements or questions that the data must satisfy, in order to set the target level of FAIR transformation. This will guide and inform the practical steps to improve the FAIRness of the data as part of a cycle of continuous improvement, as illustrated in Figure 2 below.
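One way to picture the comparison between the assessed maturity and the target maturity is as a gap calculation per granularity level. The sketch below is illustrative only: the function name `maturity_gaps`, the use of integer scores from 1 to 5 for the maturity stages, and the example scores are all assumptions made for the example, not values prescribed by the method.

```python
from typing import Dict

# The five levels of data granularity named in the text.
GRANULARITY_LEVELS = [
    "data catalogue",
    "data collection",
    "data set",
    "data content",
    "context",
]


def maturity_gaps(current: Dict[str, int], target: Dict[str, int]) -> Dict[str, int]:
    """Return, for each granularity level below target, the number of stages still to climb.

    Scores are assumed to be integers 1-5 matching the five maturity stages.
    """
    gaps = {}
    for level in GRANULARITY_LEVELS:
        gap = target[level] - current[level]
        if gap > 0:
            gaps[level] = gap
    return gaps


if __name__ == "__main__":
    # Hypothetical assessment results, invented purely for illustration.
    current = {
        "data catalogue": 3,
        "data collection": 2,
        "data set": 2,
        "data content": 1,
        "context": 1,
    }
    # Hypothetical target: reach "Defined" (stage 3) at every granularity level.
    target = {level: 3 for level in GRANULARITY_LEVELS}
    print(maturity_gaps(current, target))
    # {'data collection': 1, 'data set': 1, 'data content': 2, 'context': 2}
```

Levels that already meet the target are omitted, so the result lists only where improvement effort is still needed in the next cycle.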

Figure 2: The FAIR CMM cycle for continuous improvement (Source: Adapted from FAIRplus [7])

References and Resources

  1. Wilkinson et al. 2016. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data, volume 3, Article number: 160018
  2. Humphrey, WS 1988. “Characterizing the software process: a maturity framework”. IEEE Software. 5 (2): 73–79. DOI: 10.1109/52.2014
  3. CMMI® Institute and CMMI® V2.0
  4. “What is the Capability Maturity Model? (CMM) | Process Maturity | FAQ”. Source:
  5. CMMI® Institute Data Management Maturity (DMM) program
  6. Peng G 2018 The State of Assessing Data Stewardship maturity – An Overview. Data Science Journal, 17: 7, 1-12. DOI: 10.5334/dsj-2018-007
  7. FAIR Capability Maturation Model. Oya Beyan, Susanna Sansone and Carole Goble for the FAIRplus IMI consortium public slides: FAIR Capabilities Maturity Model – public.pdf

At a Glance

Use cases

All of the use cases of this FAIR Toolkit

Related methods
  • An enabler for an organisation to determine its investment in capability to make and maintain FAIR data assets
  • Data stewards
  • Scientists who collect the data
  • Science management
  • Sustained effort to reach optimal maturity of capability for FAIR data management
  • Medium to high

Top Tips

  • Implementation of the CMM method is likely to involve major organisational change, which requires commitment from senior leaders.
  • Capability Maturity Modelling can be used for three purposes:
  1. Descriptive: to assess the current state of FAIR data maturity
  2. Prescriptive: to develop a plan for improvement to a desired future state of FAIR data maturity
  3. Comparative: for benchmark comparisons of FAIR maturity between datasets