Introduction
This documentation is meant for developers who want to integrate with the JoinData platform. If you want to know more about our mission and how we view data governance, you can read our Position Paper Data Governance.
If you are migrating from older APIs of the JoinData platform, the section on conceptual changes at the end of this documentation explains what has changed.
Concepts
Before we dive into the details, it is good to define some concepts that are omnipresent in the JoinData platform.
Companies that integrate with our platform can have one or more roles. Those roles are farmer, application provider or data source provider. If you want to consume data, you are an application provider. If you provide data, you are a data source provider. This documentation is meant for application providers or data source providers.
If you are a solution provider (building software for some third party), your customer will have one or more of these roles. In this documentation, we assume that you act on behalf of your customer.
Mandates
JoinData’s full service is aimed at transporting farmers’ data from a data source provider to an application provider. Such a configuration of data transport is called a data flow. The transport can only happen if the appropriate parties give their consent. This consent is called a mandate.
A mandate involves three parties. The authorizing company (typically the farmer) is the party whose data the data flow concerns. The providing company (typically JoinData representing all connected data sources, or a specific data source provider) is the party storing the data on the authorizing company’s behalf. The authorized company is the party receiving that data (in most cases the application provider). A mandate further contains a data set for which the consent is given. A data set is a collection of data types that are functionally related. You can find the definitions of our data sets in our Developer Documentation: Data Sets. A mandate contains more, such as restrictions and meta-data, but the companies involved and the data sets are the most important for now. For more detail, see the section on JoinData’s Architecture.
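As a minimal sketch of the structure described above, a mandate could be modelled as follows. The class name, the field names and the example identifiers are illustrative assumptions, not the Purpose Registry's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Mandate:
    """Illustrative model of a mandate; all field names are assumptions."""
    authorizing_company: str  # whose data the flow concerns (typically the farmer)
    providing_company: str    # stores the data on the authorizing company's behalf
    authorized_company: str   # receives the data (typically the application provider)
    data_set: str             # the data set for which consent is given
    restrictions: dict = field(default_factory=dict)  # e.g. date ranges, locations


# Hypothetical identifiers, for illustration only.
mandate = Mandate(
    authorizing_company="farm-123",
    providing_company="joindata",
    authorized_company="app-provider-456",
    data_set="example-data-set",
)
```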
Typically, you group together all required mandates into a purpose (see below for the Purpose Registry). You can ask for very specific or for very broad data sets, depending on your use case. You can also ask for specific data sources or ask simply for all available data sources. Read more on this in Developer Documentation: Using the Purpose Registry.
Data Categories
Data transported is divided into 4 categories. Each category requires different parties to give consent.
Data Category | Consent required | Description |
Raw data | n/a | Raw data is data as directly produced by a device, not yet interpreted. E.g. a raw electrical signal. Typically, this data is not useful for a farmer. This type of data is not transported by JoinData. |
Free data | Farmer | Free data is data that can be freely used by a farmer. It is data specifically related to a farm, or at least a farm is identifiable in that data. As such, the farmer needs to give consent. |
Licensed data | Farmer, Data Source | Licensed data is data that is related to a farm. So, a farmer needs to give consent. However, there may be some quality constraints, unpaid IP, or other reasons why the data source requires some control over this data. That is why licensed data also requires a mandate between the data source and the application provider. |
Aggregated data | Data Source | Aggregated data is data that is no longer related to a single, specific farm. Since a farm is no longer identifiable, only the data source needs to give consent. |
Note that each time data is transformed, a new data type is created. And that data type may be of a different data category. So, a farmer may give consent to have his free weather station data shared with an app provider. That app provider may transform that data into a weather forecast for a broader area, and offer that data as a data source provider as aggregated data, as long as this fits within the purpose binding for which the app provider has requested the original weather station data.
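The relation between data category and required consent in the table above can be sketched as a simple lookup. The names DataCategory, CONSENT_REQUIRED and missing_consent are illustrative and not part of any JoinData API.

```python
from enum import Enum


class DataCategory(Enum):
    RAW = "raw"
    FREE = "free"
    LICENSED = "licensed"
    AGGREGATED = "aggregated"


# Parties that must give consent, following the table above.
# Raw data is not transported by JoinData, so no consent applies.
CONSENT_REQUIRED = {
    DataCategory.RAW: frozenset(),
    DataCategory.FREE: frozenset({"farmer"}),
    DataCategory.LICENSED: frozenset({"farmer", "data source"}),
    DataCategory.AGGREGATED: frozenset({"data source"}),
}


def missing_consent(category, consents_given):
    """Return the parties whose consent is still missing for this category."""
    return CONSENT_REQUIRED[category] - set(consents_given)


# Licensed data with only the farmer's consent still needs the data source.
still_needed = missing_consent(DataCategory.LICENSED, {"farmer"})
```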
Services
The JoinData platform offers a number of services.
Integrating with JoinData’s platform is not an all-or-nothing decision. This allows you to keep existing implementations as you want, or to use a phased migration to JoinData’s services.
A data flow requires a mandate, but you can store mandates without having the data flow via JoinData.
We will briefly discuss the following services:
- The Identification and Authentication Service (for authenticating farmers and applications);
- The Data Hub that allows for secured data exchange;
- The Purpose Registry for storing purposes and mandates as well as onboarding farmer consent;
- The Company Mapping Service that offers a transparent way to store and validate farmer identifiers;
- The Source Registry for administering which data sources exist.
Our latest API documentation can be found online; links are provided in this documentation. The API documentation often also describes APIs that are not yet available. “API’s Planned” are APIs that are currently in active development. “API’s Proposed” are under discussion. These descriptions are provided so that you can plan accordingly, but as they are not in production, things may change.
Identification and Authentication Service
All our services require a security token to use. Our Identification and Authentication Service must be used to retrieve a token.
We use the widely adopted OAuth2 protocol with OpenID Connect for that. Your application needs a client account to be able to access our APIs. During onboarding you will receive accounts for our integration environment. When you are ready to go to production, you will receive the production account.
This account is typically an authorization code client, which allows you to call our APIs on behalf of a farmer. Using the OAuth2 authorization code flow, your application lets a farmer log in and retrieves a token specifically for your application. Using that token you can then call our APIs. We use an off-the-shelf identity broker that we have integrated with, among others, the Dutch eHerkenning. eHerkenning is commonly used by farmers (and other entrepreneurs) in the Netherlands. This gives us (and optionally our customers) a way to authenticate natural persons and to see whether they are authorised to make (Data Hub) decisions for a company (KVK).
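The first step of the authorization code flow, redirecting the farmer to the login page, can be sketched as below. The query parameters are the standard OAuth2 and OpenID Connect ones; the endpoint URL, client ID and redirect URI are placeholders, since the real values are issued during onboarding.

```python
import secrets
from urllib.parse import urlencode

# Placeholder values: the real endpoint and client ID come from your onboarding.
AUTHORIZATION_ENDPOINT = "https://auth.example.invalid/authorize"
CLIENT_ID = "my-application"
REDIRECT_URI = "https://app.example.invalid/callback"


def build_authorization_url(state):
    """Build the URL to which the farmer's browser is redirected to log in."""
    params = {
        "response_type": "code",   # selects the authorization code flow
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid",         # required scope for OpenID Connect
        "state": state,            # echoed back; protects against CSRF
    }
    return AUTHORIZATION_ENDPOINT + "?" + urlencode(params)


state = secrets.token_urlsafe(16)
login_url = build_authorization_url(state)
```

After the farmer logs in, the identity provider redirects back to the redirect URI with a code, which your application exchanges for a token at the token endpoint.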
Other authentication mechanisms, such as the eIDAS family of national (European) authentication mechanisms, will become available soon.
If your application does not interact with a farmer, e.g. because you are building a backend system that aggregates over many farms, you may ask for service-client-credentials. These allow for server-to-server communication without interaction with the farmer.
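A server-to-server token request using the standard OAuth2 client credentials grant could look like the sketch below. It only prepares the request and does not send it. The token endpoint and credentials are placeholders, and passing the client credentials via HTTP Basic authentication is an assumption; check the API documentation for the method JoinData expects.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

TOKEN_ENDPOINT = "https://auth.example.invalid/token"  # placeholder


def build_token_request(client_id, client_secret):
    """Prepare (but do not send) an OAuth2 client credentials token request."""
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    body = urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return Request(TOKEN_ENDPOINT, data=body, headers=headers, method="POST")


# Hypothetical credentials, for illustration only.
request = build_token_request("my-backend", "my-secret")
# Sending it with urllib.request.urlopen(request) would return a JSON body
# containing the access token, if the endpoint and credentials were real.
```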
You can also use our authentication service to authenticate farmers for your own application. If you need a way to log farmers in, you can use our authentication service as a trusted source.
Data Hub
The Data Hub offers applications and data sources a way to exchange information based on standards, with the guarantee that the proper mandates are in place. Thus, data can only be exchanged via the Data Hub if the proper parties give consent for that exchange. The mandates (which represent the consent) are stored (along with their purpose) in the Purpose Registry (see below).
The typical interaction is a pull-pull system, whereby the application pulls information from the Data Hub, and the Data Hub pulls it directly from a source. To do so, the Data Hub performs the necessary authentication of the application and the appropriate mandate checks. Other interactions may also be possible (e.g. data sources push to a temporary storage from where an application can pull, and a full push channel for time-sensitive notifications and infrequent updates).
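The pull-pull interaction can be illustrated with a toy in-memory model: the application pulls from the hub, and the hub pulls from the source only after the mandate check passes. The classes and identifiers below are hypothetical stand-ins, not the Data Hub's real interface, and authentication is left out.

```python
class DataSource:
    """Stand-in for a data source that the Data Hub pulls from."""

    def __init__(self, records):
        self._records = records

    def pull(self, farm_id):
        return self._records.get(farm_id, [])


class DataHub:
    """Toy broker: checks the mandate first, then pulls from the source."""

    def __init__(self, source, mandates):
        self._source = source
        self._mandates = mandates  # set of (farm_id, application_id) pairs

    def pull(self, application_id, farm_id):
        if (farm_id, application_id) not in self._mandates:
            raise PermissionError("no mandate for this data flow")
        return self._source.pull(farm_id)


source = DataSource({"farm-1": [{"milk_litres": 1200}]})
hub = DataHub(source, mandates={("farm-1", "app-1")})
records = hub.pull("app-1", "farm-1")  # allowed: a mandate exists
```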
The latest API documentation can be found here: https://integration.join-data.net/api/docs#broker-api. Older protocols and messages are also available; please contact JoinData if you are interested in those.
Purpose Registry
The Purpose Registry is a database where we store mandates of parties, along with the reason why that mandate is given (the purpose).
This database is accessible via a REST API: you can create, change or remove mandates, ask parties for consent and check whether a mandate exists for a specific data flow. Your own system can be notified of changes to mandates via webhooks.
If you have data flows that do not make use of the Data Hub, for example if you retrieve farmers’ data from or share farmers’ data with third parties directly, you probably need to verify whether the farmer has given consent. You can store those mandates in our Purpose Registry and use the APIs to check the existence of consent. (You may need an extension to your license to store those mandates.) This way, you can still provide farmers with a centralised view of all their mandates and you can benefit from the onboarding flow of JoinData.
The latest API documentation can be found here: https://integration.join-data.net/api/docs#purpose-api.
To ask a farmer for consent, we have a mobile and desktop onboarding process. This process offers an easy and safe way to onboard farmers and get consent (mandates) for the exchange of data. This service makes use of our Identification and Authentication Service. Mandates can take many forms, depending on the parties involved in that mandate. The onboarding page can be embedded in your mobile app or in your website using a simple JavaScript client. See Developer Documentation: Using the Onboarding Client for more information.
Some of our customers have an old database of mandates, often containing a digital twin of a paper contract. If you have such a database and you want to digitalise it with a strong audit trail, you can still import those mandates into the Purpose Registry. If there is no digital audit trail for those mandates, they will be imported at a low trust level. A farmer can then reaffirm the mandate via our service, creating a strong digital audit trail.
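The import and reaffirm steps can be sketched as follows. The trust level names "low" and "high", and the assumption that a mandate with a digital audit trail is imported at the higher level, are illustrative; the text above only states that mandates without a digital audit trail start at a low trust level.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StoredMandate:
    farm_id: str
    data_set: str
    trust_level: str  # "low" or "high"; the level names are assumptions


def import_legacy_mandate(farm_id, data_set, has_digital_audit_trail):
    """Import an existing mandate; without an audit trail it starts at low trust."""
    level = "high" if has_digital_audit_trail else "low"
    return StoredMandate(farm_id, data_set, level)


def reaffirm(mandate):
    """The farmer confirms the mandate digitally, which raises the trust level."""
    return replace(mandate, trust_level="high")


# Hypothetical identifiers, for illustration only.
imported = import_legacy_mandate("farm-1", "example-data-set", has_digital_audit_trail=False)
confirmed = reaffirm(imported)
```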
Company Mapping Service
We have a company mapping registry which aids in resolving the different identifiers used to point to devices and farmers. It offers a way to store, validate and retrieve which identifiers belong to which companies. Each party involved in data exchanges can store its own references to its customers. Again, trust levels are used to make transparent how strong those proofs are. A typical use case is to find the legal entity (company) behind e.g. a tank ID, customer number or location, to see whether there is a signed mandate for exchanging data for that tank, customer or location.
Data flows that make use of the Data Hub use the Company Mapping Service to validate (and when necessary, transform) identifiers used in the meta-data of some messages.
If you have data exchanges outside of the Data Hub, you can use the Company Mapping Service to link different sources together. For example, one source may use tank IDs to refer to a farm while another uses an internal customer number. The Company Mapping Service may provide you with the information linking these identifiers together.
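Linking identifiers from different sources via the company they belong to can be sketched with a small in-memory registry. The class, the scheme names ("tank-id", "customer-number") and the identifiers are hypothetical; the real service is accessed through its API.

```python
class CompanyMapping:
    """Toy registry linking (scheme, value) identifiers to a company."""

    def __init__(self):
        self._company_by_identifier = {}

    def store(self, scheme, value, company_id):
        self._company_by_identifier[(scheme, value)] = company_id

    def company_for(self, scheme, value):
        return self._company_by_identifier.get((scheme, value))

    def identifiers_for(self, company_id, scheme):
        return sorted(
            value
            for (s, value), company in self._company_by_identifier.items()
            if s == scheme and company == company_id
        )


def translate(mapping, from_scheme, value, to_scheme):
    """Translate an identifier from one scheme to another via the company."""
    company = mapping.company_for(from_scheme, value)
    if company is None:
        return []
    return mapping.identifiers_for(company, to_scheme)


mapping = CompanyMapping()
mapping.store("tank-id", "T-001", "company-A")
mapping.store("customer-number", "C-42", "company-A")
customer_numbers = translate(mapping, "tank-id", "T-001", "customer-number")
```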
If you want to read mappings other than your own, you need consent of the farmer.
For more information, see: https://companymapping-api.st.vaa.com/index.html or read the Developer Documentation: Using the Company Mapping Service.
Source Registry Service
In the Source Registry, we keep track of which data sources are available where. It also contains the dictionary for the data sets. You can automatically register new data sources at our platform. You can also query the available data sets here.
For more information, see https://integration.join-data.net/api/docs#sources-api or read the Developer Documentation: Using the Sources API.
JoinData’s architecture
Entity Relation Diagram
In the diagram below you can find the main entities as used in JoinData. This gives an overview of the different concepts, their relations and which service governs those entities.
A mandate describes consent to share data. A mandate involves three parties. The authorizing company (typically the farmer) is the party whose data the data flow concerns. The providing company (typically the data source provider) is the party storing the data on the authorizing company’s behalf. The authorized company is the party receiving that data (in most cases the application provider). A mandate further contains a data set for which the consent is given. A data set is a collection of data types that are functionally related. You can find the definitions of our data sets in our Data Catalog. Each data set has a data category: free, licensed or aggregated. The data category indicates which parties need to give consent for that data flow (the farmer, the data source provider or both). E.g. for licensed data, a second mandate is required (with the data source provider in the role of both authorizing company and providing company). A mandate can also contain further restrictions on date ranges, locations etc. A location is typically a farm, stable or other physical location owned by the authorizing company.
Data source providers are the parties who have data to produce. Typically they are the providing company in a mandate. They can be configured to have one or more data sources; this can be a cloud environment where we can collect data for multiple locations, or a single sensor or other instance of data. These data sources provide one or more data types, the kind of data they can produce.
To ask for consent (to create a mandate), an authorized company needs to create a purpose. A purpose is the reason why the authorized company needs the data, e.g. for a specific application, a project or for being able to deliver its services. It contains the data sets that the purpose requires. These purposes are labelled with a purpose category so that a farmer can easily recognise them. If a farmer joins the purpose (e.g. starts using the app or joins a project), a participation of that farmer is created. For each data set in the purpose, a mandate is then created.
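The relation between purpose, participation and mandates described above can be sketched as follows. All class and field names are illustrative assumptions, and the providing company and any restrictions are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Purpose:
    name: str
    category: str
    authorized_company: str
    data_sets: list


@dataclass
class Mandate:
    authorizing_company: str
    authorized_company: str
    data_set: str


@dataclass
class Participation:
    farmer: str
    purpose: Purpose
    mandates: list


def join_purpose(farmer, purpose):
    """Create a participation with one mandate per data set in the purpose."""
    mandates = [
        Mandate(farmer, purpose.authorized_company, data_set)
        for data_set in purpose.data_sets
    ]
    return Participation(farmer, purpose, mandates)


# Hypothetical purpose and data set names, for illustration only.
purpose = Purpose("Example app", "example-category", "app-provider-456",
                  ["data-set-a", "data-set-b"])
participation = join_purpose("farm-123", purpose)
```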
API Conventions
errors field | Description |
id | A unique identifier for this particular occurrence of the problem. |
status | The HTTP status code applicable to this problem, expressed as a string value. |
code | An application-specific error code, expressed as a string value. |
title | A short, human-readable summary of the problem that SHOULD NOT change from occurrence to occurrence of the problem, except for purposes of localization. As such, do not depend on it. |
detail | A human-readable explanation specific to this occurrence of the problem. Like title, this field’s value can be localized. |
meta | Additional information that is error specific. |
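A client could read the error fields listed above along the lines of the sketch below. It assumes that errors is a JSON list of objects in the response body, which the table suggests but does not state; the example body and the error code in it are hypothetical.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class ApiError:
    """One entry of the errors list, with the fields from the table above."""
    id: str = ""
    status: str = ""   # HTTP status code as a string, e.g. "403"
    code: str = ""     # application-specific error code
    title: str = ""    # may be localized, so do not branch on it
    detail: str = ""   # may be localized as well
    meta: dict = field(default_factory=dict)


def parse_errors(body):
    """Parse a JSON error body into ApiError objects, ignoring unknown keys."""
    known = {f.name for f in fields(ApiError)}
    entries = json.loads(body).get("errors", [])
    return [ApiError(**{k: v for k, v in e.items() if k in known}) for e in entries]


# Hypothetical response body, for illustration only.
body = '{"errors": [{"id": "a1", "status": "403", "code": "EXAMPLE", "title": "Forbidden"}]}'
errors = parse_errors(body)
is_forbidden = any(e.status == "403" for e in errors)
```

Branching on status or code rather than on title or detail keeps the client robust against localized messages.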
Conceptual Changes
Whenever the Data Hub receives a query, it needs to know whether that query is allowed and which source may have data for it. The Purpose Registry checks whether a query is allowed; the Source Registry determines who may have the answer.
In the early days of JoinData, mandates were more like configured routes: a mandate allowed the transport of a single data type from a single source to a single target. It represented a ‘paper contract’ somewhere in the ecosystem. It also contained any identifiers that were required to correlate the transported data between source and target.
With the arrival of the Purpose Registry, mandates are fully digital and more flexible: there is no longer a need to specify a mandate for every source that may contain data the target is interested in. Instead, you can configure a mandate for one or more data sets (a collection of data types) without specifying a source. The main advantages of using data sets instead of data types are that mandates can be specified a bit more broadly, so that they are less prone to change (e.g. when new, similar data sets arrive), and that they are less technical and thus more recognisable for a farmer.
Moving the identifiers out of the mandates has the advantage that company changes become easier, because these changes are more local: an application provider no longer needs to know the identifiers used at multiple sources. It also becomes more secure: we can now better manage and track identifiers and who is authorised to change them.
This does mean that whenever a query comes in, the mandate can no longer be relied on to provide all the details. More specifically, two parts are missing:
- identifiers: if a query comes in for a certain location or object, how is that object known at the data source? This is solved by the Company Mapping Service which exists to manage those identifiers.
- availability: which sources may have an answer to that query? This is the Source Registry’s domain.
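How the three services together answer a query can be sketched with plain dictionaries standing in for each registry. The function name, the query keys and all identifiers are hypothetical; the sketch only illustrates the division of responsibilities described above.

```python
def resolve_query(query, mandates, identifier_map, sources_by_data_set):
    """Return (source, source-specific identifier) pairs that may answer a query."""
    key = (query["farm"], query["application"], query["data_set"])
    # Purpose Registry: is the query allowed at all?
    if key not in mandates:
        raise PermissionError("no mandate covers this query")
    targets = []
    # Source Registry: which sources may have this data set?
    for source in sources_by_data_set.get(query["data_set"], []):
        # Company Mapping Service: how is the farm known at that source?
        local_id = identifier_map.get((query["farm"], source))
        if local_id is not None:
            targets.append((source, local_id))
    return targets


mandates = {("farm-1", "app-1", "data-set-a")}
identifier_map = {("farm-1", "source-x"): "T-001"}
sources_by_data_set = {"data-set-a": ["source-x", "source-y"]}
query = {"farm": "farm-1", "application": "app-1", "data_set": "data-set-a"}
targets = resolve_query(query, mandates, identifier_map, sources_by_data_set)
```

Here source-y offers the data set but has no known identifier for the farm, so only source-x is queried.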