Data Cleaning Endpoints
The Data Cleaning Suite provides a set of endpoints to:
- Authenticate
- Create a Job
- Upload a File
- Update Mappings & Enrichments
- Retrieve the Enriched File
Below is a flow diagram that outlines how to use these endpoints effectively.
Flow Diagram
1. Authenticate
Before using any of the endpoints, you must authenticate. This ensures you have the necessary permissions to access the data.
Example Request
POST /authenticate
2. Create A Job
This endpoint creates a data cleaning job, which acts as a container for the file and subsequent actions. Each job is uniquely identified by an id.
Example Request
POST /dataCleaning/jobs
Example requestBody
{
"name": "Data Cleaning Job 03-10-20xx"
}
Example Response
{
"id": "f31c786a-1fa8-44d2-8193-c61d77ca2acd",
"name": "Testing From Technical Author",
"createdAt": "2025-02-07T14:07:10.8766667",
"modifiedAt": "2025-02-07T14:07:10.8766667",
"managingUserId": 123456789,
"managingCustomerId": 987654321,
"owningCustomerId": 987654321,
"owningUserId": 123456789,
"status": "created",
"source": "dataCleaning",
"archived": false
}
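As a minimal sketch of this step, the request could be assembled with Python's standard library as shown here. The base URL is a hypothetical placeholder, and sending the token from step 1 as a bearer token is an assumption; neither is confirmed by this guide. The request is only built, not sent.

```python
import json
import urllib.request

# Hypothetical placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_create_job_request(name: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST /dataCleaning/jobs request."""
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/dataCleaning/jobs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the token from step 1 is passed as a bearer token.
            "Authorization": f"Bearer {token}",
        },
    )


def extract_job_id(response_text: str) -> str:
    """Read the job id from a create-job response body."""
    return json.loads(response_text)["id"]
```

The id returned by extract_job_id is the value passed in the path of every later call for this job.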
3. Upload A Job File
This endpoint uploads the file to be processed. The id from the job creation step must be passed in the path to associate the file with the job.
The file must be sent as form-data, and you must specify whether the file includes a header using the hasHeader property.
Example Request
POST /dataCleaning/jobs/{id}/upload
Example Response
{
"correlationId": "2a7b5537-3950-4903-810d-9814c91d5564",
"id": "9e824d9e-0e77-43ef-1f10-08dd712f5830",
"sourceFilename": "test-file-input.csv",
"hasHeader": true,
"createdAt": "2025-04-03T12:24:32.2266667",
"modifiedAt": "2025-04-03T12:24:32.2266667",
"managingUserId": 123456789,
"managingCustomerId": 987654321,
"status": "uploaded",
"active": true
}
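The form-data body for the upload can be sketched by hand with the standard library, as below. The hasHeader field name comes from this guide; the part name "file" and the text/csv content type are assumptions made for illustration.

```python
import uuid


def encode_upload_form(filename: str, file_bytes: bytes, has_header: bool):
    """Return (content_type, body) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    parts = [
        f"--{boundary}".encode(),
        b'Content-Disposition: form-data; name="hasHeader"',
        b"",
        b"true" if has_header else b"false",
        f"--{boundary}".encode(),
        # Assumption: the file part is named "file".
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: text/csv",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    # Multipart bodies use CRLF line endings between every element.
    body = b"\r\n".join(parts)
    return f"multipart/form-data; boundary={boundary}", body
```

The returned content type would be sent as the Content-Type header of the POST to /dataCleaning/jobs/{id}/upload, with the returned bytes as the request body.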
4. Update Mappings
This endpoint maps the columns of the uploaded file to the required fields for matching.
Use the available ENUMs (column headers) to match your file's columns to the Creditsafe database.
NOTE The first column starts at position '0'.
Example Request
PUT /dataCleaning/jobs/{id}/mappings
Example requestBody
[
{
"mapping": "companyId",
"value": "0"
},
{
"mapping": "orgNumber",
"value": "1"
},
{
"mapping": "name",
"value": "2"
}
]
NOTE Ensure your column headers match the available ENUMs as closely as possible. This forms the basis of the matching process.
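One way to produce the mappings body is to derive each zero-based position from the file's header row. The sketch below assumes a lookup table from your own column headers to the mapping names; the header strings in the usage example are hypothetical, while companyId, orgNumber, and name are taken from the example above.

```python
def build_mappings(header_row, field_for_header):
    """Return the mappings request body for the columns we recognise."""
    mappings = []
    for position, header in enumerate(header_row):  # positions start at 0
        field = field_for_header.get(header.strip())
        if field is not None:
            # The example body sends the position as a string.
            mappings.append({"mapping": field, "value": str(position)})
    return mappings


# Hypothetical headers from an uploaded file.
headers = ["Company ID", "Org Number", "Company Name", "Internal Notes"]
lookup = {
    "Company ID": "companyId",
    "Org Number": "orgNumber",
    "Company Name": "name",
}
body = build_mappings(headers, lookup)
```

Columns with no entry in the lookup table, such as the hypothetical "Internal Notes" column, are simply left unmapped.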
5. Submit Job
This endpoint submits the file for matching against the Creditsafe database. The job id must be passed in the path, and an empty request body is required.
Example Request
POST /dataCleaning/jobs/{id}/submit
6. Return Job By Id Number
This endpoint retrieves the current status of the job. It can be called periodically to track progress. The job must reach the jobMatchingComplete status before proceeding to enrichment.
Example Request
GET /dataCleaning/jobs/{id}
Example Response
{
"id": "f31c786a-1fa8-44d2-8193-c61d77ca2acd",
"name": "Testing From Technical Author",
"createdAt": "2019-08-24T14:15:22Z",
"modifiedAt": "2019-08-24T14:15:22Z",
"managingUserId": 123456789,
"managingCustomerId": 987654321,
"owningCustomerId": 987654321,
"owningUserId": 123456789,
"status": "jobMatchingComplete",
"source": "dataCleaning",
"jobSummary": {
"totalRows": 0,
"matched": 20,
"manualMatched": 0,
"unmatched": 0,
"duplicates": 0
},
"archived": true
}
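The periodic status check can be sketched as a small polling loop. Here fetch_job is a hypothetical callable that performs GET /dataCleaning/jobs/{id} and returns the parsed response as a dict; the interval and timeout defaults are arbitrary assumptions, and the sketch does not handle failure statuses because this guide does not list them.

```python
import time


def wait_for_status(fetch_job, target_status, interval_s=5.0, timeout_s=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job() until the job reports target_status, or time out."""
    deadline = clock() + timeout_s
    while True:
        job = fetch_job()
        if job.get("status") == target_status:
            return job
        if clock() >= deadline:
            raise TimeoutError(
                f"job did not reach {target_status!r} within {timeout_s} seconds"
            )
        sleep(interval_s)
```

Because the target status is a parameter, the same helper could wait for jobMatchingComplete here and for enrichmentComplete after enrichment has been started.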
7. Update Enrichments
This endpoint applies the desired enrichment type to the matched data. Enrichment types include:
- basic
- basicPlus
- standard
Example Request
PUT /dataCleaning/jobs/{id}/enrichments
Example requestBody
You can remove properties that are not required for the enrichment credit type. You cannot add tags beyond the maximum allowed for that credit type.
{
"enrichments": [
{
"enrichment": "general.safeNumber"
},
{
"enrichment": "general.connectId"
},
{
"enrichment": "general.ggsId"
},
{
"enrichment": "general.companyName"
}
]
}
NOTE Refer to the API documentation for the full list of allowable enrichments for each type.
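The enrichments body has a simple, regular shape, so it can be generated from a plain list of tag names. The helper below is an illustrative sketch, not part of the API; it also drops duplicate tags, which is a convenience assumed here rather than a documented requirement.

```python
def build_enrichments_body(tags):
    """Wrap a list of enrichment tag names in the request body shape."""
    seen = set()
    enrichments = []
    for tag in tags:
        if tag not in seen:  # skip duplicates while keeping the original order
            seen.add(tag)
            enrichments.append({"enrichment": tag})
    return {"enrichments": enrichments}


enrichments_body = build_enrichments_body([
    "general.safeNumber",
    "general.connectId",
    "general.ggsId",
    "general.companyName",
])
```

Removing a property you do not need is then a matter of leaving its tag out of the list.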
8. Start Enrichment
This endpoint submits the request to enrich the matched data. The job id must be passed in the path, and an empty request body is required.
Example Request
POST /dataCleaning/jobs/{id}/enrich
9. Return Job By Id Number
This endpoint checks the status of the enrichment request and can be called repeatedly for periodic checks.
The next endpoint (Return Enriched Job File) cannot be used until the enrichment process reaches a status of enrichmentComplete.
The data cleaning job id must be passed in the path.
Example Request
GET /dataCleaning/jobs/{id}
Example Response
{
"id": "f31c786a-1fa8-44d2-8193-c61d77ca2acd",
"name": "Testing From Technical Author",
"createdAt": "2019-08-24T14:15:22Z",
"modifiedAt": "2019-08-24T14:15:22Z",
"managingUserId": 123456789,
"managingCustomerId": 987654321,
"owningCustomerId": 987654321,
"owningUserId": 123456789,
"status": "enrichmentComplete",
"countryCode": "GB",
"portfolioId": "string",
"source": "dataCleaning",
"jobSummary": {
"totalRows": 0,
"matched": 20,
"manualMatched": 0,
"unmatched": 0,
"duplicates": 0
},
"jobEnrichmentSettings": {
"creditType": "basic"
},
"archived": true
}
10. Return Enriched Job File
This endpoint retrieves the completed, enriched file. The job id must be passed in the path.
By default, the response is a .csv file, but if the file contains fewer than 300,000 rows, it can also be returned as .xlsx.
Example Request
GET /dataCleaning/jobs/{id}/enrichedFile
Example Response
{
"correlationId": "string",
"filePath": "string"
}
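The row-count rule for the output format can be expressed as a small check. This sketch only encodes the rule stated above; how a format is actually requested from the endpoint is not covered in this guide, so the function just reports which formats would be available.

```python
XLSX_ROW_LIMIT = 300_000  # .xlsx is available only below this row count


def available_formats(total_rows: int) -> list:
    """List the file formats the enriched file can be returned in."""
    formats = ["csv"]  # .csv is the default and is always available
    if total_rows < XLSX_ROW_LIMIT:
        formats.append("xlsx")
    return formats
```

For example, a file of 299,999 rows could be returned in either format, while a file of exactly 300,000 rows would be .csv only.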