Setting up an analysis
Analyze dataset
Allows the client to initiate an analysis of a specified dataset.
Request:
The general format is as follows:
POST /api/public/v1/dataset/:datasetId/analyze
{
outcome: object,
groupingFeature?: string,
modelsConsidered?: string,
featureSelection?: string,
thoroughness: string,
coreCount: number,
maxSignatureSize?: number,
maxVisualizedSignatureCount?: number,
name: string,
}
The datasetId must be the identifier of a dataset attached to a project to which the user
has execute permissions. The outcome specifies both the type of analysis intended and the
dataset feature or features to be predicted by the models that the analysis finds; the
options are described further below. The optional groupingFeature names an Identifier feature
that groups samples which must not be split across training and test datasets during
analysis, e.g. because they are repeated measurements from the same patient.
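To illustrate why a grouping feature matters, the following minimal Python sketch keeps all samples that share a group value on the same side of a train/test split. It is only an illustration of the idea, not the procedure JADBio uses internally, which this document does not describe. The function name grouped_split, the patient field, and the sample values are hypothetical.

```python
import random


def grouped_split(samples, group_key, test_fraction=0.25, seed=0):
    """Split samples into (train, test) without splitting any group."""
    # Work on whole groups, never on individual samples.
    groups = sorted({s[group_key] for s in samples})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [s for s in samples if s[group_key] not in test_groups]
    test = [s for s in samples if s[group_key] in test_groups]
    return train, test


# Hypothetical repeated measurements: P1 and P3 were each measured twice.
samples = [
    {"patient": "P1", "value": 0.9},
    {"patient": "P1", "value": 1.1},
    {"patient": "P2", "value": 2.0},
    {"patient": "P3", "value": 3.2},
    {"patient": "P3", "value": 3.0},
    {"patient": "P4", "value": 4.1},
]
train, test = grouped_split(samples, "patient")
train_ids = {s["patient"] for s in train}
test_ids = {s["patient"] for s in test}
print(sorted(train_ids), sorted(test_ids))
```

No patient appears in both sets, so a model is never evaluated on measurements from a patient it was trained on.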
The optional modelsConsidered field must be either interpretable or all when present.
The default value is all. This parameter controls the types of model considered during
the search for the best one. The interpretable setting restricts the search to models that
are easy to interpret, such as linear models and decision trees. The optional featureSelection
field must be either mostRelevant or mostRelevantOrAll when present. The default value is
mostRelevant. This parameter controls the number of features included in the models considered.
Including only the most relevant ones makes results easier to interpret and faster to
compute, with minimal impact on model performance.
The thoroughness field must be one of preliminary, typical, or extensive. This parameter
is used to reduce or expand the number of analysis configurations attempted in the search
for the best ones; it significantly affects the running time of the analysis. The coreCount
parameter must be a positive integer. It specifies the number of compute cores to use
during the analysis, and must be at most the number of cores currently available to the user.
The optional maxSignatureSize is the maximum number of features used in a model found by
the analysis. When present, it must be a positive integer. When not present, a default
value of 25 is used. The optional maxVisualizedSignatureCount is the maximum number of
signatures that will be prepared for visualization in the user interface. When present, it
must be a positive integer. When not present, a default value of 5 is used.
The name parameter provides the analysis with a human-readable name for future reference.
The name can be at most 120 characters.
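The constraints described above can be checked on the client before a request is sent. The following Python sketch is one possible way to do so; the function build_analysis_body and its helper are hypothetical and not part of any JADBio client library. It cannot check that coreCount is within the number of cores currently available to the user, since only the server knows that.

```python
THOROUGHNESS = {"preliminary", "typical", "extensive"}
ENUM_FIELDS = {
    "modelsConsidered": {"interpretable", "all"},
    "featureSelection": {"mostRelevant", "mostRelevantOrAll"},
}
POSITIVE_INT_FIELDS = {"maxSignatureSize", "maxVisualizedSignatureCount"}


def _is_positive_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def build_analysis_body(outcome, thoroughness, core_count, name, **optional):
    """Return the request body as a dict, enforcing the documented constraints."""
    if thoroughness not in THOROUGHNESS:
        raise ValueError(f"thoroughness must be one of {sorted(THOROUGHNESS)}")
    if not _is_positive_int(core_count):
        raise ValueError("coreCount must be a positive integer")
    if len(name) > 120:
        raise ValueError("name can be at most 120 characters")
    body = {
        "outcome": outcome,
        "thoroughness": thoroughness,
        "coreCount": core_count,
        "name": name,
    }
    for key, value in optional.items():
        if key in ENUM_FIELDS:
            if value not in ENUM_FIELDS[key]:
                raise ValueError(f"{key} must be one of {sorted(ENUM_FIELDS[key])}")
        elif key in POSITIVE_INT_FIELDS:
            if not _is_positive_int(value):
                raise ValueError(f"{key} must be a positive integer")
        elif key == "groupingFeature":
            if not isinstance(value, str):
                raise ValueError("groupingFeature must be a string")
        else:
            raise ValueError(f"unknown field: {key}")
        body[key] = value
    return body


body = build_analysis_body(
    {"classification": "Diagnosis"},
    "typical",
    6,
    "Classify Alzheimer diagnosis",
    maxSignatureSize=10,
)
```

Optional fields that are left out are simply omitted from the body, so the server-side defaults described above apply.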
Regression analysis
POST /api/public/v1/dataset/:datasetId/analyze
{
outcome: { regression: string },
⋮
}
The regression field must be the name of a numerical feature of the dataset.
Classification analysis
POST /api/public/v1/dataset/:datasetId/analyze
{
outcome: { classification: string },
⋮
}
The classification field must be the name of a categorical feature of the dataset.
Survival analysis
POST /api/public/v1/dataset/:datasetId/analyze
{
outcome: { survival: { event: string, timeToEvent: string } },
⋮
}
The event and timeToEvent fields must be the names of an event feature and a time-to-event feature of the dataset, respectively.
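The three outcome shapes can be produced by small helper functions, as in the Python sketch below. The helper names and the feature names Age, Status, and FollowUpDays are hypothetical; only the field names regression, classification, survival, event, and timeToEvent come from the formats above.

```python
import json


def regression_outcome(feature):
    """Outcome for predicting a numerical feature."""
    return {"regression": feature}


def classification_outcome(feature):
    """Outcome for predicting a categorical feature."""
    return {"classification": feature}


def survival_outcome(event, time_to_event):
    """Outcome for survival analysis on an event and a time-to-event feature."""
    return {"survival": {"event": event, "timeToEvent": time_to_event}}


outcomes = [
    regression_outcome("Age"),
    classification_outcome("Diagnosis"),
    survival_outcome("Status", "FollowUpDays"),
]
for outcome in outcomes:
    print(json.dumps(outcome))
```

Each outcome object has exactly one top-level key, which is what selects the type of analysis.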
Response:
{ payload: { analysisId: string }, status: "success" }
The analysisId field identifies the new analysis. It provides access to the analysis
parameters, the analysis computational task, and, eventually, to the analysis results.
The analysis task is performed asynchronously on the server.
Example
$ http --body POST https://jadapi.jadbio.com/api/public/v1/dataset/57/analyze \
"Authorization: Bearer $(cat ~/.jadtoken)" \
outcome:='{"classification":"Diagnosis"}' \
thoroughness=typical \
coreCount:=6 \
name="Classify Alzheimer diagnosis"
{
"payload": {
"analysisId": "79"
},
"status": "success"
}
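The same request can be prepared from Python with the standard library alone. The sketch below builds the request without sending it and shows how the analysisId might be read from a reply. The function names and the token placeholder are hypothetical, and the handling of non-success replies is an assumption, since this section only documents the success response.

```python
import json
import urllib.request

BASE_URL = "https://jadapi.jadbio.com/api/public/v1"


def analyze_request(dataset_id, token, body):
    """Build (but do not send) the POST request for the analyze endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/dataset/{dataset_id}/analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def analysis_id_from(response_text):
    """Return analysisId from a reply, raising if status is not 'success'."""
    reply = json.loads(response_text)
    if reply.get("status") != "success":
        # Assumption: any other status value means the request was rejected.
        raise RuntimeError(f"analysis request failed: {reply}")
    return reply["payload"]["analysisId"]


request = analyze_request(
    "57",
    "TOKEN-PLACEHOLDER",  # hypothetical; use the real bearer token here
    {
        "outcome": {"classification": "Diagnosis"},
        "thoroughness": "typical",
        "coreCount": 6,
        "name": "Classify Alzheimer diagnosis",
    },
)
# To send it:
#   with urllib.request.urlopen(request) as resp:
#       analysis_id = analysis_id_from(resp.read().decode("utf-8"))
sample_reply = '{"payload": {"analysisId": "79"}, "status": "success"}'
print(analysis_id_from(sample_reply))
```

Because the analysis runs asynchronously, a successful reply only means the analysis was created, not that results are available.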
Delete analysis
Allows clients to delete a specified analysis.
Request:
POST /api/public/v1/analysis/:analysisId/delete
The analysisId identifies the analysis. It must belong to a project to which the user has write permissions.
Response:
{
payload: {
analysisId: string,
projectId: string,
parameters: object,
state: string
},
status: "success"
}
The analysisId is the same as in the request. The projectId identifies the project to
which the analysis belongs. The parameters object has the same fields as specified when
the analysis was created, including the dataset identifier and optional values. The state
field will have one of the values pending, running, finished, or failed, reflecting the
overall execution progress of the analysis task just before it was deleted.
Example
$ http --body POST https://jadapi.jadbio.com/api/public/v1/analysis/79/delete \
"Authorization: Bearer $(cat ~/.jadtoken)"
{
"payload": {
"analysisId": "79",
"parameters": {
"coreCount": 6,
"datasetId": "57",
"featureSelection": "mostRelevant",
"maxSignatureSize": 25,
"maxVisualizedSignatureCount": 5,
"modelsConsidered": "all",
"name": "Classify Alzheimer diagnosis",
"outcome": { "classification": "Diagnosis" },
"thoroughness": "typical"
},
"projectId": "16",
"state": "running"
},
"status": "success"
}
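A client may want to record the state an analysis was in when it was deleted. The Python sketch below builds the delete request without sending it and summarizes a reply of the shape shown above. The function names and the token placeholder are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://jadapi.jadbio.com/api/public/v1"
STATES = ("pending", "running", "finished", "failed")


def delete_request(analysis_id, token):
    """Build (but do not send) the POST request that deletes an analysis."""
    return urllib.request.Request(
        f"{BASE_URL}/analysis/{analysis_id}/delete",
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )


def summarize_deletion(response_text):
    """Describe a successful delete reply in one line."""
    reply = json.loads(response_text)
    if reply.get("status") != "success":
        # Assumption: any other status value means the deletion did not happen.
        raise RuntimeError(f"delete request failed: {reply}")
    payload = reply["payload"]
    state = payload["state"]
    if state not in STATES:
        raise ValueError(f"unexpected state: {state}")
    return (
        f"analysis {payload['analysisId']} in project {payload['projectId']} "
        f"was {state} when deleted"
    )


request = delete_request("79", "TOKEN-PLACEHOLDER")  # hypothetical token
sample_reply = json.dumps({
    "payload": {
        "analysisId": "79",
        "projectId": "16",
        "parameters": {"datasetId": "57", "thoroughness": "typical"},
        "state": "running",
    },
    "status": "success",
})
print(summarize_deletion(sample_reply))
```

A state of pending or running in the reply indicates that the analysis was deleted before it completed.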
Note of appreciation to JADBio users
We constantly make changes in the software and do our best to update these materials, but you may notice some differences. We welcome your feedback on how to make this more useful for you and requests for future tutorials.