Setting up an analysis

Analyze dataset

Allows the client to initiate an analysis of a specified dataset.

Request:

The general format is as follows:

POST /api/public/v1/dataset/:datasetId/analyze
{
    outcome: object,
    groupingFeature?: string,
    modelsConsidered?: string,
    featureSelection?: string,
    thoroughness: string,
    coreCount: number,
    maxSignatureSize?: number,
    maxVisualizedSignatureCount?: number,
    name: string,
}

The datasetId must identify a dataset attached to a project for which the user has execute permissions. The outcome specifies both the type of analysis and the dataset feature or features that the resulting models should predict; the available options are described further below. The optional groupingFeature names an Identifier feature that groups samples which must not be split across the training and test sets during analysis, for example because they are repeated measurements from the same patient.

The optional modelsConsidered field must be either interpretable or all when present. The default value is all. This parameter controls the types of model considered during the search for the best one. With interpretable, only models that are easy to interpret, such as linear models and decision trees, are considered.

The optional featureSelection field must be either mostRelevant or mostRelevantOrAll when present. The default value is mostRelevant. This parameter controls how many features are included in the models considered. Including only the most relevant features makes results easier to interpret and faster to compute, with minimal impact on model performance.

The thoroughness field must be one of preliminary, typical, or extensive. This parameter is used to reduce or expand the number of analysis configurations attempted in the search for the best ones; it significantly affects the running time of the analysis. The coreCount parameter must be a positive integer. It specifies the number of compute cores to use during the analysis, and must be at most the number of cores currently available to the user.

The optional maxSignatureSize is the maximum number of features used in a model found by the analysis. When present, it must be a positive integer. When not present, a default value of 25 is used. The optional maxVisualizedSignatureCount is the maximum number of signatures that will be prepared for visualization in the user interface. When present, it must be a positive integer. When not present, a default value of 5 is used.

The name parameter provides the analysis with a human-readable name for future reference. The name can be at most 120 characters.
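The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name build_analyze_payload and its keyword arguments are hypothetical, and the limit on coreCount relative to the cores available to the user can only be verified by the server.

```python
from typing import Any, Dict, Optional, Sequence

# Allowed values and limits as listed in the text above.
MODELS_CONSIDERED = ("interpretable", "all")
FEATURE_SELECTION = ("mostRelevant", "mostRelevantOrAll")
THOROUGHNESS = ("preliminary", "typical", "extensive")
MAX_NAME_LENGTH = 120


def _require_positive_int(field: str, value: Any) -> None:
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError(f"{field} must be a positive integer, got {value!r}")


def _require_one_of(field: str, value: Any, allowed: Sequence[str]) -> None:
    if value not in allowed:
        raise ValueError(f"{field} must be one of {list(allowed)}, got {value!r}")


def build_analyze_payload(
    outcome: Dict[str, Any],
    thoroughness: str,
    core_count: int,
    name: str,
    *,
    grouping_feature: Optional[str] = None,
    models_considered: Optional[str] = None,
    feature_selection: Optional[str] = None,
    max_signature_size: Optional[int] = None,
    max_visualized_signature_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the documented constraints and
    return the JSON body for the analyze request."""
    _require_one_of("thoroughness", thoroughness, THOROUGHNESS)
    _require_positive_int("coreCount", core_count)
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"name can be at most {MAX_NAME_LENGTH} characters")

    payload: Dict[str, Any] = {
        "outcome": outcome,
        "thoroughness": thoroughness,
        "coreCount": core_count,
        "name": name,
    }
    # Optional fields are omitted when unset so the server applies its defaults.
    if grouping_feature is not None:
        payload["groupingFeature"] = grouping_feature
    if models_considered is not None:
        _require_one_of("modelsConsidered", models_considered, MODELS_CONSIDERED)
        payload["modelsConsidered"] = models_considered
    if feature_selection is not None:
        _require_one_of("featureSelection", feature_selection, FEATURE_SELECTION)
        payload["featureSelection"] = feature_selection
    if max_signature_size is not None:
        _require_positive_int("maxSignatureSize", max_signature_size)
        payload["maxSignatureSize"] = max_signature_size
    if max_visualized_signature_count is not None:
        _require_positive_int(
            "maxVisualizedSignatureCount", max_visualized_signature_count
        )
        payload["maxVisualizedSignatureCount"] = max_visualized_signature_count
    return payload
```

Leaving optional fields out of the body, rather than filling in 25 or 5 locally, keeps the client in step with the server if the defaults ever change.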

Regression analysis

POST /api/public/v1/dataset/:datasetId/analyze
{
    outcome: { regression: string },
    ⋮
}

The regression field must be the name of a numerical feature of the dataset.

Classification analysis

POST /api/public/v1/dataset/:datasetId/analyze
{
    outcome: { classification: string },
    ⋮
}

The classification field must be the name of a categorical feature of the dataset.

Survival analysis

POST /api/public/v1/dataset/:datasetId/analyze
{
    outcome: { survival: { event: string, timeToEvent: string } },
    ⋮
}

The event field must be the name of an event feature of the dataset, and the timeToEvent field must be the name of a time-to-event feature.
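The three outcome shapes differ only in their single top-level key. As an illustration, they can be built with small helpers like the ones below; the function names are hypothetical, and the feature names Age, Died, and FollowUpDays are made up for the example (Diagnosis is taken from the example further below).

```python
import json
from typing import Dict


def regression_outcome(feature: str) -> Dict[str, str]:
    """Outcome for a regression analysis on a numerical feature."""
    return {"regression": feature}


def classification_outcome(feature: str) -> Dict[str, str]:
    """Outcome for a classification analysis on a categorical feature."""
    return {"classification": feature}


def survival_outcome(event: str, time_to_event: str) -> Dict[str, Dict[str, str]]:
    """Outcome for a survival analysis on an event / time-to-event pair."""
    return {"survival": {"event": event, "timeToEvent": time_to_event}}


# Print the JSON that would be sent as the "outcome" field in each case.
for outcome in (
    regression_outcome("Age"),
    classification_outcome("Diagnosis"),
    survival_outcome("Died", "FollowUpDays"),
):
    print(json.dumps(outcome))
```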

Response:

{ payload: { analysisId: string }, status: "success" }

The analysisId field identifies the new analysis. It provides access to the analysis parameters, the analysis computational task, and, eventually, to the analysis results. The analysis task is performed asynchronously on the server.
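A client will typically want to pull the analysisId out of this response and keep it for later calls. The sketch below assumes the body is JSON shaped as shown above; the function name is hypothetical, and treating every status other than "success" as an error is an assumption, since failure responses are not described in this section.

```python
import json


def extract_analysis_id(body: str) -> str:
    """Return the analysisId from an analyze response body."""
    reply = json.loads(body)
    # Assumption: anything other than "success" signals a failed request.
    if reply.get("status") != "success":
        raise RuntimeError(f"analyze request did not succeed: {reply!r}")
    return reply["payload"]["analysisId"]


sample = '{"payload": {"analysisId": "79"}, "status": "success"}'
print(extract_analysis_id(sample))
```

Because the task runs asynchronously, receiving an analysisId only means the analysis was accepted, not that results are available.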

Example

$ http --body POST https://jadapi.jadbio.com/api/public/v1/dataset/57/analyze \
       "Authorization: Bearer $(cat ~/.jadtoken)" \
       outcome:='{"classification":"Diagnosis"}' \
       thoroughness=typical \
       coreCount:=6 \
       name="Classify Alzheimer diagnosis"
{
    "payload": {
        "analysisId": "79"
    },
    "status": "success"
}
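For clients that do not use HTTPie, the same request can be assembled with the Python standard library. This is a sketch under the assumptions visible in the example: a bearer token in the Authorization header and a JSON body. The helper name analyze_request and the placeholder token are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_ROOT = "https://jadapi.jadbio.com/api/public/v1"


def analyze_request(dataset_id: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the analyze endpoint."""
    return urllib.request.Request(
        url=f"{API_ROOT}/dataset/{dataset_id}/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = analyze_request(
    "57",
    {
        "outcome": {"classification": "Diagnosis"},
        "thoroughness": "typical",
        "coreCount": 6,
        "name": "Classify Alzheimer diagnosis",
    },
    token="<your token>",  # placeholder; the example above reads it from ~/.jadtoken
)
print(request.get_method(), request.full_url)

# To send it, something like the following would be needed:
#     with urllib.request.urlopen(request) as response:
#         reply = json.load(response)
```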

Delete analysis

Allows clients to delete a specified analysis.

Request:

POST /api/public/v1/analysis/:analysisId/delete

The analysisId identifies the analysis. It must belong to a project to which the user has write permissions.

Response:

{
    payload: {
        analysisId: string,
        projectId: string,
        parameters: object,
        state: string
    },
    status: "success"
}

The analysisId is the same as in the request. The projectId identifies the project to which the analysis belongs. The parameters object has the same fields as specified when the analysis was created, including the dataset identifier and optional values. The state field will have one of the values pending, running, finished, or failed, reflecting the overall execution progress of the analysis task just before it was deleted.
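Since the response reports the state the task was in just before deletion, a client can use it to tell whether it removed a completed analysis or one that was still in progress. The sketch below assumes a JSON body shaped as described above; the DeletedAnalysis class, the function name, and the was_in_progress property are hypothetical conveniences, not part of the API.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict

# State values listed in the text above.
KNOWN_STATES = ("pending", "running", "finished", "failed")


@dataclass
class DeletedAnalysis:
    analysis_id: str
    project_id: str
    parameters: Dict[str, Any]
    state: str

    @property
    def was_in_progress(self) -> bool:
        # True if the task had not yet finished or failed when it was deleted.
        return self.state in ("pending", "running")


def parse_delete_response(body: str) -> DeletedAnalysis:
    """Parse the body of a delete response into a DeletedAnalysis."""
    reply = json.loads(body)
    # Assumption: anything other than "success" signals a failed request.
    if reply.get("status") != "success":
        raise RuntimeError(f"delete request did not succeed: {reply!r}")
    payload = reply["payload"]
    if payload["state"] not in KNOWN_STATES:
        raise ValueError(f"unexpected state: {payload['state']!r}")
    return DeletedAnalysis(
        analysis_id=payload["analysisId"],
        project_id=payload["projectId"],
        parameters=payload["parameters"],
        state=payload["state"],
    )
```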

Example

$ http --body POST https://jadapi.jadbio.com/api/public/v1/analysis/79/delete \
       "Authorization: Bearer $(cat ~/.jadtoken)"
{
    "payload": {
        "analysisId": "79",
        "parameters": {
            "coreCount": 6,
            "datasetId": "57",
            "featureSelection": "mostRelevant",
            "maxSignatureSize": 25,
            "maxVisualizedSignatureCount": 5,
            "modelsConsidered": "all",
            "name": "Classify Alzheimer diagnosis",
            "outcome": { "classification": "Diagnosis" },
            "thoroughness": "typical"
        },
        "projectId": "16",
        "state": "running"
    },
    "status": "success"
}



Note of appreciation to JADBio users

We constantly make changes to the software and do our best to keep these materials up to date, but you may notice some differences. We welcome your feedback on how to make this more useful for you, as well as requests for future tutorials.