
Create a generation task

POST
/task

Starts a new image generation task.

Request Body required

Create a generation task with parameters.

object
parameters
required

This schema fully defines the parameters allowed for the image generation tasks we currently support. To learn what some of the parameters do, search for the term “stable diffusion glossary.”

object
prompts

What you want to see in the generated image. The prompt is a short sentence or a few words that describe the content of the image.

We support partial “attention” or “emphasis” in prompts with round brackets, as in A1111 WebUI or ComfyUI. Square brackets and curly braces are not supported.

string
<= 4096 characters
a cat sitting on a chair
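For example, round brackets can increase the weight of a term. The (term:weight) form below follows the A1111 convention and is shown as an illustrative assumption:

a cat sitting on a (red:1.3) chair, (detailed fur)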
enableTile

Whether to apply ControlNet Tile for image upscaling. Only available when both the upscale and mediaId fields are set.

boolean
mediaId

The input image for image-to-image (i2i). Pass only one of mediaId and mediaUrl.

string
mediaUrl

The input image for image-to-image (i2i). Pass only one of mediaId and mediaUrl.

string
negativePrompts

Negative prompts guide the model away from generating certain content: a short sentence or a few words describing what you do not want to see in the generated image. Even if you omit this parameter, we usually apply some common defaults.

string
<= 4096 characters
nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
samplingSteps

The number of steps to sample from the model. Higher values produce more detailed images but also increase generation time and cost. Each model has its own default; a value between 20 and 25 is typical.

integer
>= 1 <= 50
samplingMethod

The method used to sample from the model. Each model has its own default. See our documentation for the supported values.

string
cfgScale

The Classifier-Free Guidance (CFG) scale controls how closely the AI follows your prompts. At low values the model tends to produce softer, more painterly images. We do not strictly limit this parameter, but we recommend keeping the scale below 7.

number
default: 6 <= 100
seed
Any of:
number
modelId

The id of the model version in the model market. You can find it in the URL of the model version page.

string
upscale

This field will use Hires Fix to upscale your final image.

number
upscaleSampler

Sampling method used for the Hires Fix phase.

string
<= 100 characters
upscaler

Method used to upscale the image before applying the diffusion model to it.

string
<= 100 characters
upscaleDenoisingStrength

Strength of the denoising process in the Hires Fix phase.

number
upscaleDenoisingSteps

Sampling steps of the Hires Fix phase.

number
enlarge

Controls the enlargement factor.

number
>= 1 <= 100
enlargeModel

Controls which model is used to enlarge the image.

string
<= 64 characters
width

For text-to-image, this field controls the width of the resulting image.

integer
>= 1 <= 10240
height

For text-to-image, this field controls the height of the resulting image.

integer
>= 1 <= 10240
strength

The strength field specifies how much the existing picture should be altered to look like a different one. At maximum strength you get pictures based on the variation seed; at minimum, pictures based on the original seed.

number
controlNets

ControlNet conditions the diffusion model on a reference image, such as a pose, edge map, or depth map, to guide the composition of the generated image.

Array<object>
<= 10 items
object
weight
number
mediaId

The reference image id for the control net.

string
<= 20 characters
mediaUrl
string
type

The type of the control net. Currently, we support the following types: dwpose, openpose_full, canny, depth, hed, mlsd, openpose, seg, normal, scribble. You can learn the details of each type in our generation panel.

string
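A minimal controlNets entry might look like the following sketch; the media id is a placeholder:

[
  {
    "type": "canny",
    "mediaId": "<reference media id>",
    "weight": 1.0
  }
]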
lora

LoRA, in the context of stable diffusion, is a machine learning technique for fine-tuning generative models to adjust their outputs without extensive retraining. It allows for efficient model customization and control over the generated content. The lora field is a JSON object whose keys are version ids of LoRA models in the model market and whose values are the weights to apply, each a float between 0 and 1.

object
<= 10 properties
key
additional properties
number
{
"1744880666293972790": 0.7
}
latentCouple

LatentCouple is a technique for assigning regions of the latent space to your sub-prompts. Refer to the original project for details.

object
type

The type of the latent couple. The value can be “rect”.

string
divisions

The number of divisions in the latent space.

Array<string>
positions

The positions of the latent couple.

Array<string>
weights

The weights of the latent couple.

Array<number>
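A sketch of a latentCouple object, assuming the ratio-string conventions of the original Latent Couple project (divisions as rows:columns, positions as row:column); the exact string format is an assumption:

{
  "type": "rect",
  "divisions": ["1:1", "1:2", "1:2"],
  "positions": ["0:0", "0:0", "0:1"],
  "weights": [0.2, 0.8, 0.8]
}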
maskMediaId

The mask image for i2i inpainting. Pass only one of maskMediaId and maskMediaUrl.

string
maskMediaUrl

The mask image for i2i inpainting. Pass only one of maskMediaId and maskMediaUrl.

string
batchSize

The number of images to generate in one task.

number
>= 1 <= 4
enableADetailer

Whether to apply the After-Detailer to the image for face fixing.

boolean
clipSkip

Specifies the number of final CLIP model layers to stop at (clip skip).

number
vaeModelId

The id of the VAE model version in the model market. VAE models can help you adjust the saturation and coloring of your image. Explore our available options to enhance your images.

string
workflow
object
workflowName

You can use this field to specify which workflow you want to execute. The format is {username}/{workflowUniqueId}:{versionName}.

string
someone/ipa:latest
inputs

The inputs passed to the workflow when it is executed.

object
<= 100 properties
key
additional properties
any
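An illustrative workflow object; the input key image is hypothetical and depends on the inputs the workflow declares:

{
  "workflowName": "someone/ipa:latest",
  "inputs": {
    "image": "<media id>"
  }
}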
ipAdapter

IP-adapter (Image Prompt adapter) is a Stable Diffusion add-on for using images as prompts. You can use it to copy the style, composition, or a face in the reference image.

object
enabled
boolean
referenceImages

The list of reference image media ids.

Array<string>
<= 10 items
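An illustrative ipAdapter object; the media ids are placeholders:

{
  "enabled": true,
  "referenceImages": ["<media id 1>", "<media id 2>"]
}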
callbackUrl

The URL to receive task status updates. It must be a public URL that our server can reach; we will send a POST request to it with each task status update.

string
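Putting it together, a minimal text-to-image request body might look like the sketch below, assuming callbackUrl sits at the top level beside parameters; the modelId and callback URL are placeholders:

{
  "parameters": {
    "prompts": "a cat sitting on a chair",
    "negativePrompts": "lowres, bad anatomy, blurry",
    "width": 512,
    "height": 768,
    "samplingSteps": 25,
    "cfgScale": 6,
    "modelId": "<model version id>"
  },
  "callbackUrl": "https://example.com/webhooks/generation"
}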

Responses

200

Successful operation

object
id
required

The unique identifier for the task.

string
status
required

The status of the task.

string
Allowed values: waiting, running, completed, cancelled, failed
createdAt
required

The time the task was created.

string format: date-time
updatedAt
required

The time the task was last updated.

string format: date-time
outputs

The outputs of the task.

object
mediaIds

The media IDs generated by the task. You can use these IDs to fetch detailed information about the generated media.

The images you generate are usually not permanently retained. You need to retrieve your images as soon as possible.

Array<string>
mediaUrls

The public URL of the images generated by the task.

The images you generate are usually not permanently retained. You need to retrieve your images as soon as possible. If an image is not available, it MAY be replaced by null in the array.

Array<string | null>
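An illustrative response body for a newly created task; the id and timestamps are placeholders, and outputs appears once the task has produced images:

{
  "id": "<task id>",
  "status": "waiting",
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-01-01T00:00:00Z"
}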

Headers

X-Trace-Id
string

A unique identifier for the request. This is useful for debugging and tracing requests.

422

Validation exception

Headers

X-Trace-Id
string

A unique identifier for the request. This is useful for debugging and tracing requests.

429

Too many requests

Headers

X-Trace-Id
string

A unique identifier for the request. This is useful for debugging and tracing requests.