
Create a generation task

POST
/task

Starts a new image generation task.

Request Body required

Create a generation task with parameters.

object
parameters
required

This schema fully defines the parameters allowed for the image generation tasks we currently support. To learn what some of the parameters do, search for the term “stable diffusion glossary.”

object
prompts

What you want to see in the generated image. The prompt is a short sentence or a few words that describe the content of the image.

We support partial “attention” or “emphasis” in prompts with round brackets, as in A1111 WebUI or ComfyUI. Square brackets and curly braces are not supported.

string
<= 4096 characters
a cat sitting on a chair
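For example, round brackets can increase the weight of a term. The (term:weight) form below follows the A1111 convention and is shown as an illustrative assumption:

a cat sitting on a (red:1.3) chair, (detailed fur)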
enableTile

Whether to apply ControlNet Tile for image upscaling. Only available when both the upscale and mediaId fields are set.

boolean
mediaId

The input image for image-to-image (i2i). Pass only one of mediaId and mediaUrl.

string
mediaUrl

The input image for image-to-image (i2i). Pass only one of mediaId and mediaUrl.

string
negativePrompts

Negative prompts guide the model away from generating certain content: a short sentence or a few words describing what you do not want to see in the generated image. Even if you omit this parameter, we usually apply some common defaults.

string
<= 4096 characters
nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
samplingSteps

The number of steps to sample from the model. Higher values produce more detailed images but also increase generation time and cost. Each model has its own default; a value between 20 and 25 is typical.

integer
>= 1 <= 50
samplingMethod

The method used to sample from the model. Each model has its own default. See our documentation for the supported values.

string
cfgScale

The Classifier-Free Guidance (CFG) scale controls how closely the AI follows your prompts. At low values the model tends to produce softer, more painterly images. We do not strictly limit this parameter, but we recommend keeping the scale below 7.

number
default: 6 <= 100
seed
Any of:
number
modelId

The id of the model version in the model market. You can find it in the URL of the model version page.

string
upscale

This field will use Hires Fix to upscale your final image.

number
upscaleSampler

Sampling method used for the Hires Fix phase.

string
<= 100 characters
upscaler

Method used to upscale the image before applying the diffusion model to it.

string
<= 100 characters
upscaleDenoisingStrength

Strength of the denoising process in the Hires Fix phase.

number
upscaleDenoisingSteps

Sampling steps of the Hires Fix phase.

number
enlarge

Controls the enlargement factor.

number
>= 1 <= 100
enlargeModel

Controls which model is used to enlarge the image.

string
<= 64 characters
width

For text-to-image, this field controls the width of the resulting image.

integer
>= 1 <= 10240
height

For text-to-image, this field controls the height of the resulting image.

integer
>= 1 <= 10240
strength

The strength field specifies how much the existing picture should be altered to look like a different one. At maximum strength you get pictures based on the variation seed; at minimum, pictures based on the original seed.

number
controlNets

ControlNet conditions the diffusion model on a reference image, such as a pose, edge map, or depth map, to guide the composition of the generated image.

Array<object>
<= 10 items
object
weight
number
mediaId

The reference image id for the control net.

string
<= 20 characters
mediaUrl
string
type

The type of the control net. Currently, we support the following types: dwpose, openpose_full, canny, depth, hed, mlsd, openpose, seg, normal, scribble. You can learn the details of each type in our generation panel.

string
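A minimal controlNets entry might look like the following sketch; the media id is a placeholder:

[
  {
    "type": "canny",
    "mediaId": "<reference media id>",
    "weight": 1.0
  }
]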
lora

LoRA, in the context of stable diffusion, is a machine learning technique for fine-tuning generative models to adjust their outputs without extensive retraining. It allows for efficient model customization and control over the generated content. The lora field is a JSON object whose keys are version ids of LoRA models in the model market and whose values are the weights to apply, each a float between 0 and 1.

object
<= 10 properties
key
additional properties
number
{
"1744880666293972790": 0.7
}
latentCouple

LatentCouple is a technique for assigning regions of the latent space to your sub-prompts. Refer to the original project for details.

object
type

The type of the latent couple. The value can be “rect”.

string
divisions

The number of divisions in the latent space.

Array<string>
positions

The positions of the latent couple.

Array<string>
weights

The weights of the latent couple.

Array<number>
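A sketch of a latentCouple object, assuming the ratio-string conventions of the original Latent Couple project (divisions as rows:columns, positions as row:column); the exact string format is an assumption:

{
  "type": "rect",
  "divisions": ["1:1", "1:2", "1:2"],
  "positions": ["0:0", "0:0", "0:1"],
  "weights": [0.2, 0.8, 0.8]
}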
maskMediaId

The mask image for i2i inpainting. Pass only one of maskMediaId and maskMediaUrl.

string
maskMediaUrl

The mask image for i2i inpainting. Pass only one of maskMediaId and maskMediaUrl.

string
batchSize

The number of images to generate in one task.

number
>= 1 <= 4
enableADetailer

Whether to apply the After-Detailer to the image for face fixing.

boolean
clipSkip

Specifies the number of final CLIP model layers to stop at (clip skip).

number
vaeModelId

The id of the VAE model version in the model market. VAE models can help you adjust the saturation and coloring of your image. Explore our available options to enhance your images.

string
workflow
object
workflowName

You can use this field to specify which workflow you want to execute. The format is {username}/{workflowUniqueId}:{versionName}.

string
someone/ipa:latest
inputs

The inputs passed to the workflow when it is executed.

object
<= 100 properties
key
additional properties
any
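An illustrative workflow object; the input key image is hypothetical and depends on the inputs the workflow declares:

{
  "workflowName": "someone/ipa:latest",
  "inputs": {
    "image": "<media id>"
  }
}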
ipAdapter

IP-adapter (Image Prompt adapter) is a Stable Diffusion add-on for using images as prompts. You can use it to copy the style, composition, or a face in the reference image.

object
enabled
boolean
referenceImages

The list of reference image media ids.

Array<string>
<= 10 items
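An illustrative ipAdapter object; the media ids are placeholders:

{
  "enabled": true,
  "referenceImages": ["<media id 1>", "<media id 2>"]
}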
callbackUrl

The URL to receive task status updates. It must be a public URL that our server can reach; we will send a POST request to it with each task status update.

string
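Putting it together, a minimal text-to-image request body might look like the sketch below, assuming callbackUrl sits at the top level beside parameters; the modelId and callback URL are placeholders:

{
  "parameters": {
    "prompts": "a cat sitting on a chair",
    "negativePrompts": "lowres, bad anatomy, blurry",
    "width": 512,
    "height": 768,
    "samplingSteps": 25,
    "cfgScale": 6,
    "modelId": "<model version id>"
  },
  "callbackUrl": "https://example.com/webhooks/generation"
}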

Responses

200

Successful operation

object
id
required

The unique identifier for the task.

string
status
required

The status of the task.

string
Allowed values: waiting, running, completed, cancelled, failed
createdAt
required

The time the task was created.

string format: date-time
updatedAt
required

The time the task was last updated.

string format: date-time
outputs

The outputs of the task.

object
mediaIds

The media IDs generated by the task. You can use these IDs to fetch detailed information about the generated media.

The images you generate are usually not permanently retained. You need to retrieve your images as soon as possible.

Array<string>
mediaUrls

The public URL of the images generated by the task.

The images you generate are usually not permanently retained. You need to retrieve your images as soon as possible. If an image is not available, it MAY be replaced by null in the array.

Array<string | null>
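An illustrative response body for a newly created task; the id and timestamps are placeholders, and outputs appears once the task has produced images:

{
  "id": "<task id>",
  "status": "waiting",
  "createdAt": "2024-01-01T00:00:00Z",
  "updatedAt": "2024-01-01T00:00:00Z"
}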

Headers

X-Trace-Id
string

A unique identifier for the request. This is useful for debugging and tracing requests.

422

Validation exception

Headers

X-Trace-Id
string

A unique identifier for the request. This is useful for debugging and tracing requests.

429

Too many requests

Headers

X-Trace-Id
string

A unique identifier for the request. This is useful for debugging and tracing requests.