Stability.ai Diffusion 1.0 image to image (masking)
The Stability.ai Diffusion 1.0 model has the following inference parameters and model response for using masks with image to image inference calls.
Request and Response
The request body is passed in the body field of a request to InvokeModel or InvokeModelWithResponseStream. For more information, see https://platform.stability.ai/docs/api-reference#tag/v1generation/operation/masking.
- Request
The Stability.ai Diffusion 1.0 model has the following inference parameters for an image to image (masking) inference call.
{ "text_prompts": [ { "text": string, "weight": float } ], "init_image" : string , "mask_source" : string, "mask_image" : string, "cfg_scale": float, "clip_guidance_preset": string, "sampler": string, "samples" : int, "seed": int, "steps": int, "style_preset": string, "extras" : json object }
The following are required parameters.
- text_prompts – (Required) An array of text prompts to use for generation. Each element is a JSON object that contains a prompt and a weight for the prompt.
  - text – The prompt that you want to pass to the model. (Minimum: 0, Maximum: 2000)
  - weight – (Optional) The weight that the model should apply to the prompt. A value that is less than zero declares a negative prompt. Use a negative prompt to tell the model to avoid certain concepts. The default value for weight is one.
- init_image – (Required) The base64 encoded image that you want to use to initialize the diffusion process.
- mask_source – (Required) Determines where to source the mask from. Possible values are:
  - MASK_IMAGE_WHITE – Use the white pixels of the mask image in mask_image as the mask. White pixels are replaced and black pixels are left unchanged.
  - MASK_IMAGE_BLACK – Use the black pixels of the mask image in mask_image as the mask. Black pixels are replaced and white pixels are left unchanged.
  - INIT_IMAGE_ALPHA – Use the alpha channel of the image in init_image as the mask. Fully transparent pixels are replaced and fully opaque pixels are left unchanged.
- mask_image – (Required) The base64 encoded mask image that you want to use as a mask for the source image in init_image. Must be the same dimensions as the source image. Use the mask_source option to specify which pixels should be replaced. For one way to prepare these base64 fields, see the sketch following this list.
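The init_image and mask_image fields both take base64-encoded image data. The following is a minimal sketch of preparing them in Python; the file names source.png and mask.png are placeholders.

```python
import base64

def encode_image(path: str) -> str:
    """Read an image file and return its contents as a base64 string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

# Placeholder file names; the mask must have the same dimensions as the source.
init_image = encode_image("source.png")
mask_image = encode_image("mask.png")
```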
The following are optional parameters.
- cfg_scale – (Optional) Determines how much the final image portrays the prompt. Use a lower number to increase randomness in the generation. (Default: 7, Minimum: 0, Maximum: 35)
- clip_guidance_preset – (Optional) Enum: FAST_BLUE, FAST_GREEN, NONE, SIMPLE, SLOW, SLOWER, SLOWEST.
- sampler – (Optional) The sampler to use for the diffusion process. If this value is omitted, the model automatically selects an appropriate sampler for you. Enum: DDIM, DDPM, K_DPMPP_2M, K_DPMPP_2S_ANCESTRAL, K_DPM_2, K_DPM_2_ANCESTRAL, K_EULER, K_EULER_ANCESTRAL, K_HEUN, K_LMS.
- samples – (Optional) The number of images to generate. Currently Amazon Bedrock supports generating one image. If you supply a value for samples, the value must be one. (Default: 1, Minimum: 1, Maximum: 1)
- seed – (Optional) The seed determines the initial noise setting. Use the same seed and the same settings as a previous run to allow inference to create a similar image. If you don't set this value, or the value is 0, it is set as a random number. (Default: 0, Minimum: 0, Maximum: 4294967295)
- steps – (Optional) The number of generation steps determines how many times the image is sampled. More steps can result in a more accurate result. (Default: 30, Minimum: 10, Maximum: 50)
- style_preset – (Optional) A style preset that guides the image model towards a particular style. This list of style presets is subject to change. Enum: 3d-model, analog-film, anime, cinematic, comic-book, digital-art, enhance, fantasy-art, isometric, line-art, low-poly, modeling-compound, neon-punk, origami, photographic, pixel-art, tile-texture.
- extras – (Optional) Extra parameters passed to the engine. Use with caution. These parameters are used for in-development or experimental features and might change without warning.
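The following is a minimal sketch of sending a masking request with the AWS SDK for Python (Boto3). The model ID, prompt text, and file names are assumptions for illustration; check which Stability.ai model IDs are available in your Region.

```python
import base64
import json

import boto3

# Placeholder file names; the mask must match the source image's dimensions.
with open("source.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode("utf-8")
with open("mask.png", "rb") as f:
    mask_image = base64.b64encode(f.read()).decode("utf-8")

client = boto3.client("bedrock-runtime")

# The model ID is an assumption for illustration; verify it for your Region.
response = client.invoke_model(
    modelId="stability.stable-diffusion-xl-v1",
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "text_prompts": [{"text": "a red barn in a snowy field", "weight": 1.0}],
        "init_image": init_image,
        "mask_source": "MASK_IMAGE_WHITE",
        "mask_image": mask_image,
        "cfg_scale": 7,
        "steps": 30,
    }),
)
response_body = json.loads(response["body"].read())
```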
- Response
The Stability.ai Diffusion 1.0 model returns the following fields for an image to image (masking) inference call.

```json
{
  "result": string,
  "artifacts": [
    {
      "seed": int,
      "base64": string,
      "finishReason": string
    }
  ]
}
```
- result – The result of the operation. If successful, the response is success.
- artifacts – An array of images, one for each requested image.
  - seed – The value of the seed used to generate the image.
  - base64 – The base64 encoded image that the model generated.
  - finishReason – The result of the image generation process. Valid values are:
    - SUCCESS – The image generation process succeeded.
    - ERROR – An error occurred.
    - CONTENT_FILTERED – The content filter filtered the image and the image might be blurred.
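Continuing the sketch above, the generated image can be recovered by decoding the base64 field of the first artifact. The output file name is a placeholder.

```python
import base64

# response_body is the parsed JSON response from the sketch above.
if response_body.get("result") == "success":
    artifact = response_body["artifacts"][0]
    if artifact["finishReason"] == "SUCCESS":
        # Decode the base64 payload and write it out as a PNG file.
        with open("output.png", "wb") as f:
            f.write(base64.b64decode(artifact["base64"]))
    else:
        print("Generation finished with:", artifact["finishReason"])
```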