# Depth maps

## Overview

Depth-to-image (Depth2img) is an under-appreciated model in Stable Diffusion v2. *<mark style="color:red;">**It is an enhancement to image-to-image (img2img)**</mark>* that uses depth information when generating new images.

*<mark style="color:red;">**With depth-to-image, you have better control over synthesizing the subject and the background separately.**</mark>*

In depth-to-image, Stable Diffusion takes an image and a prompt as inputs (similar to image-to-image). The model first estimates the depth map of the input image using [MiDaS](https://github.com/isl-org/MiDaS), *<mark style="color:red;">**an AI model developed in 2019 for**</mark>* [*<mark style="color:red;">**monocular depth estimation**</mark>*](https://en.wikipedia.org/wiki/Depth_perception) *<mark style="color:red;">**(that is, estimating depth from a single view).**</mark>* The depth map is then used by Stable Diffusion as an extra [**conditioning**](/wiki/ai-techniques/stable-diffusion/conditioning.md) for image generation.

Depth-to-image uses three conditionings to generate a new image:

* text prompt
* original image
* depth map

Equipped with the depth map, the model has *some* knowledge of the three-dimensional composition of the scene. **The generation of foreground objects and of the background can be separated.**
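
As a rough illustration of how the three conditionings fit together in code, here is a minimal sketch that assumes the Hugging Face `diffusers` library and its `StableDiffusionDepth2ImgPipeline`. Neither is mentioned in the original article, so the library, the checkpoint name `stabilityai/stable-diffusion-2-depth`, and the helper name `depth_to_image` are all assumptions made for this example. The heavy imports sit inside the function so that it can be defined without those packages installed.

```python
def depth_to_image(init_image, prompt, strength=1.0, negative_prompt=None):
    """Generate an image conditioned on a text prompt, an input image,
    and a depth map estimated from that image.

    Hypothetical helper: assumes `torch` and `diffusers` are installed
    and a CUDA GPU is available.
    """
    # Imported lazily because both packages are large, optional dependencies.
    import torch
    from diffusers import StableDiffusionDepth2ImgPipeline

    # Assumed checkpoint: the Stable Diffusion v2 depth model on the Hugging Face Hub.
    pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-depth",
        torch_dtype=torch.float16,
    ).to("cuda")

    # Conditioning 1: the text prompt. Conditioning 2: the original image.
    # Conditioning 3: no depth map is passed, so the pipeline estimates one
    # from init_image with its bundled depth estimator.
    result = pipe(
        prompt=prompt,
        image=init_image,
        negative_prompt=negative_prompt,
        strength=strength,
    )
    return result.images[0]
```

Calling the helper with a PIL image and a prompt should return a PIL image. The weights are downloaded on first use, so expect the first call to be slow.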

## Depth map

*<mark style="color:green;">**A depth map is a simple grayscale image, the same size as the original image, that encodes depth information.**</mark>**&#x20;**<mark style="color:red;">**Pure white means the object is closest to you; the darker a pixel, the farther away it is.**</mark>*

Here’s an example of an image and its depth map estimated by MiDaS.

![](/files/JhowGN6Q6UdAvsJnrwUN)![](/files/q7JSthc5ygTIy6gjmjxu)
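
The white-is-near convention can be sketched in a few lines of plain Python. This is only an illustration, not the way MiDaS or Stable Diffusion stores depth internally. It assumes the estimator returns a 2D grid of relative "nearness" values where larger means closer; the function name `depth_to_grayscale` is hypothetical.

```python
def depth_to_grayscale(nearness):
    """Scale a 2D grid of relative nearness values to 8-bit gray levels.

    Assumption: larger input values mean "closer to the camera", so the
    closest pixel maps to 255 (white) and the farthest to 0 (black).
    The output has the same shape as the input.
    """
    flat = [value for row in nearness for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # No depth contrast at all: return a uniformly black map.
        return [[0 for _ in row] for row in nearness]
    span = hi - lo
    return [[round(255 * (value - lo) / span) for value in row] for row in nearness]


# A tiny 2x2 "scene": the bottom-right pixel is the closest one.
print(depth_to_grayscale([[0.0, 1.0], [3.0, 5.0]]))  # -> [[0, 51], [153, 255]]
```

Only the relative ordering of the values matters here: the map is stretched so that the nearest point is always white and the farthest always black.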

## What can depth-to-image do

The following example compares image-to-image and depth-to-image at different denoising strengths.

<figure><img src="/files/htZu9vcWCCaf4PsyGslO" alt=""><figcaption><p>Original image</p></figcaption></figure>

<figure><img src="/files/n7RlkhnXBTxiiHoFFJRP" alt=""><figcaption><p>Comparing image-to-image and depth-to-image.</p></figcaption></figure>

The top row shows the image-to-image generations. They run into a problem: at low denoising strength, the image doesn't change enough; at high denoising strength, we do see two wrestlers, but the original composition is lost.

*<mark style="color:red;">**Depth-to-image resolves this problem.**</mark>* You can crank up denoising strength all the way to 1 (the maximum) without losing the original composition.
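
To see why a strength of 1 is so destructive for plain image-to-image, it helps to look at how denoising strength is commonly turned into a number of sampling steps. The sketch below assumes the convention in which strength sets the fraction of the noise schedule that is actually run. We believe the `diffusers` img2img pipelines work this way, but treat that as an assumption; the function name `img2img_schedule` is hypothetical.

```python
def img2img_schedule(num_inference_steps, strength):
    """Return (steps_skipped, steps_run) for a given denoising strength.

    Assumed convention: the input image is noised to a level proportional
    to `strength`, and only the remaining part of the schedule is run.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    steps_run = min(int(num_inference_steps * strength), num_inference_steps)
    steps_skipped = num_inference_steps - steps_run
    return steps_skipped, steps_run


print(img2img_schedule(40, 0.25))  # -> (30, 10): light touch, image barely changes
print(img2img_schedule(40, 1.0))   # -> (0, 40): full schedule, starts from pure noise
```

At strength 1 nothing of the original pixels survives the noising step, so image-to-image has no way to keep the composition. In depth-to-image the depth map is a separate conditioning that is still present at every step, which is why the layout is preserved.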

## Useful applications of depth-to-image

### Inpainting

Depth-to-image is useful here if we care about preserving the original composition.

![](/files/vj0x15yOeSDhohEIutFl)![](/files/IZtjctiKAfNe9Zm9ERk8)

### Style transfer

We can dial denoising strength all the way up to 1 without losing composition, which makes it easy to transform a scene into a different style.

![](/files/zTBNP7RnquyFQ1TTXzF7)![](/files/Z1IgAcNZjMCTDR5XN20B)![](/files/URv4Yj5tb4P3LwcqCu7Z)![](/files/ewiQJ66jeE9oFjBDeCwU)

## Summary

Depth-to-image is a great alternative to image-to-image, especially when you want to preserve the composition of the scene.

## Credit

{% embed url="<https://stable-diffusion-art.com/depth-to-image/>" %}

