We assume you're here because you're not interested in BERTs, CLIPs, or T5 encoder/decoder transformers.
You'd like a quick guide to generating some quality images, so let's try to get you there with this walkthrough on writing better prompts.
To write effective prompts, it's essential to understand how to structure them, use specific keywords, and steer the desired style and outcome. Diffusion models rely heavily on well-crafted prompts to produce high-quality images.
Our first piece of advice is to pay attention to the order of your terms, and to use commas. The order of the words in the prompt corresponds directly to their weight when generating the final image, so the main subject should always sit near the start of the prompt. If we want to add more details, commas help separate the terms for the model to read; it needs this punctuation to understand where clauses start and stop.
In our experience, there is a noticeable tradeoff between the amount of detail in the prompt, the corresponding amount of detail in the image, and the resulting quality of scene composition. More words tend to translate to higher output accuracy, but at the expense of more objects or traits for the diffuser to generate on top of the original subject.
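The subject-first, comma-separated structure described above can be sketched as a tiny helper. A minimal sketch; the `build_prompt` function name and its parameters are our own illustration, not any library's API:

```python
# Hypothetical helper: main subject goes first, then comma-separated
# details, then an optional style term at the end.
def build_prompt(subject, details=(), style=None):
    terms = [subject, *details]
    if style:
        terms.append(style)
    # Commas mark where one clause stops and the next starts.
    return ", ".join(terms)

prompt = build_prompt(
    "a forest at sunrise",  # the main subject carries the most weight
    details=["golden light", "soft mist", "moss-covered ground"],
    style="oil painting",
)
print(prompt)
# → a forest at sunrise, golden light, soft mist, moss-covered ground, oil painting
```

Each extra detail term adds fidelity, but also one more thing the diffuser has to fit into the composition.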
Example simple prompt:
"A forest at sunrise."
Example descriptive prompt:
"In the heart of an ancient forest, the first light of dawn filters through the dense canopy, casting a golden glow on the dewy moss-covered ground. Tall, towering trees, their bark rough and weathered, stand like silent sentinels as a soft mist curls around their roots. The air is crisp and filled with the earthy scent of pine needles, and the distant call of a waking bird echoes through the tranquil morning."
Set the advanced settings as (Aspect Ratio: landscape, CFG: 1, Steps: 20, Seed: 1648909284378535).
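If you're generating locally rather than in a UI, those settings map roughly onto the parameters of a text-to-image call. A sketch assuming the Hugging Face `diffusers` library and a Stable Diffusion checkpoint (our choice for illustration; your tool may use a different backend), with "landscape" approximated as a 768×512 frame:

```python
# The advanced settings above, expressed as pipeline parameters.
settings = {
    "width": 768,                # "landscape" approximated as a 3:2 frame
    "height": 512,
    "guidance_scale": 1,         # CFG: 1
    "num_inference_steps": 20,   # Steps: 20
    "seed": 1648909284378535,    # fixes the starting noise for reproducibility
}

def generate(prompt, settings):
    # Imports deferred: this needs `pip install diffusers torch`
    # and downloads model weights on first run.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    generator = torch.Generator().manual_seed(settings["seed"])
    return pipe(
        prompt,
        width=settings["width"],
        height=settings["height"],
        guidance_scale=settings["guidance_scale"],
        num_inference_steps=settings["num_inference_steps"],
        generator=generator,
    ).images[0]

# generate("A forest at sunrise.", settings)  # returns a PIL image
```

Fixing the seed is what lets you rerun the same prompt and compare the effect of changing a single setting.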
Getting text to appear in your images can be tricky. Print text by adding quotation marks around your desired text in the prompt, and by deliberately writing out the type of text you would like to see appear.
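For instance, a prompt along these lines (our own illustrative example) names both the text and the kind of object carrying it:

Example text prompt:
"A rustic wooden sign hanging over a bakery door, with the words "Fresh Bread Daily" painted in white letters."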
This excellent Medium post explains in fairly simple terms the effect of CFG (Classifier-Free Guidance) on a diffusion model's output. In short: the higher the CFG value, the more strictly the model sticks to the prompt; the lower the value, the more "creative" the model gets with your prompts. For example, a CFG value of 0 means the model will ignore your prompt completely and generate whatever it wants. Both fun and spooky, we say. Another fantastic video recommended for easy viewing: How Diffusers work.
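The arithmetic behind that behavior is simple enough to do by hand: the guided prediction is the unconditional one plus CFG times the difference the prompt makes. A toy sketch with made-up scalar values (real models apply this per pixel to noise predictions at every step):

```python
# Classifier-free guidance on hypothetical scalar "predictions".
def cfg_mix(uncond, cond, guidance_scale):
    # guided = uncond + g * (cond - uncond)
    return uncond + guidance_scale * (cond - uncond)

uncond, cond = 0.2, 1.0  # made-up unconditional / prompt-conditioned values

print(cfg_mix(uncond, cond, 0))    # 0.2 -> prompt ignored entirely
print(cfg_mix(uncond, cond, 1))    # 1.0 -> follows the conditioned prediction
print(cfg_mix(uncond, cond, 7.5))  # 6.2 -> exaggerates the prompt's influence
```

At CFG 0 the prompt term cancels out, which is exactly the "generate whatever it wants" case above; large values push the output further in the prompt's direction than the model predicted on its own.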
This is the number of denoising iterations to run before returning the output. Be careful: going too high here can tip the model over the edge into nonsensical output; too few, and your output won't be as well defined.