Helpful Resources
I’ll add more here as I remember them. Feel free to add more in the comments.
- AUTOMATIC1111’s Stable Diffusion WebUI is the software nearly everybody uses
- As for system requirements, I used it for a while on a GTX 1050 Ti with 4 GB of VRAM. I wouldn’t recommend going any lower than that. An RX 580 with 8 GB of VRAM does wonders at a similar secondhand price point (as long as there’s no crypto hype driving prices up where you are)
- Using https://github.com/pkuliyi2015/multidiffusion-upscaler-for-automatic1111 can provide a really nice speed boost if configured correctly.
- Civitai has a really good selection of models, LoRAs, and other resources
Models
Models are basically the brains of Stable Diffusion. They contain the trained weights SD uses to interpret what your prompts mean.
The built-in models that come with Stable Diffusion are really bad for porn. Don’t use them. In fact, don’t use them at all unless you’re training your own models; there are better SFW models too.
Here are some of my personal favourites:
Anime
- MeinaHentai is a great model to start with. Compared to other models, it’s really easy to prompt
- AOM3 also does really well, though it might be a little more difficult to guide
For both of the above, I recommend installing https://github.com/DominikDoom/a1111-sd-webui-tagcomplete, as they rely heavily on Danbooru tags.
- Berry Mix (Pre-mixed version here) can also work pretty well, depending on what you want to do. AFAIK it uses rule34 tags instead of danbooru, so it probably won’t work all too well with prompts used for the above ones
Realistic
- Uber Realistic Porn Merge is the only realistic model I know of that does hardcore stuff. Its unfortunate problem is that it’s REALLY DAMN HARD TO USE
VAEs
VAEs mostly affect colors, sharpness, and the like. Some models come with a VAE built in, but for ones that don’t, it’s recommended to have one on hand.
- “Anything VAE”, “Orangemix VAE”, and “NAI Leak VAE” are the same exact thing under different names. If you already have one on hand, don’t bother with the others. Most VAEs are renamed versions or modifications of this one.
- Waifu Diffusion’s kl-f8-anime2 is also a pretty good one. It doesn’t require Waifu Diffusion.
- The one that comes with Stable Diffusion is the only one that seems to work for realistic stuff.
LoRAs
LoRAs teach models about concepts (characters, clothing, environments, style, …) they might not know about. There are a LOT of them, so feel free to browse Civitai to find ones you might want.
LoRAs tend to be specific to families of models, or at the very least to styles (using anime LoRAs on realistic models tends to be a bad idea), but there are a fair few that will work across the board.
LoCon and LyCORIS are newer LoRA formats. I’m not sure about the technical differences between them, but they will not work out of the box and need an extension such as https://github.com/KohakuBlueleaf/a1111-sd-webui-lycoris to get working.
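For reference, in AUTOMATIC1111’s WebUI a regular LoRA is activated by adding a `<lora:filename:weight>` tag to the prompt. Here is a minimal Python sketch of building such a prompt. The helper `with_loras` and the LoRA file names are made up for illustration, and the weights are just plausible starting points, not values recommended by any particular LoRA author.

```python
def with_loras(prompt, loras):
    """Append A1111-style <lora:name:weight> tags to a prompt.

    `loras` maps a LoRA file name (without extension) to its weight.
    """
    tags = [f"<lora:{name}:{weight:g}>" for name, weight in loras.items()]
    return ", ".join([prompt.strip()] + tags)


# Hypothetical LoRA file names, for illustration only.
prompt = with_loras(
    "1girl, beach, blonde hair, blue eyes",
    {"exampleCharacter_v1": 0.8, "exampleStyle": 0.5},
)
print(prompt)
```

Lowering the weight is usually the first thing to try when a LoRA overpowers the rest of the prompt.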
Textual Inversions / Embeddings and Hypernetworks
These are mostly obsoleted by LoRAs. There are a few embeddings such as Deep Negative and EasyNegative that are still quite useful, but in most cases you’ll want to use LoRAs instead.
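Embeddings like EasyNegative are triggered by putting their file name in the prompt, usually the negative one. Below is a rough Python sketch of what that looks like as a request body for the WebUI’s optional HTTP API (`/sdapi/v1/txt2img`, which is only available when the WebUI is started with `--api`). The helper name, the prompt text, and the step count are assumptions for illustration, and the embedding token has to match whatever the file is actually called in your `embeddings` folder.

```python
import json


def txt2img_payload(prompt, negative_embeddings, extra_negative=""):
    """Build a request body for A1111's /sdapi/v1/txt2img endpoint.

    Textual-inversion embeddings are referenced by file name, so they are
    simply joined into the negative prompt like ordinary tokens.
    """
    negative_parts = list(negative_embeddings)
    if extra_negative:
        negative_parts.append(extra_negative)
    return {
        "prompt": prompt,
        "negative_prompt": ", ".join(negative_parts),
        "steps": 25,      # assumed value, tune to taste
        "batch_size": 1,
    }


payload = txt2img_payload(
    "portrait, beach, blonde hair, blue eyes",
    ["EasyNegative"],  # assumes the embedding file is named EasyNegative.*
    extra_negative="lowres, bad hands",
)
print(json.dumps(payload, indent=2))
```

In the regular UI the equivalent is simply typing the embedding name into the negative prompt box.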
This is a question that bothers me about any AI image generation, but people here are focused on generating humans, so it feels like a good place to ask, since with humans it hits harder:
Let’s assume I download a recommended setup and I’m a total noob sitting down to generate from a prompt. How many misshapen, badly generated disturbing chaos mutants from beyond reality do I need to see before the AIs return a somewhat satisfying result? Is an “unsuccessful image” just a person with a blurred face or somewhat off fingers, or are we talking about full-blown body horror?
Since no one has answered yet I’ll chime in with my experience using the Stablediffusion thing with all the 111’s in it a few months ago.
It was super easy to use. I would set a prompt like “25 year old woman nude at the beach. Blonde hair blue eyes, thin small perky breasts, shaved pubic area” and usually run 10 batches of 5 images each. It would usually take 5-10 minutes on my 1080 Ti, and I’d probably get 5-10 “good” images, with the rest being trash. Some would have weird extra arms or be snake people with weird torsos. The most common issue I had was nipple and vulva placement, which can be weird sometimes. It’s not uncommon to get extra or no nipples, or extra breasts. Lots of barbie type pubic areas would show up.
I think part of my issue might have been the HD upscaling I was using, as I would see a quick glance of the initial render but by the time it upscaled it went funky.
I honestly just chalked it up to my lack of technical knowledge and possibly bad prompt writing.
However I do feel it was incredibly easy to generate at least some decent stuff for someone that has no coding experience or anything.
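Putting rough numbers on the run described above, here is the arithmetic as a quick Python sketch. The figures are only the ones mentioned in this comment (10 batches of 5 images, 5 to 10 keepers, 5 to 10 minutes per run on a 1080 Ti); the function name is made up, and none of this is a benchmark.

```python
def run_stats(batches, images_per_batch, good_min, good_max, minutes_min, minutes_max):
    """Summarise one generation run: image count, keep rate, time per image."""
    total = batches * images_per_batch
    return {
        "total_images": total,
        # Fraction of images worth keeping (low and high estimate).
        "keep_rate": (good_min / total, good_max / total),
        # Seconds spent per generated image (low and high estimate).
        "seconds_per_image": (minutes_min * 60 / total, minutes_max * 60 / total),
    }


stats = run_stats(10, 5, 5, 10, 5, 10)
print(stats)
```

That works out to 50 images per run, roughly 10 to 20 percent keepers, at about 6 to 12 seconds per image.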
Thanks! The thing which bothers me isn’t difficulty, but rather how much painful, distorted surrealism I need to experience before getting something decent. You say you got 5-10 good images out of 50, but are the rest utter body horror, or just an odd limb or (as you write) awkward details? I should add that I do freak out at the sight of the famous AI fingers when they are a badly tangled spaghetti of flesh.