magn418 (mod) to ChatBotsNSFW · English · 1 year ago

Beginner questions thread

Ask your questions.

magn418 (OP, mod) · 1 year ago
LLMs use a lot of memory, so if you’re doing inference on a GPU you’re going to want one with enough VRAM, like 16GB or 24GB. I heard lots of people like the NVIDIA 3090 Ti because that graphics card can be bought used for a good price for something that has 24GB of VRAM. The 4060 Ti is available with 16GB of VRAM and (I think) is the newest generation. And AFAIK the 4090 is the newest consumer / gaming GPU with 24GB of VRAM. Gaming performance isn’t really the deciding factor, since all the somewhat newer models are fast enough. It’s mostly the amount of VRAM on them that is important for AI. (And pay attention: an NVIDIA card with the same model name can come in variants with different amounts of VRAM.)

I think the 7B / 13B parameter models run fine on a 16GB GPU, especially when quantized. But at around 30B parameters, 16GB isn’t enough anymore. The software will start “offloading” layers to the CPU and it’ll get slow. With a 24GB card you can still load quantized models with that parameter count.

(NVIDIA’s professional equipment dedicated to AI includes cards with 40GB, 48GB or 80GB. But that’s not sold for gaming and is also really expensive.)

Here is a VRAM calculator:

https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
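
If you just want a quick sanity check before opening the calculator, here is a rough back-of-the-envelope sketch in Python. It assumes the weights take about (parameters × bits per weight / 8) bytes and adds a flat allowance for the context cache and buffers. The function names, the 2 GB allowance and the 4.5 bits per weight for a 4-bit quantization are my own assumptions for illustration, not something taken from the calculator. Real usage depends on context length and the backend.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """Rough check: weights plus a flat overhead (assumed) must fit in VRAM."""
    needed_gb = estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed_gb <= vram_gb


if __name__ == "__main__":
    # (parameters in billions, bits per weight); 4.5 stands in for a 4-bit quant
    examples = [(7, 16), (13, 8), (13, 4.5), (30, 4.5)]
    for params, bits in examples:
        size = estimate_weights_gb(params, bits)
        for vram in (16, 24):
            verdict = "fits" if fits_in_vram(params, bits, vram) else "too big"
            print(f"{params}B @ {bits} bit: ~{size:.1f} GB weights, {vram} GB card: {verdict}")
```

Treat the result as a first guess only. The calculator linked above takes more details into account.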

You can also buy an AMD graphics card in that range. But most of the machine learning stuff is designed around NVIDIA and their CUDA toolkit. So with AMD’s ROCm you’ll have to do some extra work and it’s probably not that smooth to get everything running. And there are fewer tutorials and people around with that setup. But NVIDIA sometimes is a pain on Linux. If that’s of concern, have a look at ROCm and AMD before blindly buying NVIDIA.

With some video cards you can also put more than one into a computer and combine them, which gives you more total VRAM to run larger models.

The CPU doesn’t really matter too much in those scenarios, since the computation is done on the graphics card. But if you also want to do gaming on the machine, you should consider getting a proper CPU for that. And you want at least as much system RAM as you have VRAM. So probably 32GB. But RAM is cheap anyway.

The Apple M2 and M3 are also liked by the llama.cpp community for their excellent speed, so you could also get a MacBook or iMac. But buy one with enough RAM, 32GB or more, because on those machines the GPU shares the system RAM instead of having separate VRAM.

It all depends on what you want to do with it, what size of models you want to run, and how much you’re willing to quantize them. And your budget.

If you’re new to the hobby, I’d recommend trying it first. For example KoboldCpp and text-generation-webui with the llama.cpp backend (and a few others) can do inference on the CPU (or on the CPU with part of the model on the GPU). You can load a model on your current PC with that and see if you like it. Get a feeling for what kind of models you prefer and what size. It won’t be very fast, but it’ll do. Lots of people try chatbots and don’t really like them. Or it’s too complicated for them to set up. Or you’re like me and figure out you don’t mind waiting a bit for the response and your current PC is still somewhat fine.
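
When a model doesn’t fully fit on the GPU, llama.cpp lets you choose how many layers go to the GPU (its `--n-gpu-layers` / `-ngl` option) and runs the rest on the CPU. Here is a rough sketch in Python for picking a starting value. It assumes all layers are about the same size, which is only approximately true. The helper name, the 1.5 GB reserve and the example numbers are made up for illustration.

```python
def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 1.5) -> int:
    """Estimate how many layers fit on the GPU.

    Assumes equally sized layers and keeps `reserve_gb` (an assumed value)
    free for the context cache and other buffers.
    """
    if model_gb <= 0 or n_layers <= 0:
        raise ValueError("model_gb and n_layers must be positive")
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # Never report more layers than the model actually has
    return min(n_layers, int(usable_gb // per_layer_gb))


if __name__ == "__main__":
    # Hypothetical 16 GB model file with 64 layers on an 8 GB card
    print(layers_on_gpu(model_gb=16, n_layers=64, vram_gb=8))  # prints 26
```

Start with that number, watch the VRAM usage, and go lower if loading fails or higher if there is still room.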
- raffa · 1 year ago
  Thanks for the detailed reply! I need to upgrade my hardware this year anyway (it is 6+ years old).


