this post was submitted on 03 Jun 2024
32 points (86.4% liked)

Ask Lemmy

26831 readers
1435 users here now

A Fediverse community for open-ended, thought provoking questions

Please don't post about US Politics. If you need to do this, try [email protected]


Rules: (interactive)


1) Be nice and; have funDoxxing, trolling, sealioning, racism, and toxicity are not welcomed in AskLemmy. Remember what your mother said: if you can't say something nice, don't say anything at all. In addition, the site-wide Lemmy.world terms of service also apply here. Please familiarize yourself with them


2) All posts must end with a '?'This is sort of like Jeopardy. Please phrase all post titles in the form of a proper question ending with ?


3) No spamPlease do not flood the community with nonsense. Actual suspected spammers will be banned on site. No astroturfing.


4) NSFW is okay, within reasonJust remember to tag posts with either a content warning or a [NSFW] tag. Overtly sexual posts are not allowed, please direct them to either [email protected] or [email protected]. NSFW comments should be restricted to posts tagged [NSFW].


5) This is not a support community.
It is not a place for 'how do I?', type questions. If you have any questions regarding the site itself or would like to report a community, please direct them to Lemmy.world Support or email [email protected]. For other questions check our partnered communities list, or use the search function.


Reminder: The terms of service apply here too.

Partnered Communities:

Tech Support

No Stupid Questions

You Should Know

Reddit

Jokes

Ask Ouija


Logo design credit goes to: tubbadu


founded 1 year ago
MODERATORS
 

There are so many out there, with varying benefits, risks, and ethics. I'd like to know what to recommend when asked, and also what I could use for myself.

Some areas that could be good for discussion:

  • Locally hosted models (both for low and high powered devices)
  • Open source models (also calling out "open source" models that aren't actually open source)
  • Privacy friendly tools/frontends (ex. DuckDuckGo's AI chat for anonymous use of some "free" models)
  • Unified interfaces for multiple models, or 'pay as you go' platforms instead of paying for individual subscriptions
top 11 comments
sorted by: hot top controversial new old
[–] [email protected] 22 points 5 months ago

I run my models on my own hardware. In general, the larger quantized models run better when raw. They are more intuitive and approachable. Almost everything people complain about with AI is because they do not understand how it works in practice. There are many layers of function and capability beneath the surface. If all you use are the small models, like anything under around a 30B, you're likely to find it hard to use. At these sizes the model lacks the comprehension to self diagnose many problems. These models tend to have multiple potential error sources that can occur at the same time. So that can be really frustrating too. If you understand most of the ways models respond in error, it becomes much easier to address issues with the smaller models. The smaller models can be useful and quite capable with specialized training.

Think of it like this: the AI has a small available window of attention it can operate within. (There are multiple spaces where "Attention" has meanings that are different.) That window can view a small part of the surface of information available. You can move the window around to view any section on the surface relatively easy by using a basic prompt with good instructions. However, that is nowhere near what the model really knows. You need to build momentum in the space you're interested in accessing within the model. This is only one of several factors. You also need to know how to talk to a model. This is very different than humans. For instance, my casual grammar and style in the last sentence is useless with AI. I must use proper nouns and think out what I am trying to say differently. Personally I have other methods where I establish who I am, my knowledge and expectations, and then I ask a series of leading questions where I know the answers and can let the AI build the prompt dialogue momentum for me. Then I can ask much deeper questions and get good/useful answers.

The momentum factor is one of the largest differences between the bigger and smaller models. With the larger (30B+) models it is not very hard to build momentum in a space and get deeper into useful territory. With the smaller models you're kinda stuck in the stupid entry level zone at first. It is like a dense underbrush at the edge of a forest where you're in need of a brake to find your footpath to where you want to go. If you have the experience to spot the issues, you can walk right through that dense tangle and find the other side with only minor annoyances and a few thorns. If you want to use the small model for something specific, you can train it yourself on some little niche and this will be like a bridge over the dense thicket and get you into a useful space relative to the training.

By contrast, the larger models have brakes and footpaths all across the edge of the forest. It is still easy to get lost, or on some kind of dead end, but the forest itself is far more self aware and, if asked well, it will be able to help you find your way more effectively and with far less momentum required to get you there.

If the tool you're using does not give you absolute and full control of everything the AI has in the prompt, you're already in trouble. Your past prompts may be fed back into the model with each query. This I'd great for the stalkerware company trying to data mine, as it creates a better and more detailed profile of who you are as a person. However any unrelated information passed to the model at the same time ruins your momentum within the underlying tensor tables of the model.

I use Oobabooga Textgen a lot (GitHub), and with the notebook tab interface. That is more like a text editor where I see everything in the entire prompt. I also have my own Python code that adds features to this interface.

Models come from huggingface.co. I often use a Mixtral 8×7B or a Llama 70B on the large models side. I also use the newer Llama3 8B on the smaller side. The 8×7B is much quicker than the 70B and it is nearly as accurate. However, it lacks some of the advanced self awareness aspects and displays some issues that are common to smaller models.

I'm extremely intuitive and function in abstract thought most of the time. My view of the world is largely that of relativism. I find accuracy to be subjective in all spaces and "facts" as foolish idealism in an absolute sense. I view everything models say as a casual water cooler conversation with an interesting non expert. Nothing said is a primary citation worthy source, but neither is anything said here, yet here we are.

With Oobabooga Textgen, there is a chatGPT compatible API. If you launch Oobabooga from the command line, it only takes adding the "--listen" flag to make Oobabooga available on your home network, and/or "--api" to make the chatGPT API available as well. This works with most third party tools that connect to chatGPT, or so I've read.

If you want to get into more technically capable setups, you need a RAG for document reference look up and retrieval. A couple of RAG options are Ollama and privateGPT, or if you want a basic code interface for Python, langchain and chroma db.

[–] [email protected] 9 points 5 months ago (1 children)

I pay $20 for ChatGPT and it’s money well-spent

[–] [email protected] 1 points 5 months ago* (last edited 5 months ago) (1 children)

Depending on the features and intensity you use, you may save lots of money by self hosting openwebui and connecting to openai via their API.

You can also host openwebui directly on your local PC no need for a server.

Pay per token is much cheaper for me.

[–] [email protected] 1 points 5 months ago

Nope, I’m sticking with money well spent thanks

[–] [email protected] 7 points 5 months ago

I absolutely love Suno, and having it generate songs for me. I’ve addd several of its songs to my actual playlists while I’m driving, and they have become staple memes and love songs between myself and my partner. We even named her crocheted plushies after songs we made about them in Suno, and it named the plushies for us.

[–] [email protected] 6 points 5 months ago* (last edited 5 months ago) (1 children)

I didn't want to promote any paid tool in the post. I'm currently paying for one of the more well known models, but I want to shop around. My plan right now is to find a pay as you go tool, so that I can use different models depending on what task each one is good at.

If I'm going to pay for something, I want to vote with my wallet and go with the option that's better for consumers

[–] [email protected] 4 points 5 months ago

Don't pay for software, pay for hardware which you can run good local models on.

[–] [email protected] 5 points 5 months ago (1 children)

One unified interface I've been enjoying is Stability Matrix, which lets you conveniently switch between different image generation frontends without having duplicates of the model files.

Text generation: LM Studio seems the most user friendly IMO

Voice transcription: I've been using AllTalk TTS, was a little frustrating to set up but works

[–] [email protected] 1 points 5 months ago

I love lm studios interface and how you can search for models but they fucked up their code somehow and it's about twice as long as any other option when it comes to actually generating the text.

[–] [email protected] 2 points 5 months ago

I’ve never used one, though a mate uses chat gtp constantly so I make him ask it things fairly often. However, I’ve just bookmarked DDG, that seems useful. If there’s a similarly private voice assistant for iOS (that works better than Siri) then I’d probably use that in preference to a traditional search engine a lot of the time

[–] [email protected] 1 points 5 months ago

I don’t know if this fits, but the small company I work for as a software developer used CoPilot in Visual Studio Pro and it’s incredible.

Like sure it’s not doing my job for me, but it saves so much time. Plus it will pick up our coding standards and is great for repetitive code blocks and just rubber ducking.