Oh and I typically get 16-20 tok/s running a 32b model on Ollama using Open WebUI. Also I have experienced issues with 4-bit quantization for the K/V cache on some models myself so just FYI
It really depends on how you quantize the model and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) on my 3090’s 24 GB of VRAM with a reasonable context size. If you’re going to need a much larger context window to input large documents etc., then you’d need to go smaller with the model size (14b, 27b etc.), get a multi-GPU setup, or get something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
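If you want a rough feel for the arithmetic behind a calculator like that, here is a minimal back-of-the-envelope sketch. It is not how the linked estimator works internally (I haven't checked); it just uses the usual approximations: weights take about params × bits-per-weight / 8 bytes, and the K/V cache takes about 2 × layers × KV heads × head dim × context length × bytes per element. The layer count, head count, head dimension and the 4.5 bits-per-weight figure below are illustrative assumptions, not the specs of any particular model or quant, and real usage adds runtime overhead on top.

```python
GIB = 1024 ** 3


def weights_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: float) -> float:
    """Approximate K/V cache size in bytes.

    The leading 2 accounts for storing both keys and values.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def estimate_gib(n_params: float, bits_per_weight: float, n_layers: int,
                 n_kv_heads: int, head_dim: int, context_len: int,
                 bytes_per_elem: float) -> float:
    """Weights plus K/V cache in GiB, ignoring runtime overhead."""
    total = weights_bytes(n_params, bits_per_weight) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem
    )
    return total / GIB


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not taken from a real model card):
    # a 32b model at ~4.5 bits per weight with an f16 (2-byte) K/V cache.
    for ctx in (8192, 32768):
        gib = estimate_gib(
            n_params=32e9, bits_per_weight=4.5,
            n_layers=64, n_kv_heads=8, head_dim=128,
            context_len=ctx, bytes_per_elem=2,
        )
        print(f"context {ctx}: ~{gib:.1f} GiB before overhead")
```

The useful takeaway is that the weights are a fixed cost while the K/V cache grows linearly with context length, which is why a long context pushes you toward a smaller model, a quantized cache, or more memory.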
Is it possible to use StreetComplete on iOS?
I think we can all agree that modifications to these models which remove censorship and propaganda on behalf of one particular country or party are valuable for the sake of accuracy and impartiality. But reading some of the example responses for the new model, I honestly find myself wondering if they haven’t gone a bit further than that by replacing some of the old non-responses and positive portrayals of China and the CPC with a highly critical perspective typical of western governments which are hostile to China (in particular the US). Even the name of the model certainly doesn’t make it sound like neutrality and accuracy are their primary aim here.
I used to daily drive Ubuntu some years ago for work/personal use but have been back on Win 10 primarily for the last 4-5 years. I was considering trying to go back due to how much Windows sucks (despite some proprietary software only being available on it), but remembering the trouble I had with some networking/printer drivers and troubleshooting those issues, and then seeing this article, is definitely making me reconsider…
Okay, can I hijack this thread to ask a question of some fellow coffee enthusiasts: Do decaf beans actually tend to suck? I would be interested in a decaf or half-caff blend but am curious what the connoisseurs think… and sorry, no, I haven’t searched for a post on this in the community, so feel free to downvote the crap out of me / take mercy on me and link one if it exists…
Not exactly what I was thinking of but still worth a sub!
TBH This might be a good enough idea to merit a whole community of people just posting singular cool screenshots of games they are playing. Could be a cool low-effort visual way to document what folks are into at the moment. Kind of like a visual version of those ‘what are you playing’ weekly threads that used to be everywhere.
Been having fun and happy to see a gameplay balance patch so soon. That said, the technical side of this game is what really needs work, and according to everything I’ve been seeing from Digital Foundry and others there’s some serious low-hanging fruit that could improve the frame rate and frame pacing, which are still pretty poor on all systems. Hopefully they bring some attention to that side of things soon. The game is certainly playable in its current state in my opinion, but it would be much more enjoyable if it actually stuck to something close to 60 FPS in most situations on XSX/PS5.
It looks like it will support DLSS3 (frame generation) so if you have a 40 series card (or almost any card with a bit of time to install that DLSS to FSR3 mod) that should be a nice boost.
Wow… didn’t see that coming… /s
AMD only and not Nvidia? That’s what I was seeing based on a quick search. Unfortunately, I don’t have an AMD GPU.
This is impressive and interesting, but what about hardware ray tracing support? Proton has been very impressive but I thought that RT on DX12 was basically non-existent on Linux.
So my grandchildren will more than likely be belters. Got it.
Here’s a non Google AMP link to the article: https://www.theregister.com/2023/07/01/chiplet_market/
Looks like it now has Docling Content Extraction Support for RAG. Has anyone used Docling much?