language models like chatgpt are specifically trained to excel at these tasks:
- summarization
- extraction
- and classification
all of these can usually be handled by small language models, even ones running locally on your laptop at reasonable speeds, no gpu required, especially with today’s efficient “mixture of experts” (MoE) models.
now HOLD UP. these are already some incredible capabilities! with classification alone, we can do large-scale data analysis. when combining this classification ability with text extraction (the ability to cite specific parts of a given text corpus), we can already create all sorts of cool things!
let’s imagine an LM powered search engine feature.
in google’s case, they added their “AI overview” at the top of the search results, which
- displays a lengthy LM summary of the top search results
- pushes actual results off the screen
- sometimes contains info which wasn’t in the source text
now this is what i call a boring and annoying application of LMs.
let’s design a search engine feature which people might intentionally use:
- a Bubble Relevant 🫧 button.
what does the bubble relevant button do? well, it:
- scans through the first page of search results
- extracts sentences and quotes which fit the search query
- checks these quotes against the source text (so no quotes are made up)
- reranks the extracted quotes by how relevant they are to the search query
- presents the quotes to the user with their original sources
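the quote-checking and reranking steps above don’t even need an LM. here’s a minimal sketch in python of how they could work — `verify_quotes` and `rerank` are hypothetical names, and the word-overlap score is a crude stand-in for whatever a real system (an embedding model, or the LM itself) would use for relevance:

```python
def verify_quotes(quotes, source_text):
    """keep only quotes that appear verbatim in the source,
    so the LM cannot present a made-up citation."""
    return [q for q in quotes if q in source_text]

def rerank(quotes, query):
    """crude stand-in for relevance ranking: score each quote by
    how many query words it contains. a real system would likely
    use an embedding model or the LM itself here."""
    query_words = set(query.lower().split())
    def score(quote):
        return len(query_words & set(quote.lower().split()))
    return sorted(quotes, key=score, reverse=True)

# toy example: one real quote, one hallucinated one
source = "the moon has no atmosphere. tides on earth are caused by the moon."
candidates = [
    "tides on earth are caused by the moon.",  # appears verbatim in source
    "the moon is made of cheese.",             # made up, gets filtered out
]
verified = verify_quotes(candidates, source)
ranked = rerank(verified, "what causes tides")
```

the nice part: even if the LM hallucinates, the verbatim check throws the fabricated quote away before the user ever sees it.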
here, LM outputs become pointers to actual real sources. how about that?
- worst case: the suggested quotes are useless to you
- best case: the LM finds a quote which fits the exact thing you wanted to look up
i might use that feature sometimes, maybe. especially since the environmental burden isn’t as insanely high, thanks to the drastically smaller LMs used here (roughly 50 to 200 times smaller).
this isn’t a feature worth investing billions into, but maybe one which has me pick one search engine over another.
just in case anyone is wondering: hellooo world, this text was written by a real life human. you might not want to believe it, but markdown highlighting can even be used by people-
some stuff i maybe should mention
the “ai” hype has made LMs very polarizing.
training today’s largest LMs (also called “LLMs”) costs a lot of money and causes a lot of CO2 emissions. features like google’s “ai overview” (especially when it cites hilariously bad sources), the existence of elon’s mechahitler, and the fact that all LMs are trained unethically make LMs appear useless and dangerous.
others see potential in LMs, pointing to things like the AlphaEvolve system finding more efficient ways to multiply matrices, and the Cell 2 Sentences Scale Gemma 27B model finding genuinely promising ways to treat certain types of cancer (both of which also stem from alphabet). they hope for more such things down the line.
both sides have genuine arguments here; there really is no clear winner, at least from my point of view.
it makes me not want to ignore that entire discussion, but rather ignore what the “ai bros” are yelling and test things for myself with local models, to see if these models are useful to me. i’m not sure about that yet.
are language models useful to you?
wanna test local models for yourself?
try installing the free and open source server ollama. it’s available for mac, gnu/linux and windows.
it lets you download and run open-weight language models on your own computer, essentially letting you host your own “OpenAI API” endpoint, alongside a CLI.
after installation, you can download models with `ollama pull model-name` and test them in the CLI with `ollama run model-name`. for CPU-only computers, i recommend trying the model granite4:tiny-h, and for people with GPUs with over 6GB of VRAM, try qwen3:4b-instruct.
you can browse the full catalog of models on their model library
EDIT: added image - cuz for some reason setting a “thumbnail url” doesn’t actually display that image as a thumbnail
EDIT 2: you know what? - no. i won’t. i’m removing that image again. it was just some chatgpt image. how boring. this is a mediocre rambling about LMs in search engines, it’s already spicy enough. i put a google logo instead.

