Open-Source AI. How ‘open’ became a four-letter word

Let’s start with a universal law:

There’s no such thing as a free lunch.

Unless you are an AI model, in which case some of the most expensive algorithms ever built are being tossed out into the world by the likes of Meta.

Take LLaMA, for example. For the non-initiated (please stay that way – you’ll sleep better), that’s Meta’s multimillion-dollar “open”-source model, freely available to anyone with a GPU (that’s like a CPU, but with a “G”) and a dream. It makes you wonder whether Mark Zuckerberg is angling for a sainthood badge. But he ain’t really committed, you know. Because LLaMA is not really open – it’s bait!

LLaMA is not truly open-source. That’s just for show. Meta gives you the model weights but not the full training data or methodology, and its license restricts commercial use – unless – you get Mark’s seal of approval.

The thing is that models derived from LLaMA remain dependent on its architecture. The strategy is to drive widespread adoption while keeping control over your AI infrastructure.

This gives Meta influence over the AI ecosystem without direct monopolization.

You see what I’m getting at?

Meta gains regulatory favor by positioning LLaMA as open. That weakens closed-source competitors like OpenAI (in theory), and it secures long-term monetization through cloud services, APIs, and of course (enterprise) licensing.

And then there’s the other one.

OpenAI – you know, the one that has the word “open” in its name – has taken a vow of secrecy.


More rants after the commercial break:

  1. Comment, or share the article; that will really help spread the word 🙌
  2. Connect with me on LinkedIn 🙏
  3. Subscribe to TechTonic Shifts to get your daily dose of tech 📰


Let me explain how their name became “OpenAI”. . . .

OpenAI was founded in 2015 as a nonprofit. At the time they had this lofty goal of making AI research open and accessible to prevent monopolization by major tech firms.

The irony of it.

And of course, the name “OpenAI” reflected its commitment to transparency. And sure enough, the early models I had the pleasure of playing with, like GPT-1, -2 were released openly.

Their belief at the time was that, through collaboration between researchers and public access to the research, they could make sure the development of AI remained “ethical and beneficial for humanity”.

Hahahahahaha. . . .

Sorry for that, people.

But reading this in hindsight just makes me want to laugh.

In 2024 their entire ethics and AI guardrails unit was dissolved. How about that for Ethical and Beneficial for humanity 🙂

So, their AI models grew, and somewhere along the way OpenAI reversed its stance. It may have been Elon Musk intervening*, or funding constraints (Bill Gates et al.), but the reason they put out was fear of misuse and safety concerns. By the time GPT-3 launched in 2020, the organization had fully transitioned to a closed model and was no longer releasing model weights or training data.

This shift was largely driven by funding constraints.

As you know, advanced AI research needs a lot of computational resources, and OpenAI needed huge piles of capital to keep pace with competitors like Google and Meta. So, in 2019, it created OpenAI LP, a for-profit subsidiary under its nonprofit umbrella – a “capped-profit” construction that allowed it to raise investment while claiming the profit motive was contained.

Microsoft then came along and invested $1 billion initially, later increased to $13 billion, and in return secured exclusive access to OpenAI’s technology.

And that final move basically ended every remaining pretense of openness.

Now, OpenAI is moving even further toward commercialization, and Sam has suggested it may transition into a fully for-profit entity. So, the name “OpenAI” remains, but its open-source ideals have long been abandoned.

*For more soap-opera on Scam, Skum and Zucky, read: Welcome to the soap opera called “Open”-AI, starring Sam, Elon, and Mark, with heckling by Statler and Waldorf.


What the hell does “Open Source” even mean?

Once upon a time, in the golden age of computing, software was a communal lovefest where nerds shared their code like it was a potluck dinner. Back then, “open source” wasn’t a radical concept—it was just how things were done.

Then capitalism kicked in, and suddenly, software was a product, not a shared good. Copyright laws, licensing fees, and proprietary greed swept in like a tech-bro tsunami, drowning out the kumbaya vibes of early coding communities.

Still, a group of cyber-hippies held the line, clutching onto the sacred belief that knowledge should be free. And from this defiance, we got Linux, the beloved operating system that runs everything from your smart fridge to half the internet.

But let’s not kid ourselves. “Open source” was never about charity—it was about power. Because whoever controls the infrastructure controls the future.

So now, the real question is: Does the spirit of open source still exist in AI, or has it been chewed up and spit out by the machine-learning gold rush?


OpenAI’s ride from idealism to corporate sellout

Once upon a time. . . . OpenAI was the underdog. They were scrappy, scruffy researchers full of righteous fury. Back in 2015, Google was the big honcho of AI, having swallowed DeepMind – the brains behind AlphaGo – the year before.

Sam and Elon were together at that time, and they were dreaming of an AI utopia. These two strange bedfellows decided to team up, and launched OpenAI with a grand, almost laughably noble mission:

Make sure AI benefits all of humanity.

Musk bankrolled the whole thing, if you weren’t aware of it. Altman ran the show, and the AI world braced for impact. Well, at least a few did, because AI wasn’t a thing back then.

Oh, and let’s not forget Ilya Sutskever, then OpenAI’s resident mad scientist – he now runs the show at his own company, which is trying to protect the world against rogue AIs. The man was a protégé of Geoffrey Hinton, of whom you might have heard before (he is called the “Godfather” of AI, minus the killings, and made significant contributions to the foundations of LLMs).

Nerd alert:

🚨 Sutskever is a brilliant lad. He once decided to write an entire programming language from scratch. Just because he felt like it. 🚨

With a crew like this in the line-up, the then-“Open”AI was in the race for artificial general intelligence AND gunning for the holy grail – superintelligence. . .

Then came The Paper That Changed Everything™:

It was called: Attention Is All You Need. (A paper that @DreesMarc never read, because he has an attention deficit issue.)

While most of the AI world was still rubbing its eyes, Ilya saw the future: Transformers.

Nerd alert:

🚨 When I read this simple word: “Transformers”, I immediately get the sounds in my head that go with it, don’t you? 🚨

The result was GPT-1, then GPT-2 – models so eerily good at mimicking human text that OpenAI freaked out and initially refused to release the latter. It was a game-changer. But it was also a warning shot – and a lot of input for me to write about.

GPT-1 and -2 were open source: anyone could download them, analyze the architecture, and fine-tune them. OpenAI even released the model weights, the training code, AND the research papers. I loved them for it – they were the heroes back in the day, because it allowed me to fully understand how these models worked.

GPT-2, in particular, was a big deal. OpenAI hesitated to release it, fearing it could be misused for generating fake news and automated disinformation. But after months of chit-chat, they published the full model, and suddenly 🎉 KAPOW 🎉 anyone with a decent GPU could run their own state-of-the-art text generator.

I played with it, dissected it, tweaked it, saw exactly what made it tick.

Then came GPT-3, and the doors slammed shut. No model weights, no training data, no detailed methodology. OpenAI had flipped the script. Yup. They made the shift from open research to a paywalled API, and only those who could afford access (or partner with Microsoft) could use it.

This was the turning point where OpenAI went from open-source AI lab to Silicon Valley corporate giant, hoarding its breakthroughs behind closed doors while still branding itself as Open.


When Musk bailed, OpenAI sold its soul

In the beginning, OpenAI was like a kid with a powerful new gadget – you know, the stuff us geeks do. They were training AI to beat video games and running fun little experiments.

Then Musk had a classic Musk moment.

He decided he wanted to take over when OpenAI’s research started to make waves. A typical Muskian move. His grand idea was to absorb OpenAI into Tesla and SpaceX.

Surprise, surprise. This didn’t go over well.

So Musk rage-quit in 2018, yanked his funding, and left OpenAI begging for cash. That’s when Sam Altman did what all good Silicon Valley disruptors do: he sold out in the most creative way possible.

The nonprofit OpenAI split itself in two: Altman created a for-profit arm that was controlled by the original nonprofit. It was like a corporate Russian nesting doll – except the outer shell was altruism and the inner core was money.

And it worked.

Microsoft waltzed in.

They waved a $1 billion check, then followed up with a cool $13 billion more. With all this lovely corporate cash, OpenAI rolled out GPT-3, then ChatGPT, then GPT-4, and the rest is history.

But there was a catch to all this fun: no more open-source models. No weights, no architecture, no transparency. OpenAI had gone full black box.

And then, in 2024, Musk being himself did the most warped thing imaginable. He sued OpenAI for not being open enough. He demanded his money back. Cried about the betrayal of the open-source dream.

Was this about ethics?

Or was it because OpenAI was making billions while Musk was stuck playing catch-up?

You decide.

Hahahahaha…

I think we all know the answer. He saw the billions loom on the horizon and wanted to be part of that. And because he wasn’t part of their success anymore, he started whining like a kid whose lollipop got stolen by bullies at school.

I just hate hypocrites.

I think we all do.

Musk, the guy who once wanted OpenAI to keep its most powerful tech secret was now the world’s loudest open-source evangelist.

Why?

Because he was busy cooking up his own AI empire with xAI, designed to take on OpenAI head-on. But that never worked. xAI’s Grok is a sorry-ass excuse for an AI assistant.

Have you ever played with it yourself?

Thought so!


xAI and Meta. The new pretend “Open”-Source champions

So, what did Musk do after his tantrum? He built Grok, the AI chatbot he designed to be the “anti-woke” ChatGPT. Alongside that nonsense, xAI released its 314-billion-parameter Grok-1 model and secured a monstrous $6 billion in funding.

And yes, xAI claims to be open source.

Except… it’s not.

The source code is technically available, but there is a distinct lack of transparency, useful tooling, and small-scale models. In other words, it is open-source in the same way a locked treasure chest is “accessible” if you just happen to have the key.

Kinda like… when Meta waltzed onto the scene.

And Zucky was being bullied by his shareholders over his Metaverse hobby, which was guzzling up billions after billions while no mortal cared about his limp avatars.

So he was looking for a new hype to keep the hornets at bay.

And for a time, he looked like the last great hope for AI open-source purity.

To be a little fair to them, Meta’s track record with open source is actually quite solid. Just look at PyTorch (yes, that’s from Meta). And in 2023, when they launched LLaMA (with up to 65B parameters), it was a free buffet. The internet went nuts, including moi, and I saw thousands of derivative models spawning in what amounted to an AI gold rush.

But here’s the fine print: Zucky isn’t doing this out of the kindness of his robotic heart.

See, Meta learned a thing or two from its Open Compute Project (its open-sourced designs for data centers, hardware, etc., to make infrastructure more cost-effective). If you control the infrastructure, you control the future. The strategy is simple: by flooding the market with LLaMA-based models, Meta gets to shape the AI ecosystem while still profiting off the chaos.

Investors love it. Since pivoting from the metaverse to AI, Meta’s stock has doubled.

And, of course, LLaMA’s licensing darn well ensures that no other Big Tech firm can use the model to train a competing AI. Let’s be real: it’s business first, open source second.


DeepSeek. The near-open source

By now, every noob and their mother has heard of DeepSeek – particularly DeepSeek-R1, which is in the same league as OpenAI’s latest toys. This company is different from the others, because it has thrown the public a quite generous bone: releasing the model weights and publishing a technical report about its training methodology. That means researchers and developers can poke around, fine-tune the model, build on it, rebuild it, and whatnot.

That is…as long as they don’t ask too many questions.

But there’s a catch. DeepSeek pats itself on the back for its “open-source” approach – and fair enough, some of that is earned. But critical parts of the training process remain locked away. The exact datasets used to train the model? Not disclosed. The full implementation details of how the model was trained? Nope. If you were hoping to fully reproduce DeepSeek’s AI from scratch, you’d better start making wild guesses or hope a rogue engineer leaks something useful.

The thing, of course, is why this is happening. DeepSeek didn’t build its AI empire by playing nice, although they would like to have us think otherwise. . .

For starters, they cracked open OpenAI’s latest model, siphoned intel through model distillation, and likely took a deep dive into scraped datasets that would make GDPR regulators weep. And what about their creative workarounds for NVIDIA’s GPU restrictions? They created algorithms that let their model be trained on hardware that shouldn’t even be accessible.
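Model distillation itself, for the record, is a mundane and well-documented technique: a small “student” model is trained to match a big “teacher” model’s output distribution, not just its top answers. Here is a toy numpy sketch of the core loss – this is the textbook idea only, and has nothing to do with DeepSeek’s actual pipeline, which is exactly what they don’t publish:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax: higher T softens the distribution,
    # exposing the teacher's "dark knowledge" about near-miss classes.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL divergence between the teacher's softened output distribution
    # and the student's: zero when the student mimics the teacher exactly.
    p = softmax(teacher_logits, T)  # teacher targets
    q = softmax(student_logits, T)  # student predictions
    return float(np.sum(p * (np.log(p) - np.log(q))))

teacher = [4.0, 1.0, 0.2]
matching = distillation_loss(teacher, [4.0, 1.0, 0.2])     # perfect mimic
disagreeing = distillation_loss(teacher, [0.2, 1.0, 4.0])  # reversed ranking
```

A perfectly matching student incurs zero loss; the more the student’s distribution diverges from the teacher’s, the larger the penalty – which is all “siphoning intel” means in practice: query the teacher, train on its answers.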

Simply put, they are keeping their full training pipeline under wraps because it’s built on a patchwork of competitive intelligence, regulatory gray areas, and other hacks that would raise eyebrows in half the world’s AI labs.

If they published every detail, they would be writing their own indictment.

And yet, this selective openness is precisely what allows them to operate in the shadows while still pretending to be the “open-source hero”. They release enough to attract developers, rally an ecosystem, and gain regulatory goodwill, but not so much that anyone could build a real competitor without pulling the same questionable moves they did.

So no, DeepSeek isn’t an open-source revolution. It is just another masterclass in controlled transparency.

But this selective transparency hasn’t stopped the AI community from jumping in. Developers have already started reverse-engineering parts of DeepSeek’s model, and platforms like Hugging Face are working on open-source replications. DeepSeek has still managed to attract attention by positioning itself as the “open” alternative to the increasingly walled-off AI giants. That said, anyone confusing “open weights” with true open-source access is falling for corporate PR gymnastics.


Who and what is really Open Source in AI?

The term “open source” in AI is a minefield, if you didn’t know that already. Most companies love to pretend they are open, but few actually meet the OSI (Open Source Initiative) definition, which requires fully available code, training data, and reproducible results.

Who is actually Open Source?

  • Hugging Face. That is the closest thing to a real open-source champion. Hugging Face provides open-source model repositories, training frameworks (like Transformers and Diffusers), and actively funds community-led replications of major models.
  • Mistral AI. The only real European answer to LLaMA, ChatGPT, DeepSeek, Qwen… need I go on? It’s a French AI company that has fully open-sourced several of its models, including Mistral 7B and Mixtral. Both the model weights and the architecture are available. But to be honest, it’s like the not-so-smart kid in class. A bit like Grok (which has the name to match its IQ)
  • Falcon (from the Technology Innovation Institute). They released Falcon-40B. And that is one of the most powerful fully open-source LLMs with no major restrictions (you can download it on Hugging Face)
  • Meta (kind of…). The LLaMA models are open-weight, which means you can use them, but they are not fully open-source, since the training data and methodology are hidden. More open than OpenAI, but not by much.
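The spectrum above can be captured in a toy rubric. To be clear: this is my own scoring scheme for illustration, not an official OSI metric, and the per-model scores are just my reading of the releases discussed here:

```python
from dataclasses import dataclass

@dataclass
class ModelRelease:
    """Toy rubric for the open-washing spectrum -- an illustrative
    scoring scheme, not any official OSI measure."""
    weights: bool        # can you download the parameters?
    code: bool           # training/inference code published?
    data: bool           # training data disclosed?
    methodology: bool    # full training recipe documented?
    no_use_limits: bool  # license free of commercial restrictions?

    def openness(self) -> float:
        # Fraction of criteria met: 1.0 = truly open, 0.0 = full black box.
        checks = [self.weights, self.code, self.data,
                  self.methodology, self.no_use_limits]
        return sum(checks) / len(checks)

# Rough, debatable scores for the players in this piece:
gpt4 = ModelRelease(False, False, False, False, False)   # full black box
llama = ModelRelease(True, False, False, False, False)   # open-weight only
falcon = ModelRelease(True, True, True, True, True)      # actually open
```

Run it and LLaMA lands near the bottom of the scale: “open-weight” buys exactly one of the five boxes.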

So, is Face Hugger really, really open?

Yes, Hugging Face is one of the most open AI platforms, but with caveats. They:

  • Provide open-source AI tools (Transformers, Diffusers, Tokenizers).
  • Host community-built open models (Mistral, Falcon, LLaMA).
  • Fund open-source replications of powerful AI models.
  • Still rely on closed models for partnerships (e.g., OpenAI, Google).
  • Cannot control how open the models they host really are.


The harsh reality

No one at the cutting edge is truly 100% open-source. Even Hugging Face is playing within the boundaries that are set by corporations, but they come close. The AI world has largely moved to “open-weight” models instead of fully open-source, and that means that we can use them, but we are never given full control over how they were built.

What is the verdict?

Open-source vs. closed-source isn’t a battle of good vs. evil. Personally, I think the future of AI’s growth (in intelligence) lies in open source, because I firmly believe in the MoE approach (Mixture-of-Experts models), where for each kind of problem an expert sub-model handles the request.
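For the curious: the MoE idea is simple enough to sketch in a few lines of numpy. A gate scores the experts for each input, only the top-k actually run, and their outputs are blended by the gate weights. This is a toy illustration of the routing mechanism, not any production router, and the “experts” here are just random linear maps:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

class ToyMoE:
    """Toy Mixture-of-Experts layer: a gate scores every expert for the
    input, only the top-k experts compute, and their outputs are blended
    by the renormalized gate weights. Real MoE LLMs do this per token."""
    def __init__(self, dim, n_experts, k=2):
        self.k = k
        self.gate_w = rng.normal(size=(n_experts, dim))
        # Each "expert" is just its own linear map in this sketch.
        self.experts = [rng.normal(size=(dim, dim)) for _ in range(n_experts)]

    def __call__(self, x):
        scores = softmax(self.gate_w @ x)          # gate: score every expert
        top = np.argsort(scores)[-self.k:]         # route to the top-k only
        weights = scores[top] / scores[top].sum()  # renormalize over chosen
        # Only the chosen experts compute -- that's the efficiency win.
        return sum(w * (self.experts[i] @ x) for w, i in zip(weights, top))

moe = ToyMoE(dim=8, n_experts=4, k=2)
y = moe(rng.normal(size=8))
```

The efficiency argument in one sentence: with k=2 of 4 experts active, you pay for a quarter-sized model’s worth of expert compute per input while keeping all four experts’ worth of parameters.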

The “battle” is just business strategy dressed up in philosophical nonsense.

OpenAI locked up its models because, without exclusivity, it has no competitive edge.

Meta gives away LLaMA because controlling the ecosystem is the competitive edge.

xAI calls itself open-source, but only when it benefits Musk’s empire.

In short: everyone is playing the same damn game, just with different rules.

And that’s why, if you ask me where I’d place my bets?

Apple

Not because it’s the moral choice, but because it’s the smartest.

They haven’t got a model.

Bet you didn’t expect that outcome, did ya?

Now, go forth, dear reader, and watch as the AI wars rage on. Just remember: the revolution will not be open-sourced.

Signing off from open-source, but only if you beg.

Marco


Well, that’s a wrap for today. Tomorrow, I’ll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee ♨️


Think a friend would enjoy this too? Share the newsletter and let them join the conversation. Google appreciates your likes by making my articles available to more readers.

Articles relevant to this blog 👇

  1. Welcome to the soap opera called “Open”-AI, starring Sam, Elon, and Mark, with heckling by Statler and Waldorf. | LinkedIn
  2. Open source is key for the future of AI | LinkedIn
  3. The $6 million AI that is making OpenAI nervous (and frankly, me as well) | LinkedIn
  4. Elon Musk gets roasted by his own weak-ass X (and more stuff) | LinkedIn
  5. Apple Intelligence is late to the AI party and brought us… a new set of emoji’s | LinkedIn
  6. I’ve seen the dark side of AI, and you need to know about it | LinkedIn
  7. Build an AI-Assistant in under 10 minutes with Hugging Face | LinkedIn
  8. How to pick the right Foundation Model for your AI project | LinkedIn

