TechTonic Shifts SundAI, your weekly overdose of artificial intelligence news: week 51

Welcome back to SundAI !

This week in AI was a dream on steroids, laced with cocaine and topped-up with amphetamines, it was faster, bigger, and weirder than ever. DeepMind decided to shine with their new Gemini Flash 2.0 playthingy, and a mind-bending video generator, and OpenAI continued its “12 Days of Overhyped Features”, now with video input in ChatGPT. And, Microsoft’s quietly snuck Phi-4 q into the mix of models, with which it proved that smaller models can punch like heavyweights. And just when you thought things couldn’t get more unreal, them folks at Google announced their quantum computing chip, Willow, which makes a septillion-year taste feel like a quick walk in the park.

So, plug in your VPN, strap-on your Mixnet (Hi Aleksey), and let’s get into this chaotic AI week 51.

Let’s rock ‘n roll.

TL;DR for the ones with attention issues

DeepMind’s Gemini Flash 2.0 leveled up the multimodal game, proving that smaller, faster models are the future of AI efficiency.
OpenAI added video features to ChatGPT Advanced Voice Mode, making sure their chatbot can now do everything except wash your dishes.
Microsoft’s Phi-4 showed off with its STEM-savvy reasoning, outperforming some of the biggest players in AI, all while keeping it compact and efficient.
Google’s Willow quantum chip casually solved problems older than the universe in under five minutes—no big deal.
Cohere, Pika Labs, and Apple also threw their hats into the ring with updates ranging from NLP models to generative emojis.

Shorter than short: AI’s end-of-year sprint is progressing full speed, and it is impossible to keep up without a second (or third) cup of coffee.

More after the commercial brake:

Comment, or share the article; that will really help spread the word 🙌
Connect with me on Linkedin 🙏
Subscribe to TechTonic Shifts to get your daily dose of tech 📰

Lalalala, blablablaaaa hmm hhmm hmm gave to me, two blablablabla, three blablablibla…..and a pear treeee?

I couln’t find the name of this Christmas rhyme, so if anyone can help out, you win a custom made Dall-E picture of me.

But what I do know about the holidays is that it is a time for giving, receiving, and… getting scammed

Scammers have upped their game this year, and they are using AI to ram fake ads up your…, launch sketchy shopping sites, and send your mom or dad masses of texts that steal their credit card details. Banks are already reporting a spike in fraud, so maybe double-check that “50% off designer lingerie” ad before clicking.

The AI job market is through the roof

Oh, here’s a feel-good one. The AI job market is set to explode in 2025. Companies are apparently scrambling to hire folks who actually know how to make these models work. So first they ditch all their engineers, and now they double back to rehire them, who have reskilled and resupplied themselves, and are asking twice as much. Machine learning, AI implementation, and transformation roles are in high demand, but finding the right qualified candidates is harder than getting access to Sora on launch day. Non-tech companies are jumping on the AI bandwagon too, so if you’ve got skills, 2025 will be your year.

Gemini Flash 2.0 and Veo 2: DeepMind’s answer to OpenAI

DeepMind was not fooling around this week. Gemini Flash 2.0 just went on stage, and it’s kicking the butts of its predecessors quite hard. Flash 2.0 is a leaner, meaner cousin of Gemini 1.0, but with better efficiency and faster inference than it’s competitor’s oversized models. This new baby flushed benchmarks like it was faxing a turd to Putin, with scores like:

MMMU Image Understanding: 70.7% (up from 59.4% last year).
MMLU Pro: 76.4%, because who needs subtlety when you can outscore the competition?

And when you thought that DeepMind would take a breather, they launched their new V2 missile: Veo 2 into the stratosphere. **** Veo 2 is a text-to-video model that is generating 4K video that is so real that it might trick your eyes into thinking that you’re living in a simulation (by the way, there’s more research on the simulation front as well, but that’s for another time). Veo 2 even simulates physics, which makes it perfect for anyone needing realistic videos of calculating bullet trajectories or whatever else you concoct in your fever dreams.

Source

Oh, before I forget, the company also announced Deep Research. That’s a tool for researching complex topics within Gemini advanced. This for me, is a good reason to ditch unnecessary subscriptions and buy into this pearl of a jewel of a gem.

Google’s Trillium AI accelerator chip

Aaaaand Google also introduced the Trillium AI accelerator chip this week. It’s kinda like Willow’s but less flashy but ridiculously efficient. It has been assaulting benchmarks like an addict on narcan, and it is making Google look like the Usain Bolt of AI hardware.

OpenAI keeps the hype alive with ChatGPT video features

OpenAI continued with its 12 Days of AI Christmas with a ChatGPT Advanced Voice Mode update that adds video input and screen sharing. Basically, your chatbot can now act like a FaceTime buddy, except that this one identifies objects in your background and judges your IKEA bookshelf. It is of course exclusive to Plus and Pro users for now, but this feature is another notch in OpenAI’s plan to make ChatGPT your digital everything.

Source

Oh, yeah, let’s not forget OpenAI’s o3 Model Release

OpenAI decided it had enough with its 12 Days of AI Christmas and without any publicity, they launched o3, and that is the sequel to o1 – if you did not learn how to count in grade school. o3 is apparently better at math, science, and step-by-step problem-solving. A bit like everyone else, besides me, I guess. It is designed to tackle complex tasks that will make Google Gemini sweat.

Oh, this is an overview of what they launched during their 12 days of Xmas (hope Elon doesn’t pick up on this pun, else he’ll steal Christmas as well, like the grinch he is):

ChatGPT Pro and o1 release:
Sora video generation model:
Canvas development tool:
Apple Intelligence integration:
Advanced voice mode, now with video & santa mode:
Projects in ChatGPT:
ChatGPT search:
Holiday treats for developers:
1-800-CHATGPT:
Work with apps:
Early access for safety testing:
Finale:

And a dystopian future in a pear tree….

Phi-4: Microsoft’s welterweight belt model punches above its division

Microsoft’s Phi-4 is proof that size doesn’t always matter.

Uhhh…

Continuing… It has (just shy of) 14 billion parameters, and with it, this model is smoking bigger competitors like GPT-4 on math and science tasks and it still has some room left for dessert. Phi-4 is coming soon to HuggingFace as well, so get ready to see what happens when you mix synthetic data with a dash of machine learning.

And yes, it was fully trained with synthehol! (Star Trek pun, ya dweeps)

Source

Apple’s Siri gets an umbilical to ChatGPT

Apple finally decided that Siri needed a walking cane and screwed a bit of ChatGPT integration in it. If you were so bold enough to run the latest iOS update, AND you do NOT live in the EU or China, you now get Siri to dabble in generative AI. And oh yeah, Apple has finally joined the big league of genAI players when it introduced Genmoji (custom AI emojis) and Image Playground. At least they’re showing up to the AI party: Apple Intelligence is late to the AI party and brought us… a new set of emoji’s

Source

Google spits qbits in the face of researchers with Willow

Want a quantum dominatrix? Rent Willow for a couple of hours and she will run rings around you, 10 reptillian years in 5 minutes, although I think a gag ball is much cheaper.

Google’s quantum chip Willow is making every other computer on the planet look like a dried up snail. This chip completed a task in 5 minutes that would take a supercomputer 10 septillion years. I even had to look it up, and that is apparently longer than the universe has existed. The implications are huge, but let’s be honest, the real question is: how soon until someone uses this for meme videos. Google tears a hole in the fabric of space-time and proves we live in a multiverse where everything still sucks

Source

Grok Is now free for all X abusers

Elon Musk just made Grok free for X’s non-premium users. Yeeeeey. I played with it when it was not free, and I could not see any value in it. Just another FOMO product. But X is still suffocating the life out of you with their 10 messages every two hours cap. It has got limitations, and this model is as mediocre as you can get, but with Grok going wide, Elon’s chatbot might finally have a shot at stealing a teensy weensy market share from ChatGPT. Whether it’s enough to compete with OpenAI, Google, and Microsoft remains to be seen. I think not. But hey, that’s why you have courts, aight…Welcome to the soap opera called “Open”-AI, starring Sam, Elon, and Mark, with heckling by Statler and Waldorf.

Source

SoftBank’s $100 billion investment in …. AI

Da fuck, was my first reaction when I read this in my RSS feed. You know $10 billion investment isn’t cool anymore, but blowing the bank with $100 billion is. SoftBank’s CEO Masayoshi Son announced this commitment to invest $100 billion in U.S. (alas) tech. It is for the most part AI-focused, and it is spanning about four years. It is said (not me!) that this mega-investment will create 100,000 jobs, so if you’re in the States, it’s time to digitize that paper CV and start dreaming big.

The hottest AI News you weren’t looking for

Google’s Gemini 2.0 Flash: Multimodal, multilingual, and built for agentic applications. Basically, this model wants to be the Swiss Army knife of AI.
OpenAI’s Sora goes Remi: Know Remi – alone in the world? Look it up! OpenAIght created text-to-video in 1080p glory with Sora, it’s newest standalone product, and watermarks included.
Cohere command R7B: Cool name, and also the fastest, smallest member of Cohere’s lineup, and that is perfect for enterprises that don’t want to break the bank on massive models.
OpenAI Projects: ChatGPT gets folders! Finally! Now you can organize…chats? and also files, if you don’t have a file system on your computer.
Pika Labs 2.0: AI video production just got a tad sharper, with new scene Ingredients for better storytelling. It is said that this rivals Sora. I have tried it. Skip it.

Why should you even care about all these lame developments?

DeepMind’s Gemini Flash 2.0 is kind of a game-changer, but not because it is breaking benchmarks (again), but because it’s doing so on smaller, faster models that cost less to run. Less energy and less water means more time for us to breath clean air. And lower latency and higher efficiency also mean that these models are primed for real-time, agentic applications, which is the next hype of 2025. Yeeey, ****AI assistants on your phone, browser, and wearables, as if I don’t have enough shit to worry about.

Oh, before I forget, OpenAI’s and Microsoft’s updates means that they are emphasizing a tad more on pesky things like accessibility and usability. The competition is warming up for a second round of agentic fun in 2024, and that will mean more innovation at breakneck speed.

Reads/vids to pretend you’re working, while secretly dozing off

The epic history of LLMs: A movie about how we got from simple RNNs to the ChatGPTs of today. As if we haven’t all seen a video about it the last few years.
Multimodal RAG applications: With this video, you learn how to retrieve both text and images using vector stores. One modality is for loosers.
How to actually build useful AI products: Good question, and the best read out there. Forget integrating AI in toilet paper to revolutionize wiping your butt. This article lays out what makes AI products actually useful.
Run Gemini via OpenAI API: Yes, Google’s model works with OpenAI’s framework, and here’s the code to prove it.
AI Tooling for Software Engineers in 2024: It is actually quite a usefull reality check on which tools are working, which ones are failing, and what’s just hype, and all in the grand effort to replace developers and creatives with zeroes and ones.

Repositories, tools, copy .

MarkItDown: Convert files to Markdown like a pro.
HunyuanVideo: A framework for large-scale video generation.
DeepSeek-VL2: Vision-language models with MoE magic.
TEN Agent: Your new conversational AI buddy, integrating Gemini’s Live API.
Loveable. A generative AI development tool to help you go from Minimal Viable Product to Minimal Loveable Product. No coding skills required. Only English.

Research papers of the week

Phi-4 technical report
ReFT: Representation Finetuning for Language Models
Training Large Language Models to reason in a continuous impotent space
GenEx: Generating an explorable world
FlashAttention on a napkin:

Quick Links

So, that’s it for this week. It was yet another wild ride in the AI rollercoaster. See you next week, if the bots haven’t taken over TTS by then!

Signing off from the trenches of AI, where Siri pretends to help, Willow runs the show, and supercomputers cry in a corner,

Marco

Well, that’s a wrap for today. Tomorrow, I’ll have a fresh episode of TechTonic Shifts for you. If you enjoy my writing and want to support my work, feel free to buy me a coffee ♨️

Think a friend would enjoy this too? Share the newsletter and let them join the conversation. Google appreciates your likes by making my articles available to more readers.

SundAI, your weekly overdose of artificial intelligence news: week 51

Welcome back to SundAI !

TL;DR for the ones with attention issues

More after the commercial brake:

Lalalala, blablablaaaa hmm hhmm hmm gave to me, two blablablabla, three blablablibla…..and a pear treeee?

The AI job market is through the roof

Gemini Flash 2.0 and Veo 2: DeepMind’s answer to OpenAI

OpenAI keeps the hype alive with ChatGPT video features

Phi-4: Microsoft’s welterweight belt model punches above its division

Apple’s Siri gets an umbilical to ChatGPT

Google spits qbits in the face of researchers with Willow

Grok Is now free for all X abusers

SoftBank’s $100 billion investment in …. AI

The hottest AI News you weren’t looking for

Why should you even care about all these lame developments?

Reads/vids to pretend you’re working, while secretly dozing off

Repositories, tools, copy .

Research papers of the week

Quick Links

To keep you doomscrolling 👇

Become an AI Expert !

Sign up to receive insider articles in your inbox, every week.

✔️ We scour 75+ sources daily

✔️ Read by CEO, Scientists, Business Owners, and more

✔️ Join thousands of subscribers

✔️ No clickbait - 100% free

Like this:

Related

Leave a ReplyCancel reply

Welcome back to SundAI !

TL;DR for the ones with attention issues

More after the commercial brake:

Lalalala, blablablaaaa hmm hhmm hmm gave to me, two blablablabla, three blablablibla…..and a pear treeee?

The AI job market is through the roof

Gemini Flash 2.0 and Veo 2: DeepMind’s answer to OpenAI

OpenAI keeps the hype alive with ChatGPT video features

Phi-4: Microsoft’s welterweight belt model punches above its division

Apple’s Siri gets an umbilical to ChatGPT

Google spits qbits in the face of researchers with Willow

Grok Is now free for all X abusers

SoftBank’s $100 billion investment in …. AI

The hottest AI News you weren’t looking for

Why should you even care about all these lame developments?

Reads/vids to pretend you’re working, while secretly dozing off

Repositories, tools, copy .

Research papers of the week

Quick Links

To keep you doomscrolling 👇

Become an AI Expert !

Sign up to receive insider articles in your inbox, every week.

✔️ We scour 75+ sources daily

✔️ Read by CEO, Scientists, Business Owners, and more

✔️ Join thousands of subscribers

✔️ No clickbait - 100% free

Share this smut:

Like this:

Related

Leave a ReplyCancel reply

Discover more from TechTonic Shifts