Your weekly digest of AI developments
01/03
Samsung is revealing its Galaxy AI. Samsung is holding its annual event for launching the Galaxy S series phones on 17th Jan. But the teaser is titled “Galaxy AI is coming”. Will we see on-device AI?
01/09
Duolingo cuts 10% of contractors as it uses more AI to create app content. Duolingo confirms the Reddit leaks (we reported) about this offboarding and says no full-time staff was impacted.🍿Our Summary
01/10
Rabbit launched its AI hardware device called r1. R1 reimagines the phone at half the size and with a new OS that doesn’t have apps. It’s the cheapest of the bunch at $199 and shipping starts Easter 2024.
01/10
Fast Company did a profile on Avi and Tab, announcing $1.9M in funding to replace God with his Tab AI necklace. Avi’s Tab has a different approach from Humane’s AI Pin and Rabbit’s R1. It’s not competing with a phone. Instead, it’s about making AI personal to you: your companion, your helper and your guide.🍿Our Summary (also below)
01/11
OpenAI’s GPT store is finally live. We all knew that was coming. But what’s OpenAI without surprise releases? We now have a ChatGPT for teams plan at $30/month/user ($25 if annual). GPTs are getting more personalized too. (Not out to everyone yet but hopefully soon!)🍿Our Summary (also below)
01/18
I wrote a deep dive on how non-technical people are using AI to write code. It includes views from others who’ve done this successfully, a tutorial on how you can too, tips to help you learn and a breakdown of what the learning process could look like.
01/18
Samsung announces its new phones with AI features taking centre stage. Hands-on, some of these features are nice, whereas others are still meh. The AI features are powered by Google’s Gemini models. A few are on-device, others via cloud. Circle to Search is the leading feature where you can circle (or scribble over) anything in images and videos to get focused search results.🍿Our Summary (also below)
01/18
New AI from Google DeepMind called AlphaGeometry nails solving super hard geometry problems. When tested on 30 past Olympiad problems, it solved 25 within the time limits. We're talking problems so tough only math geniuses who compete in the International Mathematical Olympiad can figure them out.🍿Our Summary (also below)
01/29
Microsoft released their annual Future of Work report and this time around it’s not about remote work, it’s about AI. Like no one would guess that. The report has stats from many studies done in 2023, backed by theoretical research from past years. I compiled what you want to know in this 🍿busy man’s guide to the future of work with AI.
02/02
Bard's got some cool updates. It now offers Gemini Pro in more countries and can check its answers better, but the biggest of all is image generation. Yup, you can now create images in Bard.🍿Our Summary (also below)
02/02
Amazon's beta testing Rufus, an AI assistant designed to make shopping on its app smooth. It’s built on unique data from retail stores and digs into customer reviews, Q&As, and the web to find exactly what you need.🍿Our Summary (also below)
02/05
[LEAK] Bard is rebranding to Gemini + Gemini Advanced release date. I was getting attached to Bard but Gemini’s got that dawg in it.😏 Gemini Advanced (the to-be GPT-4 killer) is to launch on 7th Feb (this Wednesday). Gemini’s also getting an Android app with a potential merge of Google Assistant. Exciting stuff ahead.
02/08
OpenAI is making agent software now. AI agents don’t just chat but actually get stuff done on your computer and online. 🍿Our Summary (also below) Also, Sam Altman tweeted that OpenAI will help in building hardware factories. (first public acknowledgement by him)
02/09
Google’s finally made their best AI model available to the public via their flagship chatbot Gemini. You might remember Bard as Google’s chatbot, but its days are over. Bard is now Gemini with a paid version called Gemini Advanced.🍿Our Summary (also below)
02/09
Sam Altman seeks trillions of dollars to reshape the business of chips and AI. There have been reports of Sam Altman talking to the UAE government about his AI chips and hardware plan. WSJ now reports that the amount could be as big as $5-7T. Sam, playing in an entirely different league.
02/09
QUICK BITES
Google’s finally made their best AI model available to the public via their flagship chatbot Gemini. You might remember Bard as Google’s chatbot, but its days are over. Bard is now Gemini with a paid version called Gemini Advanced.
What is going on here?
Google is launching its best AI across their products, starting with a paid chatbot.
What does this mean?
Let’s take a short trip down memory lane:
In March last year, Google released a chatbot called Bard with their then-best AI model called PaLM to have something out there as people were flocking to ChatGPT. Throughout last year, it kept claiming the next set of models from Google would be state of the art, multimodal from the ground up, yada yada but there wasn’t anything to show for it… up until 6th December (2 months back).
They called it Gemini with different classes of models: Nano, Pro and Ultra. Pro performs similarly to GPT-3.5, which powers ChatGPT’s free version, and Ultra meets (and sometimes beats) GPT-4, the model behind ChatGPT Plus (paid). But again, continuing the tradition of “announcements, not releases”, only Gemini Pro was available for us to use. Where? Primarily via the same chatbot: Bard. But this time Google pinky promised that Ultra would come soon to Bard. And Google came through. But instead of adding Gemini Ultra to Bard, Google renamed the entire brand (big or small, don’t ask me 😉) to Gemini.
Back to the present: Bard is now called Gemini. It now has two versions: a free version (same as before) and a paid version, called Gemini Advanced. Gemini Advanced runs on Gemini Ultra 1.0 (that’s just Google letting us know they’re not done done).
To highlight Gemini Advanced:
Paid at $20/month, with Google One benefits. (Strange decision by Google to not undercut OpenAI in pricing here.)
Available via the web as well as a standalone mobile app.
Two months free for you to get a taste of it.
And that’s kinda it. The key change is that Google’s chatbot (the lost Bard, the rising Gemini) got a smarter brain. The infrastructure for it remains the same. To be fair, the Gemini team has been adding features like extensions to YouTube, Maps, Gmail, Workspace etc. and the ability to create images (just like DALL·E in ChatGPT).
Why should I care?
ChatGPT Plus has had this monopoly over best performance for almost a year. Gemini Ultra is a GPT-4 class model and since you’re getting it free for two months, I’d say try it. It’ll likely have some issues and you might not like it, but you might also find some use cases that Gemini does better than ChatGPT (like maybe working with your emails).
I’m not jumping to any verdict so soon. It’ll present itself in the next week or so. And ultimately in the next two months, we’ll know if Gemini has brought Google to the front of the AI race or if Google is still playing catchup.
02/15
OpenAI is building a web search product in partnership with Microsoft. Just quoting this tweet I saw: OpenAI’s supposedly got eyes on a smartphone, GPU factories, Agent OS and now web search. Either they are feeding friends at The Information wrong things or they are way too ambitious.
02/16
Google announced a new model. Didn’t they release one last week? That was Gemini Ultra 1.0. Google’s moving ahead to 1.5 model announcements and now we got a peek into Gemini Pro 1.5. This one has a context window of up to 10M tokens—GPT-4 Turbo has 128k.🍿Our Summary (also below)
02/16
OpenAI's mogging everyone in the AI space again. Sora, a new AI model from OpenAI spits out videos based on simple text prompts and we are talking minute-long videos that feel insanely real. Marques Brownlee (MKBHD) posted a YouTube video talking about it. So, what can you dream now, because Sora (sky in Japanese) is the limit.🍿Our Summary (also below)
02/19
Groq (not the Elon one) is serving AI models at insanely fast speeds. The claim is that their new LPUs (short for Language Processing Units) are faster than GPUs for serving LLMs. Matt built a demo with Groq’s instant answers, check it out.
02/21
Adobe announces its own “chat with PDF”. Finally! We covered in early December that Adobe was working on one. It’s called AI Assistant in Acrobat and has launched in beta for paying users.🍿More deets (also below)
02/22
Guess what people are talking about? That Google’s flagship chatbot, Gemini, is not creating pictures of white people. So, we wrote a bit about how 🍿AI is becoming Google’s Midas Touch. (also below)
02/22
Remember Monday’s big news? A big AI company managed to get Reddit’s data for $60M a year. Reuters reports that’s also Google.
02/23
Stability AI has announced Stable Diffusion 3. From the sample images on Twitter, it looks like this model can spell text correctly and focus on multiple subjects when creating images. Get on the waitlist for an early preview or check more 🍿details + samples. (also below)
02/23
OpenAI is updating its custom GPTs option and its GPT store toward being “the next app store”. You can now rate GPTs, send feedback and see similar info in the About section.🍿Our Summary (also below)
02/26
And, Nvidia crossed $2T in market cap.
02/27
Mistral AI takes on OpenAI - The French startup by former Google and Meta AI researchers has launched its new model and a chatbot, along with a partnership with Microsoft.🍿Our Summary (also below)
02/28
Klarna AI assistant handles two-thirds of customer service chats in its first month. That’s 2.3M chats which would’ve needed 700 full-time agents.
03/01
Elon Musk sues OpenAI and Sam Altman for betraying the agreement from OpenAI's founding to remain a non-profit company. Elon says they're now closed-source, working with Microsoft to make money instead.🍿Our Summary
03/05
Anthropic drops a bomb: Claude 3 might be the new AI king. GPT-4 is no longer the lone wolf in the “scary-good AI” valley. Claude 3 is going for the jugular, beating the reigning champ, OpenAI's GPT-4, and taking on newcomers like Google's Gemini.🍿Our Summary (also below)
03/07
Klarna made headlines recently with AI replacing 700 support agents, with the potential of $40M profit gain - I wrote a deep dive on Klarna, what they did, and how it’s impacting jobs, plus a look into their annual report.
03/07
Zapier made two big announcements yesterday. First, Zapier Central - A new workspace where you can build and teach AI bots that work across 6000+ apps. And, Zapier acquired Vowel, an AI-first meetings startup. Zapier’s cofounder (also head of AI) talks about both here. I’m also experimenting with the new workspace.
03/12
Beff Jezos, the founder of the e/acc movement launched his AI startup focusing on making new-age chips for AI. The company is called Extropic and the founders claim that this new chip will unlock exponential efficiency for generative AI. They did a video with Garry Tan, but we tried to 🍿break it down further in simpler language.
03/13
Cognition Labs, a new AI company, released Devin yesterday. Devin is an AI software engineer, i.e. an agent that works autonomously, just like a human software engineer. BIG DEAL! The demos are crazy and the team behind Cognition is crazier.🍿Our Summary (also below)
03/18
Elon Musk stuck to his word and has openly released Grok-1 from xAI. Grok-1 is a massive (314B parameters) mixture-of-experts model. The release is for the base model without any fine-tuning. The performance of Grok-1 lags a lot behind other open LLMs but that size can open up interesting possibilities with continued training.
03/20
Satya Nadella is out there, hedging his AI bets right, left and centre. His latest move is getting Inflection AI’s co-founder Mustafa Suleyman to join and head Microsoft’s new AI division.🍿Our Summary (also below)
03/21
Gemini 1.5 Pro is now open to all in Google’s AI studio. It’s soon coming to API as well. This is Google’s model with 1M context length.
03/25
Emad Mostaque resigns as Stability AI’s CEO and leaves its board as well. COO Shan Shan Wong and CTO Christian Laforte are Interim co-CEOs. Emad says it is his own decision because he wants to focus on decentralized AI. No coins tho.
03/26
I’m SO SO SO excited to say, we’ve launched Ben’s Bites 2.0!! - Tutorials to make AI simple. Previously I founded Makerpad, a no-code education site and now we’re positioning Ben’s Bites as the place to learn how to use AI for work. It’s a one-time fee (for early-bird access), and it includes all tutorials, plus the case studies we’ve been doing for Ben’s Bites Pro (and future ones).
If you’re a pro member already, you get access from today, no additional charge, and no future charges either - your payment already counts as your one-time fee.
Why are we doing this? Because AI can feel complicated but it doesn’t have to be. We’ve not seen anywhere that offers really simple tutorials on how to use ChatGPT, Claude, Gemini etc to do work. We’ve started with several beginner-level tutorials, we’ll be adding more every week, plus advanced ones soon. We have a tutorial request board you can submit to for us to add to our backlog.
Plus we have a Slack community to connect with others using AI at work, sharing workflows and tips. Sign up here
03/27
Airtable AI - Build LLM-powered workflows in Airtable.
Zapier Chatbots - Scale your business with AI chatbots.
03/28
Hume AI shocked Twitter with EVI. Empathic Voice Interface (EVI) is a conversational AI with emotional intelligence. It understands your tone of voice and emotions to tune its own language and speech.🍿Our Summary (also below)
03/29
Elon’s xAI is starting to make serious progress. They just announced Grok-1.5, an improvement over Grok-1. Grok-1.5 joins the long-context game by accepting up to 128k tokens. It has better reasoning and code generation too. It will soon be available to early testers and X users.
04/03
SWE-Agent - A new agent framework from Princeton researchers that scores 12.29% on SWE-bench (right behind Devin’s 13.84%).
04/08
AI playlist (beta) by Spotify. Spotify is testing a beta feature to create playlists from prompts in the UK and Australia. It’ll find songs that match your prompt, and you can filter further by adding inputs like “more pop” or “calm vibes”.
04/10
It’s Wrestlemania season for WWE fans but for us AI fans, it’s Modelmania. Just in the last 24 hours, we’ve got updates on three of the largest LLMs.
GPT-4 Turbo is out of preview - GPT-4 Turbo (with Vision) is now generally available in OpenAI’s API. Vision requests can now also use JSON mode and function calling. OpenAI says that this update also improved the model in maths and reasoning.
Gemini 1.5 Pro now available - Google’s model with a 1M token context window is now available in Gemini’s developer API. It also accepts audio files, an upgrade from the limited version in AI Studio. More from Google in tools and news below.
Mistral dropped another mysterious torrent link. Looks like a 176B MoE model. From the Discord chatter I’ve gone through, it feels like a model better than Sonnet but not as good as GPT-4.
Meta also confirmed the news about Llama 3. What news? That smaller versions of Llama 3 are coming soon (as soon as next week). We covered it yesterday.
04/10
Udio - Incredible music, vocals and control over creativity. (Music creation app by ex-Deepmind researchers that just launched).
04/12
Humane AI pin - First AI wearable you can purchase right now (now available to order)
04/15
xAI previewed Grok-1.5V - This new model from Elon’s xAI can now understand images. Grok-1.5V is on par with the vision models from OpenAI, Anthropic and Google. Let’s see when it comes to users (and maybe open-source).
04/16
Adobe to add generative AI video features to Premiere Pro starting in May. You could add/remove objects, extend clips, and even create custom B-roll footage from scratch. Adobe is also partnering with OpenAI, Runway and Pika to use their models for these features.🍿Our Summary (also below)
04/17
Amazon Music is getting in the AI playlist ring with Maestro. This new tool (currently in beta) lets you generate custom playlists with simple prompts—think text, emojis, and vibes. Spotify launched a similar AI feature recently.🍿Our Summary (also below)
04/19
Meta is bringing Llama 3 into the world. It is starting with open-sourcing two variants of Llama 3 models and an upgraded version of Meta AI. These two variants (8B and 70B parameters) blow the earlier models from Meta out of the water. A larger model with 400B+ parameters is still in training.🍿Our summary of the Llama 3 models (also below)
04/19
These models also power the upgraded Meta AI. Meta AI is Zuck’s answer to ChatGPT - An intelligent AI assistant that’s now on all of Meta’s apps (Instagram, Facebook, WhatsApp) and even on the web as “meta.ai”. Zuck claims that Meta AI is the most intelligent free chatbot available publicly.🍿Our summary on Meta AI, the chatbot (also below)
04/22
Drake's new song uses AI-generated Tupac and Snoop Dogg vocals. Fans are split between angry ones over using Tupac’s voice and hype ones loving the diss track. One thing is clear, AI music is already mainstream.
04/24
Perplexity, the hot AI search company, is now a unicorn. It just raised another $62.7M and added an enterprise plan. Perplexity’s revenue has grown from $3M to $20M in 6 months (since its last raise).🍿Our Summary (also below)
04/24
Adobe is launching its Firefly Image 3 model in beta. It’s available at firefly.adobe.com and powers new features in Photoshop’s desktop app (beta version).🍿Our Summary (also below)
04/25
Cognition Labs, the startup behind Devin is now valued at $2B. The startup is just 6 months old, and despite some pushback on its demos, investor backing is still strong. Founders Fund led the $175M raise.
04/29
ChatGPT’s memory feature is getting released widely to more users. It means ChatGPT will pick up facts and preferences from your texts to remember (eg: I am vegetarian, or I prefer reading bullet points). The key part is that you don’t have to specify it explicitly. You can turn it on/off in your ChatGPT settings → Personalization.
04/29
The FT is licensing its articles to OpenAI. You'll now see FT content summarized within ChatGPT's responses with links to the source. FT will use OpenAI’s tools for adding AI into journalism.🍿Our Summary (also below)
04/30
There’s a new secret model, gpt2 chatbot, with a GPT-4-Turbo-esque performance. It’s on LMSYS, the community platform for rating LLMs. Speculation is running wild, from it being GPT-4.5 to GPT-2 with Q* or a fine-tune for agentic reasoning. And no, Sam tweeting about it doesn’t help.
04/30
GitHub just announced Copilot Workspace. It’s a cloud IDE with Copilot-based agents that can plan, write and debug the code for your idea. Get on the waitlist here or check out 🍿Our Summary (also below)
05/02
Rovo by Atlassian - AI assistant to help you turn information into action in a heartbeat.
05/03
OpenAI might be launching a search engine. It’s registering SSL certificates for search.chatgpt.com. The Information reported this in February. Sam Altman also talked about search + LLMs in his latest visit to Lex Fridman’s podcast. The anticipated date for this launch is 9th May.
05/03
We’ve got the first official music video made by OpenAI’s Sora. The music video for the song titled “The Hardest Part” by Washed Out, directed by Paul Trillo, is made entirely with clips from Sora. Is it great? Nope. Are we going to see a lot more of this? I think yeah.
05/06
Elon Musk will use X and Grok to create real-time AI news. If you’re on Twitter, you must have seen Grok summarizing tweets around a similar topic in “news” format. Elon Musk is doubling down with a new feature called “Stories on X”. These will consider tweets as the ground truth instead of using traditional media articles.
05/09
Google has released AlphaFold 3. It graduates to predicting the 3D structure and interactions of all life's molecules, like DNA, RNA and Ligands (from predicting just proteins with AlphaFold 2). Researchers worldwide can access most of AlphaFold 3's capabilities for free using the new AlphaFold server.
05/10
A leaked deck reveals how OpenAI is pitching publisher partnerships.
Reddit lays out a content policy while seeking more licensing deals.
05/13
HubSpot’s co-founder Dharmesh is hacking around with AI agents at agent.ai — You should get on Dharmesh’s waitlist.
05/14
OpenAI wrapped up its spring updates. We have a new model, free intelligence and the glimpses of HER. The new model GPT-4o is smarter, faster and cheaper than GPT-4 Turbo and it’s available for free to everyone. Its voice mode seems similar to Samantha from the movie HER. 🍿Our Summary (also below)
05/15
GOOGLE I/O 2024. Huff!! There’s too much to cover but here are the key themes:
Google is integrating AI into all of its ecosystem: Search, Workspace, Android, etc. In true Google fashion, many features are “coming later this year”. If they ship and perform like the demos, Google will get a serious upper hand over OpenAI/Microsoft.
All of the AI features across Google products will be powered by Gemini 1.5 Pro. It’s Google’s best model and one of the top models. A new Gemini 1.5 Flash model is also launched, which is faster and much cheaper.
Google has ambitious projects in the pipeline. Those include a real-time voice assistant called Astra, a long-form video generator called Veo, plans for end-to-end agents, virtual AI teammates and more.
Watch these demos of Astra in action and read 🍿Our Summary (also below)
05/15
OpenAI and Ilya Sutskever part ways. Along with Ilya, Jan Leike, the head of the superalignment team, has also resigned. This leaves OpenAI without two researchers who made key contributions to the GPT models. Jakub Pachocki is now the Chief Scientist of OpenAI.
05/16
Mike Krieger joins Anthropic as Chief Product Officer. FYI, Mike is the co-founder and ex-CTO of Instagram. Recently, he built Artifact, a news-sharing app which got acquired by Yahoo. Anthropic is focusing on building Claude into an everyday app for the workspace.
05/17
OpenAI also announced a partnership with Reddit. OpenAI will get access to Reddit’s data for ChatGPT and “new products”. Reddit will build tools for moderators using OpenAI’s platform. OpenAI is also becoming Reddit’s advertising partner.
05/17
OpenAI adds new features for data analysis in ChatGPT. ChatGPT will soon roll out a major upgrade to its data analysis features, making it easier to add files, work with large datasets, create charts, and gain insights directly within the platform.🍿Our Summary (also below)
05/20
OpenAI has also paused the voice of Sky in ChatGPT’s voice mode—most likely due to its high resemblance to Scarlett Johansson’s voice. OpenAI claims the voice belongs to a voice actor and outlines how the voices in ChatGPT were selected.
05/20
Gemini 1.5 has an updated technical report. The report has good technical details but here’s the crux:
Gemini 1.5 Pro is similar to Claude 3 Opus but inferior to GPT-4o.
Gemini 1.5 Flash performs similarly to Claude 3 Sonnet, but faster & cheaper than Haiku.
Google has also built two specialized models: Gemini 1.5 Pro for math, which scores 91.9% on the MATH benchmark (insanely high) and Flash 8B which is even smaller than Gemini 1.5 Flash.
05/21
Microsoft has announced new Windows computers called Copilot+ PCs. These laptops are designed to be AI powerhouses with an integrated NPU, 40+ AI models on-device and a new improved Copilot, courtesy of OpenAI’s GPT-4o.🍿Our Summary (also below)
05/21
Scarlett Johansson has issued a statement sharing her shock and anger at the new ChatGPT voice that sounds like her. She claims Sam Altman approached her to voice ChatGPT, but she turned it down. OpenAI developed a similar-sounding voice, disrespecting her consent.🍿Our Summary (also below)
05/22
Mapping the mind of an LLM. You know how AI models are often seen as a black box? Well, Anthropic does this crazy thing of looking inside them to understand what makes them tick. They've extracted millions of features from Claude 3.0 Sonnet, like gender bias, bridges and code errors. 🍿Our Summary (also below)
05/22
Microsoft is expanding Copilot to teams and agents. Copilot can soon act as a team member in meetings and chats. It can also manage projects and track deadlines. Microsoft also gave a sneak peek into what agents that do tasks autonomously will look like in their ecosystem.
05/23
OpenAI has signed a deal with News Corp. News Corp owns publications like WSJ, New York Post, MarketWatch and more. The deal could be worth over $250M over 5 years. The estimates for OpenAI’s previous deals with Axel Springer and FT max out at $10M a year.
05/24
Golden Gate Claude is real. Anthropic has made a fun version of Claude available for a limited time. This one thinks it’s the Golden Gate Bridge and inserts the reference into every conversation. It’s a result of 🍿Anthropic’s new research on interpretability.
05/28
Elon Musk will build a Gigafactory of Compute - Following xAI’s series B of $6B, Elon aims to connect 100,000 of Nvidia's most powerful H100 GPUs, creating a computing powerhouse four times larger than any existing AI cluster. This AI Gigafactory will power Grok and xAI’s other projects.🍿Our Summary (also below)
05/29
Gemini 1.5 Pro ranks #2 on the LMSYS leaderboard with 1268 Elo. Gemini Advanced (Elo: 1267) is just behind it at #3. Google is quickly closing the gap between its models and OpenAI’s GPT-4o (Elo: 1287).
05/30
Scale ranks LLMs with new SEAL Leaderboards to bring some much-needed transparency to LLMs. They’re ranking the top models on maths, coding, ability to follow instruction, and languages. Currently, it’s a tight race between GPT-4 series, Gemini 1.5 and Claude models. 🍿Our Summary (also below)
05/30
OpenAI’s news deals are on a roll. Just yesterday, they added Vox Media and The Atlantic to their pool of news partners. OpenAI is also partnering with WAN-IFRA to start an accelerator that’ll help newsrooms fast-track their AI adoption.
05/30
ChatGPT Free users can now access most paid features—including web browsing, vision, data analysis, file uploads, and GPTs (no image generation though).
06/03
Google's new AI Overviews have been turning up some interesting, and sometimes questionable results. Some are real, many fake. Google is owning up to some early hiccups while explaining that AI Overviews “don’t hallucinate” and outlining early fixes.🍿Our Summary (also below)
06/05
Remember Microsoft's shiny new AI tool, "Recall"? It remembers your browsing history and laptop activity and allows you to search/ask questions over it. Recall does that by taking screenshots every 5 seconds and that’s where it gets problematic. The way Recall stores your data is a potential security nightmare.🍿Our Summary (also below).
06/07
A recap of what’s going on with AI and China:
Kling by KWAI is throwing hands with OpenAI’s Sora. It creates 2-minute long videos with impressive consistency.
Qwen2 has been released. A powerful family of models ranging from 0.5B parameters to 72B parameters.
China’s Nvidia loophole - The Information reports that ByteDance bypasses US sanctions by renting GPUs from Oracle.
06/11
For the first hour of WWDC, I only heard intelligence three times. In the next 40 mins, I couldn’t keep count as Apple rebranded “Artificial Intelligence” to “Apple Intelligence”. OpenAI partnership is minor, and Apple’s got some tricks up its sleeve. Here’s more on 🍿What exactly is Apple Intelligence? (also below)
06/11
OpenAI’s got a new CFO and CPO - Kevin Weil (ex-VP of Product at Instagram) is joining as Chief Product Officer and Sarah Friar (ex-CEO Nextdoor, ex-CFO Square) is joining as Chief Financial Officer.
06/12
“LLMs are not the way to AGI.” There’s a large camp of AI researchers who believe this to be true. But what to do about it? François Chollet, the creator of Keras and Mike Knoop, co-founder of Zapier are launching a $1M challenge to find alternative ways to get to “general intelligence”. Hear ’em talk on Dwarkesh’s Podcast and the No Priors Pod or… you can read 🍿Our Summary (also below).
06/12
Model Personalization in Midjourney - Tune the MJ algorithm to your own personal tastes (demo).
06/13
Luma Labs releases Dream Machine, a video generation model that you can use right now (looking at you, OpenAI, give us Sora). It generates 120 frames in 2 minutes with character consistency and decent physics. This makes Luma Labs one of the few companies, alongside RunwayML and Pika Labs, with usable video generation models.🍿Our Summary (also below)
06/14
Microsoft will delay the release of AI Recall. Microsoft's new AI Recall feature has received a lot of flak over privacy and security concerns in the past couple of weeks. It was supposed to come built-in with the new Copilot+PCs coming next week.
06/17
Anthropic is opening limited access to an experimental Steering API for Claude. This will allow you to tune a subset of Claude's internal features—just like the Golden Gate demo we got a few weeks ago.
06/18
Runway just announced Gen-3 Alpha, their latest and greatest video generation model. And boy, does it look slick! Say hello to a new era of high-quality, ultra-controllable AI videos. Oh, not so soon—it’s not public yet.🍿Our Summary (also below)
06/18
Google DeepMind generates audio for silent videos. Google DeepMind's video-to-audio (V2A) tech can create rich, synchronized audio for AI-generated videos using just the video pixels and a text prompt for additional guidance. Think it’s time for AI video to move on from awkward mime acts.🍿Our Summary (also below)
06/18
Sound Effect by Eleven Labs - Imagine a sound and bring it to life. The text-to-sound capability is available in their API now. They also made a cool video → sound effect app. Check the docs or try the app.
06/18
Adobe upgrades Acrobat AI chatbot to add multi-document analysis and image generation.
06/19
Microsoft’s Copilot+PCs are available now, starting from $999 (without the AI Recall feature).
06/20
Ilya Sutskever Returns. Move over AGI, Safe Superintelligence Inc. is the name of the new game Ilya is playing. He’s got two Daniels as partners: Daniel Gross and Daniel Levy. This new company, SSI, will focus on just one thing: safe superintelligence with no other commercial products.🍿Our Summary (also below)
06/20
Nvidia is now the highest-valued public company in the world. With a current market cap of $3.34T, it has taken over Microsoft ($3.31T). At the same time, an analysis from the Financial Times claims most other stocks are on a cooldown. (PS: not any longer. It is currently #2.)
06/21
Anthropic released a new model outta nowhere and Claude 3.5 Sonnet is now the best LLM out there. Period. It’s more intelligent and faster than Claude 3 Opus, GPT-4o and Gemini 1.5 Pro. It’s also 4-5x cheaper than all of those models.🍿Our Summary (also below)
06/21
Claude.ai - Anthropic’s chatbot has a new feature called Artifacts and it’s insane. It’s similar to ChatGPT’s code interpreter—it writes code and creates interactable previews (but doesn’t run it like a code interpreter). You can create diagrams, games and one-off apps that play sound too.
06/21
Anthropic released a new model outta nowhere and Claude 3.5 Sonnet is now the best LLM out there. Period. It’s more intelligent and faster than Claude 3 Opus, GPT-4o and Gemini 1.5 Pro. It’s also 4-5x cheaper than all of those models.
What's going on here?
Anthropic just dropped Claude 3.5 Sonnet, the first release in their upcoming 3.5 model fam. And let's just say, it's raising the bar big time.
What does this mean?
Claude 3.5 Sonnet is Anthropic’s first entry into a new category of models. It is crushing benchmarks left and right. We're talking grad-level reasoning, undergrad-level knowledge, and some seriously slick coding skills. It's twice as fast as Claude 3 Opus and much cheaper, making it perfect for complex jobs like customer support or multi-step workflows.
The vision capabilities got a major boost too. Claude 3.5 Sonnet is a whiz at stuff like reading charts and transcribing text from wonky images. This opens up a whole new world of insights for industries like retail and finance.
But wait, there's more! "Artifacts" just dropped on Claude.ai, letting you create interactive previews with code. Think diagrams, games, one-off apps, and whatnot. It's like having an AI collaborator right there with you.
3.5 Sonnet is also immediately available via the Anthropic API, Amazon Bedrock, and Google Cloud's Vertex AI.
You want more? Anthropic says they have more stuff in the works with Claude 3.5 Haiku and 3.5 Opus coming later this year and features like memory for Claude the chatbot. They are not working on voice mode though (like OpenAI’s GPT-4o).
Why should I care?
It’s simple. OpenAI (and ChatGPT) is no longer the king of the ring. Claude 3.5 Sonnet blows OpenAI’s GPT-4o out of the water and I see Claude (the chatbot) becoming a much more delightful product to use as well. Reminder: Anthropic has recently hired Instagram’s co-founder Mike Krieger as their CPO.
If you're using AI to get work done, Claude 3.5 Sonnet is about to become your new BFF. The combo of high intelligence, speed, and reasonable pricing is a productivity dream.
06/24
Stability AI gets a new CEO and a lifeline. The open-source AI darling that once hit a $1B valuation has been hitting some serious speed bumps since. But there’s a new push to put it back on track with the new CEO, Prem Akkaraju.🍿Our Summary (also below)
06/25
Major record labels are suing AI music companies. The big three music labels just dropped a legal bomb on two rising stars in AI music generation. The RIAA is leading the charge to take Suno and Udio to court over alleged mass copyright infringement. The labels want up to $150K per work in damages.🍿Our Summary (also below)
06/26
Claude gets the team spirit! Anthropic's latest update introduces "Projects" in Claude, letting you organize and share your AI chats with teammates. Projects have a 200k token context window that accepts custom docs, large codebases and custom instructions (to tweak Claude's personality per project). Claude is also getting a redesign with features like searchable chat history.
06/26
ChatGPT's desktop app for macOS is now live for all users. The shortcut Option + Space takes you to Chatty. The Mac app also has the option to search your chat history (ChatGPT web doesn’t). But hold your horses on that Voice Mode preview—OpenAI needs another month to perfect it. Expect a small alpha rollout soon, with all Plus users getting access this fall.
06/27
Figma’s AI features are here. Designers can now find assets faster, create realistic mockups, and easily turn static designs into interactive prototypes. These are broad integrations across their platform. Beta is available now, free through 2024.🍿Our Summary (also below)
06/27
YouTube's trying to sweet-talk record labels into an AI music deal. It wants to expand its experiments beyond "Dream Track" from last year. So it’s offering upfront payments to secure artist participation from major labels like Sony, Warner and Universal. The same groups recently sued Udio and Suno for training on copyrighted music.🍿Our Summary (also below)
06/28
Google drops major updates. It released two new open source models, Gemma 2 9B and 27B. The 2M token context window for Gemini 1.5 Pro is now open to all devs. Last but not least, Gemini 1.5 models can now run code, in both AI Studio and the Gemini API.
07/01
Adept joins Amazon. Adept, the agentic AI startup, is changing its leadership. CEO and co-founder David Luan and a few others are joining Amazon’s AGI team (plus Amazon is licensing some of Adept’s tech). Looks like Amazon still has cards on the AI table.🍿Our Summary (also below)
07/02
Meta has updated its AI label on Instagram and other platforms. Instead of “Made with AI”, it’s now called “AI info”. This comes after many posts with small AI edits (e.g. from Photoshop’s Generative Fill) were marked as made with AI.
07/02
Gen-3 Alpha by Runway AI is now available to everyone. It takes a few mins and $1-$2 to generate a 10-second video clip.
07/02
Figma is pausing its “Make Design” feature as a post of it copying Apple’s weather app goes viral.
07/02
Morgan Freeman calls out ‘unauthorized’ use of AI replicating his voice in a TikTok video.
07/02
YouTube now lets you request removal of AI-generated content that simulates your face or voice.
07/04
Kyutai, another French AI startup, is heading straight for OpenAI. It just dropped Moshi - a GPT-4o-like model that can see, hear and talk natively. Moshi is available to try today and Kyutai will open-source it too in the coming days.🍿Our Summary (also below)
07/08
License update from Stability AI - Stability AI just made their AI models free for most users, including small businesses (<$1M annual revenue). They're doubling down on open-source and trying to win back the community after a rocky SD3 launch.🍿Our Summary (also below)
07/09
OpenAI and Thrive Global are creating an AI health coaching company. Sam Altman and Arianna Huffington have written a post in Time. They’re teaming up to build an AI health coach. It’ll try to change user behaviour to help with chronic diseases and daily wellness.🍿Our Summary (also below)
07/09
Poe has released Previews - interactable web apps made while chatting with AI. Think Claude Artifacts, but with any model you want. Previews work best with the top models like GPT-4o, Gemini 1.5 Pro and obviously Claude 3.5 Sonnet. Here’s the kicker: they can be shared and viewed on the web (unlike Artifacts). No need to ask “Where do I deploy/publish this” anymore.
07/10
Correction on yesterday’s Health Coach Story: It’s a partnership between OpenAI and Thrive Global. Thrive Capital is a separate organization.
07/11
Samsung adds new AI features to its Galaxy AI suite. There’s chat assist, sketch to image, live translation and more. Mrwhosetheboss has the crispiest coverage of what’s new.
07/15
Rufus, Amazon's AI shopping assistant, is now available to all U.S. customers! It can answer product questions, compare options, provide recommendations, and give info about specific orders. It also answers generic, non-shopping-related queries in the context of “what should you buy.” Let’s see how this fares.
07/17
Big tech firms secretly used YouTube videos to train AI. An investigation from Proof News found that major AI companies like Apple, Nvidia, Salesforce and Anthropic used subtitles from over 170,000 YouTube videos to train their AI models—without creators' knowledge or consent.🍿Our Summary (also below)
07/17
Andrej Karpathy is starting an AI education company - Eureka Labs. Karpathy says that Eureka Labs will be an AI-native school and that his stints at Tesla and OpenAI were side-quests from his full-time job and passion: teaching. The first product, LLM101n, promises to be an undergraduate-level course, teaching students to build their own AI assistant.
07/18
Menlo Ventures and Anthropic launched a $100M fund called Anthology to invest in AI startups. They're looking to invest in everything from seed to expansion stage, starting at $100k. The focus is on five key areas: AI infrastructure, frontier apps, consumer AI, trust and safety tools, and AI for social good.🍿Our Summary (also below)
07/18
Spotify's AI DJ can now chat you up in Spanish, amigo. Time to salsa your way through some tunes! Spanish DJ Livi is launching in Spain and across 17 Latin American countries, including Mexico, Argentina, and Colombia.🍿Our Summary (also below)
07/18
Salesforce debuts Einstein Service Agent, a new AI agent for customer self-service.
07/19
OpenAI has a new baby - GPT-4o-mini. It is a small model that beats the early version of GPT-4 and it costs 30x less than its big bro GPT-4o. You can now say bye to GPT-3.5-Turbo as 4o-mini will take its place in ChatGPT’s free tier and the API.🍿Our Summary (also below)
07/23
Condé Nast slaps Perplexity AI with cease-and-desist! The magazine giant demands the AI search engine stop using its content, following Forbes' similar move. Perplexity, now valued at $3B, is facing growing legal challenges over its content use.
07/24
Meta puts open-source AI on the podium. Llama 3.1 series of models came out yesterday with 3 new models—Llama 3.1 8B, 70B and 405B. While the 405B version is not the best model out there, open-source is now on the podium with GPT-4o and Claude 3.5 Sonnet. Another surprise: Meta’s technical report is the bible for LLM training.🍿Our Summary (also below)
07/24
You can now fine-tune GPT-4o-mini—and do it for free for the next 2 months. OpenAI wants you to customize its shorty but smarty model to your tasks. It is allowing free training up to 2M tokens every day till Sep 23rd. The war between Llama fine-tunes and GPT fine-tunes is ON.
07/25
Bing has an answer to Google’s AI overviews. It’s called Bing Generative Search. The AI-generated answer takes the main spot on the search results page with the blue links pushed to the right side. Bing says they have taken extra care for accuracy (don’t wanna see people eating rocks now, do we?).
07/25
Airtable is launching an app builder - The new Cobuilder tool lets you create custom apps instantly using AI. Just use your data in tables, give a prompt and get a custom app. Build, customize and share the apps with your team, without thinking about code or design.
07/25
Mistral AI unveils Mistral Large 2, a 123B parameter model with 128k context. It’s stronger in code, math, and reasoning, and supports a ton of languages (both human and code). It is open-source and available on La Plateforme too.🍿Our Summary (also below)
07/26
Google has also released two AI models, AlphaGeometry 2 and AlphaProof, which together scored at silver-medal level at the International Mathematical Olympiad 2024. They were just 1 point short of the gold-medal threshold. The current version has less to do with LLMs, but some of this work will likely make it to the Gemini models.
07/26
As tradition holds, soon after Google’s news, OpenAI previews SearchGPT. It’s OpenAI’s (improved) attempt to bring search results into ChatGPT. Citations are much more prominent, answers have custom UIs with visual elements and content from OpenAI’s recent publisher partnerships. It’s a “prototype” in OpenAI’s words and behind a waitlist. So, we don’t know much—but its demo already got something wrong.
07/29
Anthropic is under fire for aggressive data scraping! Websites like Freelancer.com and iFixit.com accuse it of bombarding their sites with bot armies to harvest training data (think 3.5M requests in 4 hours). They say Anthropic is playing dirty, ignoring internet etiquette, and making it way more expensive to keep their websites running.
07/30
Meta is launching an AI Studio. You can create custom AI characters for Instagram, Messenger, and WhatsApp (US only). No tech skills are needed to build these characters—just prompting. Example: AIs for cooking tips, meme generation, or creator clones that answer fan questions.
07/31
Friend, an AI necklace (with no voice output) promises to never let you feel lonely anymore. The launch video has 12M+ views in less than 24 hours with much speculation about its $1.8M domain name and the future of AI companions.
08/01
Microsoft's AI business is booming, with TikTok as a major customer. TikTok spent $20M/month on Azure OpenAI Service, accounting for about 25% of the service's $1B annual revenue. Read that again, there’s an M after $20. Other big spenders include Walmart, Intuit, and G42.
08/01
Microsoft isn’t paying to use Reddit’s data. Reddit CEO Steve Huffman calls out Microsoft, Anthropic, and Perplexity for scraping Reddit's data without permission. Huffman demands payday deals like the ones with Google and OpenAI.
08/02
Google Chrome to get 3 new AI features: Search with Google Lens on Desktop, Gemini-assisted lookup for browsing history and Tab Compare to judge products across multiple tabs.
08/02
a16z is leading a $31M seed in Black Forest Labs, founders include OG Stable Diffusion creators.
08/05
Google buys out Character AI’s tech and talent. Character AI (CAI) is one of the hottest consumer AI products with 100M+ monthly visits. Can this move finally help Gemini win over ChatGPT? It's a homecoming for Noam Shazeer, co-author of the original Transformer paper at Google, and his team.🍿Our Summary (also below)
08/07
Amazon Music launches Topics to make podcast discovery easier with AI. It analyzes transcripts to tag episodes by subject matter. The number of topics and content under this feature is still limited. It’s now available in the US on iOS and Android.
08/07
S&P Global is paying Accenture to upskill 35K employees in AI. Can someone let them know about Ben’s Bites?
08/09
Free ChatGPT users can now generate 2 images per day. Too little but hey it’s more than zero.
08/09
Zico Kolter from Carnegie Mellon joins OpenAI’s Board as the technical AI safety expert. He’ll also join the Safety & Security Committee.
08/09
Google doubles down on Gemini 1.5 Flash and AI studio. Gemini 1.5 Flash prices are now slashed by up to 78%. AI Studio, their app for testing these AI models now supports 100+ languages and gets easier access through Google Workspace. The best one though, is PDF understanding with text + vision across AI Studio and Gemini API.🍿Our Summary (also below)
08/09
Microsoft and Palantir team up to sell AI to US defence and intelligence agencies.
08/09
JPMorgan Chase is giving its employees an AI assistant powered by ChatGPT maker OpenAI.
08/13
AI Scientist by Sakana AI - Can AI discover new science? Sakana AI built an AI (based on LLMs) that does open-ended exploration and finds research ideas. It then tries to implement them, writes a paper (for under $15), and reviews it as well. This AI scientist’s first goal is to find ideas to enhance itself, i.e. ideas for better ML research.🍿Our Summary (also below)
08/13
Universal Music and Meta just inked a multi-year deal. The agreement expands their partnership across Meta's platforms, including WhatsApp for the first time. No specific deets are revealed but it promises fair compensation for artists, tackles AI-generated content concerns, and opens up new monetization opportunities.
08/14
Google demos Gemini Live—a real-time voice assistant, at the Pixel 9 launch event. And it might even beat OpenAI in shipping it to the users 🤞. There was more about Gemini on Android in the event, and it looks better than whatever the new Siri was supposed to be. 🍿Our Summary (also below)
08/14
xAI just dropped the mic with Grok-2, their latest AI model. This one isn’t just about Musk’s gimmicks of anti-woke or slur-filled roasts. You can count it in the top 5 models for now. Also, Grok on Twitter has a nicer UI and image generation features now.🍿Our Summary (also below)
08/15
On Monday we told you that Flux models are taking over Twitter. Well, the scale of that takeover has increased. The new Grok is integrated with Flux and lets you make images with those models, and they're totally uncensored. Some people love it, and others are just confused. It's wild out there!
08/15
SAG-AFTRA makes a deal with Narrativ for audio voice replicas in digital advertising.
08/16
Midjourney is releasing a new web editor to combine separate image actions into a single UI. ps: where do you use Midjourney: Web or Discord?
08/19
Procreate will never adopt Gen AI. With a brief 40-second video and a 150-word statement on its website, the digital art app made its position clear: no generative AI in their products, forever; users retain the right to their work; creativity is made, not generated. We don’t think it’s the right stance, but it is a bold one. Let’s see how the future shapes.
08/19
OpenAI banned a cluster of Iranian accounts trying to spread misinformation. They were using ChatGPT to generate fake and sensational headlines. Too bad for them, these posts didn’t get any engagement. 🍿Our Summary (also below)
08/19
56% of Fortune 500 companies cite AI as a “risk factor” in their recent annual reports.
08/21
You can now fine-tune GPT-4o. Yup, you read that right. The “best” model right now is up for customization, and not just adding fancy instructions over it. Fine-tuning will adapt the core model to your use case. OpenAI is making this free for up to 1M tokens every day throughout September.
08/21
Luma Labs released Dream Machine 1.5. It comes with higher-quality text-to-video and improved image-to-video. It also has a smarter understanding of your prompts and custom text rendering.
08/21
OpenAI partners with Condé Nast. According to the deal, content from Vogue, The New Yorker, GQ, Architectural Digest, Vanity Fair, Wired, and more will be shown in ChatGPT and the SearchGPT prototype.
08/22
a16z released their latest Top 100 Gen AI Consumer Apps report! Creative tools make up 52% of top web apps. Overall ChatGPT remains #1, but competition from Claude and Perplexity is getting stronger. Not much of a shocker but early Discord traffic predicts future app success. Also, new category alert: AI for aesthetics and dating.🍿Our Summary (also below)
08/22
Midjourney’s web platform is now open to all. No more compulsion to use Discord’s interface. They’ve also re-enabled free trials.
08/26
Last weekend Twitter woke up to a (not new) coding AI startup called Cursor. Cursor is a fork of VS Code but with tons of AI features and UI improvements. Even Karpathy is talking about it.
08/28
Google released three new models in a day: an improved version of Gemini 1.5 Flash (Exp 0827); an even smaller version of Flash, Gemini 1.5 Flash 8B; and an experimental version of Pro, Gemini 1.5 Pro (Exp 0827). More deets on the models and how to access them in 🍿Our Summary (also below)
08/29
Google made two updates to Gemini yesterday: Gems - Gems are custom versions of Gemini with specific instructions to act in a certain way. It's kinda like "Custom GPTs," but right now, it's mostly just a fancy prompt-saver. Gems are cuter than GPTs though.
09/03
ChatGPT now attracts 200M+ weekly active users and Zuck claims the number is 185M+ for Meta AI.
09/05
Ilya Sutskever’s new company Safe SuperIntelligence raises $1B, while OpenAI is trying to make custom chips, kickstart an infra-building plan, and create new Twitter accounts.
09/10
Replit introduced its AI agent, which creates apps for you. Perfect for non-coders like me, who were still left behind because we don’t know what to do with AI-generated code. Replit Agent uses the tools that Replit already has to help you make the app live. I made one yesterday (and a few more since).
09/10
OpenAI says 1M people use ChatGPT's business plans now. OpenAI is also considering more ChatGPT plans at higher price points (ranging from $200 to $2k). Employers: get your employees a team subscription, because they are already using AI with self-paid accounts.
09/10
TIME magazine released its annual AI 100 list - A list of the most impactful people in AI. As always, it attracted criticism for not including people in the thick of the AI development.
09/12
NotebookLM by Google is an underrated AI app to work with large documents, and it just got a new capability. It can now generate an expressive Audio Overview for any text material you put in. These overviews are not simply reading the text but rather feel like two people talking about the content you provided.
09/12
Klarna shuts down Salesforce and Workday in favour of internal tools built with AI.
09/17
OpenAI released two new models, o1-preview and o1-mini. These are the results of their “Strawberry” project. o1 models take their time thinking about a problem before giving a final answer and thus perform much better at reasoning or long-planning-based tasks. Both models are available in ChatGPT for Plus and Team users. 🍿All you need to know about o1.
09/17
Salesforce has released its agent builder called AgentForce, which allows you to create bots capable of taking action on their own, within established limits.
09/17
RunwayML announces a video-to-video mode for their Gen-3 Alpha model. Making videos in custom styles is going to get so much easier.
09/19
Snap is also launching a new AI video-generation tool for creators. Currently in beta for select creators on the web, the tool allows video creation from text prompts, with image prompts coming soon. Snap says it’s using its own AI models to generate the videos. ps: Snap also launched its AR glasses called Spectacles. I have the OG Spectacles from when you could get them from a vending machine - used them once, while skiing.
09/24
Sam Altman’s new blog claims that we are entering a new human age: The Intelligence Age. He casually drops that superintelligence could be achieved in a few thousand days.
09/24
OpenAI is launching OpenAI Academy to train devs and orgs with $1M in API credits and more resources. Google also announced a $120M fund for AI education.
09/24
Amazon launched two new AI tools in beta: 1) A selling assistant named Amelia and 2) A video ads generator for Amazon Ads.
09/26
Meta AI is getting more useful. Zuckerberg announced a ton of impressive AI upgrades at Meta Connect—including voice and image understanding in Meta AI, four new Llama models, features for creators, and a bonkers AR glasses prototype. 🍿 Our Summary
09/26
OpenAI also released their much anticipated Advanced Voice mode for ChatGPT. It has a new look and 5 new voices. But what’s OpenAI without some content for an HBO show: A day after the release, CTO Mira Murati and two other top execs are leaving the company.
09/26
Notion released a revamped version of Notion AI. You can now search across all your knowledge bases inside Notion and connected apps like Slack and Gdrive. It has a fresh look too, moving away from the AI sparkle logo.
09/26
Google subtly launched three new models on Tuesday under the same Gemini 1.5 branding, with major price reductions.
10/01
California’s governor Gavin Newsom vetoed SB 1047 - The controversial AI safety bill.
10/03
OpenAI launched 4 key upgrades for developers at their SF chapter of Dev Day. The biggest update is a new Realtime API that allows developers to create voice interactions like Advanced Voice mode. Others include prompt caching, fine-tuning on images and model distillation. It has also officially announced raising $6.6B at a $157B valuation (biggest venture round ever).
10/03
Microsoft's giving Copilot a proper glow-up, adding voice and vision to its consumer version. Boss-man Mustafa Suleyman says it'll be your new best mate, always there. Oh, and the new design for the app has got a whiff of Pi about it, almost like they used the same designer. Oh wait, they did!
10/08
ChatGPT has a new feature: Canvas. It’s a document editor inside ChatGPT available for the two most important use cases of LLMs—writing and coding. It is in beta but I took it for a spin. For writing, you can make manual changes to the AI’s output plus change its length, get suggestions and add emojis (way too many!). There’s a different suite of functions for coding too. 🍿 Our Summary.
10/08
Meta jumps into video generation with Meta Movie Gen. It can create custom videos from text, edit existing footage, and transform personal images into videos. Top it off with AI audio and sound effects too. Turns out Zuck isn’t just serving cute Llamas, he’s coming for OpenAI’s Sora too. This showcase comes with an (expected and appreciated) research paper. Now fingers crossed for the model.
10/10
This week AI researchers grabbed two Nobel Prizes. Geoffrey Hinton and John Hopfield were awarded the Nobel Prize for Physics for their work on neural networks in the 1980s. Also, the Nobel Prize for Chemistry was shared by Demis Hassabis and John Jumper for their work on protein folding with AlphaFold2 in 2020, with biochemist David Baker.
10/15
At the end of last week, Anthropic’s co-founder Dario Amodei put his thoughts on the potential upsides of AI in a 15,000-word essay. Dario talks about the radical changes and how they can be achieved practically in 5 areas.
10/15
Adobe announced a ton of generative AI features at Adobe Max. Photoshop got Distraction Removal to eliminate unwanted elements in one click, while Premiere Pro’s Generative Extend helps fill in small gaps (~2 secs) in footage. Their latest experiment is Project Concept, a tool for rapidly exploring creative directions using Firefly AI models.
10/17
Dropbox launches Dash for Business - AI-powered universal search across work apps. Find, summarize, and organize content with natural language queries. It has smart collections (Stacks) and admin controls for sensitive content. Dash works even if your files are not stored on Dropbox.
10/24
Claude learns to use computers - Anthropic released a “New” version of Claude 3.5 Sonnet with a) a massive boost to its coding powers (49% on SWE-bench, up from 33%) and b) the ability to navigate interfaces and use software like humans do (in beta).
10/31
OpenAI released a new benchmark called SimpleQA. It includes 4000+ single-line factual questions. It’s designed to be hard for most large language models. Claude and GPT models perform similarly on the benchmark but numbers for Gemini were not reported. I ran 10% of those questions with Gemini models. The score for Gemini-1.5-Flash is comparable to GPT-4o-mini and the same is true for Gemini-1.5-Pro and GPT-4o.
11/05
ChatGPT Search is live for Plus and Team users. After running the SearchGPT prototype, OpenAI has finally added it to the main ChatGPT experience. It searches the web automatically when it thinks it might be necessary (which might be annoying for some).
11/05
You can now dictate up to 10 minutes of voice messages on the Claude mobile app, and Claude now supports PDF uploads with images and charts in them too (up to 100 pages).
11/05
What if all of Minecraft was AI-generated? Well, we have some early results from Oasis: the first playable AI-generated game from Etched and Decart AI.
11/07
chat.com now points to ChatGPT. Dharmesh Shah, co-founder and CTO of HubSpot, who bought the domain last year for (reportedly) $15.5M, gave a cryptic prompt to figure out the sale price in his LinkedIn post.
11/07
Where can my team start using AI? This is a common question in our community. Do I hire a consultant? Hire for a new AI expert role? The simple way to start is with people already on your payroll. Try this: Let your team explore our tutorial library for an hour. We've organized it by function - marketing, sales, data, and operations. Ask them to pick 4-5 tutorials that resonate with their daily work—the tasks they spend hours on each week.
11/12
Semrush's massive study (200K keywords) reveals that AI overviews dominate long-tail informational searches but barely touch transactional queries. Only 46% of overviews include the #1 organic result, and they pull links from way beyond the top 10 rankings.
11/14
Slack's latest survey shows a surprising reality check on AI usage at work. Two themes stand out: Lack of proper training - 61% of desk workers have spent less than 5 hours learning AI. Fear of using AI - Nearly half of desk workers will be uncomfortable telling their managers that they used AI.
11/19
Mistral just copied ChatGPT's homework and made it free. Their chat app now has multiple things that ChatGPT Plus offers—web search, image generation, agents (like GPTs) and even a Canvas feature for co-editing with AI. Thanks to their new 124B model Pixtral Large, it also understands documents (with images and charts too). You can try it at chat.mistral.ai (plus there’s API and weights access to the model for devs and researchers).
11/19
ChatGPT's macOS app can now work with apps to provide better answers. Well, only coding tools for now like VS Code, Xcode, and Terminal. Currently in beta for Plus and Team users, this feature lets ChatGPT see your code, terminal outputs, and editor content to give more contextual responses.
11/19
Stripe launched an agent toolkit. It will allow AI agents to handle payments, billing, and financial tasks. The toolkit also integrates with major AI agent frameworks making it easier for developers to build AI apps that can charge users based on metered usage, process refunds, and handle billing.
11/19
Windsurf by Codeium - New code editor on the block, giving Cursor some competition.
11/19
TikTok’s ad generation tool is now available to everyone (with Getty Images integration).
11/19
And some drama: a compilation of emails released in the Musk vs Altman court case.
11/21
Gemini finally got a feature to save information about you (they had a prototype when it was called Bard—RIP). It’s called “Saved Info” and is similar to Memory in ChatGPT. But unlike ChatGPT’s auto-saved memories, you explicitly tell Gemini what to remember about you. It’s less magical, but I am cool with it—it doesn’t inflate expectations.
11/26
Anthropic open-sourced the Model Context Protocol (MCP) for broader coverage of “AI + data” connections. Instead of creating custom integrations for each data source, developers can now use MCP as a universal connector between AI assistants and tools/databases. It’s only local on your machine for now.
11/28
Styles in Claude is a new way to customize its responses. You can now choose from preset options (Concise, Explanatory, Formal) or create custom styles by uploading your own writing samples. No more repeated prompting to match your preferred communication style.
11/28
OpenAI's Sora (the unreleased video generator) was temporarily leaked on Hugging Face by a group protesting early access program issues. The leak allowed users to generate 10-second, 1080p videos before being shut down. The group claims OpenAI is pressuring testers to spin positive narratives while not fairly compensating them for testing and feedback.
12/03
Tencent just released a video-generation model. Its generations are rated better than RunwayML’s Gen-3, Luma and other Chinese models. It's open source AND small enough (13B params) to run on your laptop! This year AI images went from "hmm, that looks AI-ish" to "wait, that's AI?"... well, looks like 2025 is gonna do the same thing to video.
12/03
Grok in X now refers to your username and profile image to create funny images like “Draw me as a Pixar character”. Quick nerdy side note: There are two models at work here, Grok (sees images and describes them), and Flux model by BFL (makes images). Usually, this kind of patchwork is why image generation in other AI apps feels... well, patchy. But, as models are getting better, the gaps are getting harder to point out.
12/05
Amazon is finally in the AI game. It has launched a new set of foundational models - Amazon Nova. These are not the best-performing models but they are much cheaper than comparable champs out there. It is enough to keep AWS customers on their platform and other features might attract new business too.
12/05
GenCast, a new weather prediction model, uses image-gen techniques to beat current prediction models.
12/05
Two major AI labs have unveiled systems to generate explorable 3D environments - Google DeepMind's Genie 2 and World Labs' world generation system. Both convert 2D images into interactive 3D spaces, but take different approaches to the challenge.
12/10
xAI now has its own image generation model, code-named Aurora. It now replaces Flux for generating images on Twitter’s Grok tab. Also, free Twitter users now have limited access to Grok.
12/10
Meta released Llama 3.3 70B - with comparable performance to Meta’s Llama 3.1 405B (and old GPT-4o). Zuck says the next stop is Llama 4 in 2025.
12/10
OpenAI launched a new $200/month Pro tier. ChatGPT Pro gives you unlimited access to everything OpenAI is cooking up.
12/10
OpenAI has finally released Sora, their text-to-video model, as a full-blown product.
12/10
And in other side quests, Google made a new quantum chip. Maybe nothing, maybe something ¯\_(ツ)_/¯
12/12
Google kicked off the Gemini 2.0 series yesterday and it’s full of bangers. Here are the highlights: The new Gemini 2.0 Flash model - It’s fast and matches 1.5 Pro's performance. It’s available in Gemini Advanced as an experimental model.
12/12
Google just gave Gemini a serious research upgrade. The new Deep Research mode can digest hundreds of websites at once and compile everything from market trends to technical analyses.
12/12
Gemini 2.0 Flash has multimodal output, i.e. it can generate audio (available now) and images (yet to come) in real-time. Plus the X factor, this model has native tool use and is perfect for agents. Right now, it’s free with some limits but it’ll likely be dirt cheap. It’s a feast for developers.
12/12
Google’s future is all about agents (set to come in 2025). We have updated Project Astra - a live voice companion, Project Mariner - an agent that can use your browser, Jules - a bug-fixing agent plus Agents in Games, Colab and more.
12/12
Midjourney quietly launches "Patchwork" - a multiplayer worldbuilding tool that mixes AI image gen with storytelling.
12/12
Devin - the AI software engineer is now generally available starting at $500/month.
12/17
Google has announced Veo 2 - their video generation model. Results from early testers look better than Sora, especially when movement/physics is involved.
12/17
ChatGPT gets Projects - A much-needed way to organize ChatGPT’s chats (and a much better experience than Custom GPTs).
12/17
Midjourney Moodboards - Use a collection of images to personalize models with just a minimum of 40 rankings and stable performance with 200 rankings.