Sam Breed

Product Developer, Investor

Links, November 2023

Every month I turn my open browser tabs into a blog post.

SDVideo is good with clouds and reflections but is otherwise hit or miss

The Year of the Large Language Model

It has been one year since ChatGPT was released. Having been in the AI space before then now qualifies as being “early.” And things still seem to be speeding up.

Although multi-modal capabilities have existed in some form for some time (notably, CLIP embeddings offer a shared latent space for text and images), crossing between modalities seamlessly with a single model has been challenging. No more.

The GPT-4V lineage of models successfully interprets and produces images, which has almost immediately led to some very interesting use cases. Watch the video below from Twitter user @tazsingh to get a sense of what I mean.

How to draw an owl with @tldraw

https://twitter.com/tazsingh/status/1729578330200891552

The first link on deck is about the platform used to make the demo, tldraw.

What’s more, good video generation seems to be right around the corner. Stability AI released SDVideo this month, their model and weights for generating short video clips of 1-2 seconds. The video at the top of the page, an oceanside mountain framed by clouds, was first generated in DALL-E 3 and then animated with SDVideo. Of a number of samples I tried, this was the best one. It’s not quite there yet, but it’s getting close to being quite good.

Skeptics may argue that LLMs are neither new nor special, and that the hype around them is just that. But the bottom line is that computers are getting new features and abilities that were well out of reach even four years ago. There is something happening here.

AI

Front End

Misc

Tech

And one very nice personal website:

