A dance with LLMs and a whirlwind of generative AI gems

About a year and a half ago, I ventured into the exciting world of large language models (LLMs). Time flies, right? I joined a crew at Google that was all about pushing the limits of these language-savvy machines.

During this period, each quarter felt like a year, but each day felt like a flash. I had the honor of being part of the Gemini API launch party. Two of my major projects from 2023, the Semantic Retriever API and Gemini’s function calling capability, were released in beta as part of the Gemini launch.

After the launch madness subsided, I finally had some time to chill over the holidays and let my thoughts merry-nate. Now, I’m writing this post to reflect on my journey with large-ish models.

Early Adventures and Lessons Learned

We experimented with all sorts of fun stuff. We taught LLMs to be culinary wizards, helping fast food joints take orders. We even transformed our office coffee bar into an AI-powered barista. Talk about a caffeine boost!

Others played around with LLMs as brainstorming buddies, news aggregators, and helpful tools for folks who work with their brains all day. And let’s not forget the ambitious soul who thought waving an LLM at their calendar would magically align all their meetings. Well, that didn’t quite pan out.

We realized that LLMs were great at being chatty, but we needed to keep them on track. After all, no one wants an AI that spews nonsense. We also bumped into challenges when LLMs tried to handle complex inventory systems or regional language quirks.

It became clear that LLMs excelled at creative tasks like writing and brainstorming. But the trick was to make them both imaginative and grounded.

Through all this, two things stood out:

  1. LLMs craved access to the outside world.
  2. They loved having extra tools to help them out.

Supercharging LLMs: Function Calling and Semantic Retrieval

That’s when I got my hands on some groovy features. “Function calling” is like handing LLMs a magic wand, empowering them to summon the powers of other software systems; delegating tasks is the key. “Semantic retrieval” gives them a library card, granting access to a vast ocean of knowledge so that models can stay grounded in a set of sources.
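The function-calling pattern boils down to a simple loop, which can be sketched in plain Python. This is an illustrative toy, not the actual Gemini API: the names `get_weather`, `TOOLS`, and `dispatch` are all hypothetical. The model emits a structured request naming a function and its arguments, the application executes it, and the result goes back to the model as context.

```python
import json

# Hypothetical tool the model is allowed to call.
def get_weather(city: str) -> str:
    # A real system would query a weather service here.
    return f"Sunny and 22C in {city}"

# Registry mapping function names the model may request to callables.
TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Execute a structured function-call request emitted by the model.

    The model is prompted to answer with JSON like:
    {"function": "get_weather", "args": {"city": "Zurich"}}
    """
    request = json.loads(model_output)
    fn = TOOLS[request["function"]]
    return fn(**request["args"])

# Simulated model output requesting a tool call; the returned string
# would be appended to the conversation for the model's next turn.
result = dispatch('{"function": "get_weather", "args": {"city": "Zurich"}}')
print(result)
```

The key design choice is that the model never runs code itself: it only names a function from an allow-list, and the application stays in control of execution.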

LLMs: The Ultimate Computer Whisperers

LLMs are the missing link between humans and computers. They’re like super-intuitive touchpads that can make communication a breeze. They can help us navigate massive knowledge bases, make sense of huge databases, and understand relationships in complex networks.
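The retrieval side of that navigation can be sketched just as compactly. This is a minimal illustration, assuming toy bag-of-words vectors in place of a real embedding model: score each document by cosine similarity to the query and hand the best matches to the model as grounding context.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "The espresso machine needs descaling every month.",
    "Function calling lets a model invoke external tools.",
    "Quarterly planning happens in the first week.",
]
top = retrieve("how do I descale the espresso machine", docs)
print(top)
```

A production system swaps the bag-of-words vectors for learned embeddings and a vector index, but the shape of the pipeline, embed, rank, then stuff the winners into the prompt, stays the same.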

Imagine a future where websites transform into interactive playgrounds with chat boxes and display canvases. LLMs will be the chatty concierges, summarizing content on the fly and displaying detailed information as you browse.

Generative AI: Looking Forward

Generative AI can make our lives better and help us tackle big challenges. LLMs can be personalized tutors in education, shatter language barriers, and democratize access to information. And get ready for a revolution in content creation! 2024 will see a boom in AI-powered image, video, and other media tools.

But wait, there’s more! We need to figure out the whole copyright and authenticity thing. Maybe tracking where ideas come from (provenance) and sharing royalties can strike a balance between creativity and protecting artists’ rights.

Final Musings

2024 is going to be a wild ride for AI. I’m stoked to see how LLMs will continue to evolve and shape our lives. Let’s navigate the ethical waters responsibly and harness this technology for the betterment of humanity. After all, the future is ours to chat with!

This post is licensed under CC BY-NC-SA 4.0 by the author.