- September 2024 (1)
- August 2024 (2)
- July 2024 (2)
- May 2024 (2)
- April 2024 (2)
- February 2024 (2)
- April 2023 (1)
- March 2023 (2)
- September 2022 (1)
- February 2022 (1)
- November 2021 (1)
- March 2021 (1)
- February 2021 (2)
- August 2019 (1)
- November 2018 (1)
- May 2017 (1)
- December 2016 (1)
- April 2016 (1)
- August 2015 (1)
- December 2014 (1)
- August 2014 (1)
- March 2014 (1)
- December 2013 (1)
- October 2013 (3)
- September 2013 (4)
- August 2013 (2)
- July 2013 (1)
- June 2013 (1)
- February 2013 (1)
- October 2012 (1)
- June 2012 (1)
- May 2012 (1)
- April 2012 (1)
- February 2012 (1)
- October 2011 (1)
- June 2011 (1)
- May 2011 (1)
- April 2011 (1)
- March 2011 (1)
- February 2011 (1)
- January 2011 (1)
- December 2010 (3)
- November 2010 (1)
- October 2010 (1)
- September 2010 (1)
- August 2010 (1)
- July 2010 (1)
- May 2010 (3)
- April 2010 (1)
- March 2010 (2)
- February 2010 (3)
- January 2010 (4)
- December 2009 (2)
- November 2009 (5)
- October 2009 (2)
- September 2009 (2)
- August 2009 (3)
- July 2009 (1)
- May 2009 (1)
- April 2009 (1)
- March 2009 (5)
- February 2009 (5)
- January 2009 (5)
- December 2008 (3)
- November 2008 (7)
- October 2008 (4)
- September 2008 (2)
- August 2008 (1)
- July 2008 (1)
- June 2008 (1)
- May 2008 (1)
- April 2008 (1)
- January 2008 (5)
- December 2007 (3)
- March 2007 (3)
- February 2007 (1)
- January 2007 (2)
- December 2006 (4)
- November 2006 (18)
- 3D (5)
- AI (14)
- Admin (3)
- Blogging (5)
- Business of Software (9)
- Copyright (1)
- Dirigible (3)
- Django (1)
- Eee (3)
- Finance (6)
- Funny (11)
- GPU Computing (2)
- Gadgets (8)
- JavaScript (1)
- Linux (13)
- Memes (2)
- Meta (7)
- Music (4)
- NSLU2 offsite backup project (13)
- OLPC XO (2)
- Oddities (4)
- Personal (3)
- Politics (3)
- Programming (61)
- Python (36)
- PythonAnywhere (12)
- Quick links (2)
- Rants (4)
- Raspberry Pi (1)
- Resolver One (22)
- Resolver Systems (18)
- Robotics (8)
- Space (2)
- Talks (3)
- Uncategorized (5)
- VoIP (2)
- Website design (4)
Building an AI chatbot for beginners: part 0
[Note that this series kind of dried up; when I started the series, I knew that I knew very little about the subject, but I was hoping to learn better by learning in public. However, as time went by it turned out that this wasn't working. There are a lot of better tutorials out there!]
Like a lot of people, I've been blown away by the capabilities of Large Language Model (LLM) based systems over the last few months. I'm using ChatGPT regularly for all kinds of things, from generating basic code to debugging errors to writing emails.
I wanted to understand more about how these tools worked, and feel strongly that there's no better way to learn something than by doing it. Building an LLM is, at least right now, super-expensive -- in the millions of dollars (although maybe that will be coming down fast?). It also requires a lot of deep knowledge to get to something interesting. Perhaps something to try in the future, but not right now.
However, using LLMs to create something interesting -- that's much easier, especially because OpenAI have a powerful API, which provides ways to do all kinds of stuff. Most relevantly, they provide access to a Completion API. That, as I understand it, is the lowest-level way of interacting with an LLM, so building something out of it is probably the best bang for the buck for learning.
Over the last few weeks I've put together a bunch of things I found interesting, and have learned a lot. But it occurred to me that an even better way to learn stuff than by building it is to build it, and then explain it to someone else, even if that person is an abstract persona for "someone out there on the Internet". So: time for a LLM chatbot tutorial!
I'm writing this addressing myself as I was a few months back, so if it's of interest to anyone, it'll be for people who know Python reasonably well, but who haven't spent much or any time coding for LLMs. If you already know AI or Natural Language Processing well, you probably won't find anything new here (though I would be very grateful for any corrections!).
My goal is to keep things as simple as possible. The bots I've built in the past have been web-based or use Telegram to interact with the user. That adds on unnecessary complexity from the viewpoint of a tutorial, so I'll show how to build a basic chatbot that runs in a console as a single process. You'll run
python bot.py
...and be able to type in your messages. The bot will respond, using OpenAI as a backend.
So far I've planned out the steps required to go from a basic framework that bounces stuff off OpenAI, to something that can actually have a conversation, to something that avoids the obvious forms of prompt injection, to something that can have long-running conversations and understand the context, even if a user's message refers to something they said some time ago.
I'll be posting these as I write them. Hopefully there are people out there who'll find them useful!