Image by Author
ChatGPT has created quite a buzz in the world of AI, and we have since seen numerous other models with incremental improvements. But none of them focused on improving the interaction between humans and AI: you still need to give the model an excellent prompt to get your desired results. This is where AutoGPT stands out. It can “self-prompt” and critically review its own work. Are you curious to know about it? How does it work, and what makes it unique? And perhaps most importantly, what are its limitations? Let’s explore all of these questions together in this article.
AutoGPT is an open-source application developed by Toran Bruce Richards (game developer and founder of Significant Gravitas). It uses the GPT-3.5 or GPT-4 APIs to create fully autonomous AI agents. It stands out because you don’t need to steer the model step by step: you provide the task along with a list of objectives, and it handles the rest. Unlike ChatGPT, it can also access external resources to make its decisions. Did you know that it obtained more GitHub stars than PyTorch (a famous open-source ML library) within a few weeks of its release? Here is a graph showing its star history.
Image Generated by Star-History
Image by Author
AutoGPT combines the power of GPT-4 with that of a personal assistant to generate, prioritize, and execute tasks autonomously. Being an autonomous system, it creates AI agents that perform specific tasks and communicate with one another. The following steps describe how AutoGPT works:
Step 01: Input from the User
First, the user enters three inputs: an AI name, an AI role, and up to 5 goals. For example, I can create an AI named MarketResearchGPT whose role is to conduct market analysis of different items. I can set goals such as: perform market research for different phones, get the list of the top 5 with their pros and cons, arrange them in ascending order of price, summarize their user reviews, and terminate the process when done.
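To make the three inputs concrete, here is a minimal Python sketch of how they could be represented and validated. The class name AgentConfig, its field names, and the MAX_GOALS constant are hypothetical and chosen for illustration; they are not taken from the AutoGPT codebase.

```python
from dataclasses import dataclass, field

MAX_GOALS = 5  # the article states "up to 5 goals"


@dataclass
class AgentConfig:
    """Hypothetical container for the three user inputs (not AutoGPT's own class)."""
    ai_name: str
    ai_role: str
    goals: list = field(default_factory=list)

    def __post_init__(self):
        # Reject configurations that exceed the stated goal limit.
        if len(self.goals) > MAX_GOALS:
            raise ValueError(f"expected at most {MAX_GOALS} goals, got {len(self.goals)}")


config = AgentConfig(
    ai_name="MarketResearchGPT",
    ai_role="Conduct market analysis of different items",
    goals=[
        "Perform market research for different phones",
        "Get the list of the top 5 with their pros and cons",
        "Arrange them in ascending order of price",
        "Summarize their user reviews",
        "Terminate the process when done",
    ],
)
```

Passing a sixth goal to this sketch raises a ValueError, which mirrors the limit described above.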
Step 02: Task Creation Agent
Once the user has entered the input, the task creation agent interprets the goals, generates a list of tasks, and outlines the steps to achieve them. The resulting set of tasks is then passed to the task prioritization agent.
Step 03: Task Prioritization Agent
The task prioritization agent reviews the sequence of tasks to ensure that it makes logical sense. This avoids a deadlock, where the current task depends on the result of a task that has not been executed yet.
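One common way to guarantee that no task runs before its prerequisites is a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9+) to illustrate the idea. It is an assumption about how such ordering could be done, not a description of AutoGPT's internals, and the task names and the order_tasks helper are made up for this example.

```python
from graphlib import CycleError, TopologicalSorter


def order_tasks(dependencies):
    """Return tasks so that every task comes after the tasks it depends on.

    `dependencies` maps each task to the set of tasks it needs first.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # A cycle means two tasks wait on each other: the deadlock case.
        raise ValueError(f"circular task dependency: {err.args[1]}") from err


# Hypothetical tasks derived from the phone research example.
tasks = {
    "search phones": set(),
    "list top 5": {"search phones"},
    "sort by price": {"list top 5"},
    "summarize reviews": {"list top 5"},
}
ordered = order_tasks(tasks)
```

If two tasks depended on each other, order_tasks would raise an error instead of returning an order that can never be executed.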
Step 04: Task Execution Agent
The task execution agent, as the name suggests, uses GPT-4, the Internet, and other resources to perform these tasks.
Step 05: Communication Between Agents
Agents can communicate with each other to reach the user-defined goal. For example, if the execution agent produces unsatisfactory results, it can ask the task creation agent to generate a new list of tasks. Hence, it becomes an iterative process.
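The interplay between the three agents can be sketched as a simple loop. In the toy Python example below, all three functions are stand-ins that use fixed rules instead of calling an LLM, and their names and behavior are assumptions made for illustration only. It shows how an unsatisfactory result feeds back into task creation.

```python
from collections import deque


def create_tasks(goal, failed_task=None):
    """Stand-in for the task creation agent (a real one would query an LLM)."""
    if failed_task is None:
        return [f"summarize {goal}", f"research {goal}"]
    return [f"{failed_task} (refined)"]


def prioritize(tasks):
    """Stand-in for the prioritization agent: research runs before anything else."""
    return deque(sorted(tasks, key=lambda t: not t.startswith("research")))


def execute(task):
    """Stand-in for the execution agent.

    Toy rule: an unrefined research task counts as unsatisfactory.
    """
    ok = not (task.startswith("research") and "(refined)" not in task)
    return ok, f"result of '{task}'"


def run(goal, max_steps=10):
    queue = prioritize(create_tasks(goal))
    log = []
    for _ in range(max_steps):  # hard cap so the loop always terminates
        if not queue:
            break
        task = queue.popleft()
        ok, _output = execute(task)
        log.append((task, ok))
        if not ok:
            # Feedback step: ask the creation agent for a replacement task.
            queue = prioritize(list(queue) + create_tasks(goal, failed_task=task))
    return log


log = run("phones")
```

The max_steps cap is a design choice worth keeping in any real loop of this kind, since an agent that keeps generating new tasks would otherwise run (and incur API costs) indefinitely.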
Step 06: Final Result
The actions of these agents are visible on the user end in the following form:
Thoughts: The AI agent shares its thoughts after completing an action
Reasoning: It explains why it is choosing a particular course of action
Plan: The new set of tasks
Criticism: A critical review of its choices that identifies limitations or concerns
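If an agent reports these four fields as JSON, they can be extracted with a few lines of Python. The JSON layout below (a top-level "thoughts" object with "text", "reasoning", "plan", and "criticism" keys) is an assumed shape used for illustration, and parse_thoughts is a hypothetical helper, so check the actual output of your AutoGPT version before relying on it.

```python
import json

REQUIRED_FIELDS = ("text", "reasoning", "plan", "criticism")


def parse_thoughts(raw):
    """Extract the four displayed fields from an agent reply (assumed JSON shape)."""
    thoughts = json.loads(raw).get("thoughts", {})
    missing = [name for name in REQUIRED_FIELDS if name not in thoughts]
    if missing:
        raise ValueError(f"reply is missing fields: {missing}")
    return {name: thoughts[name] for name in REQUIRED_FIELDS}


# Hand-written sample reply, not real AutoGPT output.
sample_reply = json.dumps({
    "thoughts": {
        "text": "I should start by searching for current phone models.",
        "reasoning": "A search gives me up-to-date candidates to compare.",
        "plan": "- search phones\n- compare specs\n- rank by price",
        "criticism": "Search results may be biased toward sponsored listings.",
    }
})
parsed = parse_thoughts(sample_reply)
```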
It also uses external memory to keep track of history and learn from its past experiences to generate more precise results.
Although AutoGPT and ChatGPT are built on top of the same underlying GPT models, there are some key differences between them:
Access to Real-Time Data
ChatGPT relies on GPT models whose training data ends in September 2021, so it cannot provide real-time insights. AutoGPT can access external resources, such as the Internet, and incorporate current information into its responses.
Unlike ChatGPT, which requires constant prompts from the user, AutoGPT generates its own follow-up prompts. This is particularly helpful for idea generation.
ChatGPT’s memory is limited by the context window of the underlying LLM, such as GPT-4, while AutoGPT uses vector databases for both short-term and long-term memory management.
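The idea behind vector-based memory is to store past text as vectors and retrieve the entries most similar to a new query. The Python sketch below is a toy version under strong assumptions: it uses word counts instead of learned embeddings and a plain list instead of a vector database, and the Memory class is hypothetical rather than part of AutoGPT.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Memory:
    """Hypothetical in-memory store standing in for a vector database."""

    def __init__(self):
        self.items = []

    def add(self, text):
        self.items.append((text, embed(text)))

    def recall(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


memory = Memory()
memory.add("phone A costs 300 dollars")
memory.add("user reviews praise the camera")
best = memory.recall("what do reviews say about the camera")
```

Because only the most relevant entries are pulled back into the prompt, the agent can draw on a history that is far larger than a single context window.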
Image and Speech Functionalities
ChatGPT is limited to textual data, while AutoGPT can also generate images and convert text to speech.
You will need an OpenAI API key as AutoGPT is built on top of GPT. If you don’t have one, you can sign up for a free account to get some free credits. Follow the steps below to set up AutoGPT on your local computer.
Setting it Up
Clone the GitHub repository in your local directory using the following command:
git clone https://github.com/Significant-Gravitas/Auto-GPT.git
Navigate to the project directory using the following command:
cd Auto-GPT
Run the following command to download the required dependencies:
pip install -r requirements.txt
Locate the “.env.template” file in your Auto-GPT folder. If you cannot find it, make sure hidden files are visible. Create a copy of this file named “.env” using:
cp .env.template .env
Open the .env file and set OPENAI_API_KEY to the key that you generated from your OpenAI account. Save and close the .env file.
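Before starting AutoGPT, you may want to confirm that the key was actually saved. The Python sketch below parses the contents of a .env file and checks that OPENAI_API_KEY holds a real value. Both helper functions are hypothetical, and the placeholder string "your-openai-api-key" is an assumption about what the template contains, so adjust it to match your copy.

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(text, placeholder="your-openai-api-key"):
    """Return True if OPENAI_API_KEY is set to something other than the placeholder."""
    key = parse_env(text).get("OPENAI_API_KEY", "")
    return bool(key) and key != placeholder


# Inline sample contents; the key shown is fake.
sample_env = """
# OpenAI settings
OPENAI_API_KEY=sk-example-not-a-real-key
"""
key_is_set = has_api_key(sample_env)
```

To check your own file, you could pass in its contents with has_api_key(open(".env").read()).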
Run the following command to start AutoGPT:
python -m autogpt
If you are using GPT-3.5, run:
python -m autogpt --gpt3only
You are good to go now. If you run into any issues, please refer to the official documentation: Auto-GPT Setup
Although AutoGPT can generate content with minimal human intervention, it has some major downsides, such as high costs, limited functionality, inadequate understanding of context, data bias, limited creativity, and security risks. It is not yet capable of achieving AGI (Artificial General Intelligence) due to data quality, generalization, and explainability issues. Despite these shortcomings, it has huge potential to revolutionize our daily lives and the way we work. I hope you enjoyed reading the article. Do let me know in the comment section what you think about AutoGPT.
Kanwal Mehreen is an aspiring software developer with a keen interest in data science and applications of AI in medicine. Kanwal was selected as the Google Generation Scholar 2022 for the APAC region. Kanwal loves to share technical knowledge by writing articles on trending topics, and is passionate about improving the representation of women in the tech industry.