
Komos AI
Automation
Turn screen recordings and plain-language instructions into AI automations that run browser tasks, integrations, schedules, and webhooks.
Overview
Komos AI is an AI automation platform where agents learn workflows by watching a screen recording and listening to your explanation, then build the automation for you. It can combine browser automation with integrations such as Gmail, Google Drive, Google Sheets, Notion, and many more, making it useful for repetitive web tasks that would be tedious to build manually in tools like Zapier or n8n. It is strongest for structured browser and SaaS workflows, especially when you can use integrations for login-heavy or bot-protected sites.
Platforms
- Web
Video transcript
People use AI automation to skyrocket their productivity. But these AI workflow builders like n8n or Zapier are very complicated and you have to spend a lot of time to learn how to use them then set up these complex workflows and fix all the problems and bugs that show up. You also have to handle variables, API keys, if else statements. It's almost like real programming. But what if you could record your screen just once, explain what you're doing, and then the AI will watch it and build the automation for you?
And if you need to change or fix something, you just chat with an AI agent, and it makes the changes for you. Today, I'm going to show you a tool that does exactly that. And together, we will look at four real-world workflows. One, a web scraper that gets the latest prices from Amazon and stores them in a Google sheet. Two, an invoice processor that automatically downloads invoices from my Gmail account, uploads them to Google Drive, and stores all the costs in a Google sheet.
Three, a daily AI news briefing that scrapes the latest AI news from the internet and then sends me a summary email once a day. And lastly, a lead generator that automatically finds email addresses for the latest AI tools and drafts a personalized email to them. We will see how the tool handles navigation, login, and even CAPTCHAs and email one-time passwords. I'll show you all the features and how to use them. And at the end of this video, you will know exactly if this tool is the right fit for you.
My name is Florian Walther and this is the AI tool corner where I review the latest AI software to find out which ones can actually improve our lives and businesses. The tool is called Komos AI, AI agents that learn from watching your work. I'm going to log into my account. Then we get to the dashboard. And here we can set up different AI agents that can do all kinds of work for us.
We could build these agents manually, but the easier way is to use Moss, which is the AI agent. We will build one workflow together from scratch and then I'll just show you the other ones I have prepared. For this first workflow, I want to scrape Amazon.com for the prices of this graphics card RTX 5070. You could use this, for example, to get price alerts when something gets cheaper or to monitor your competitor's prices. We could now chat with Moss via text and tell it exactly what we want to do.
But the even easier way is to record our screen and just show it. So, the mic is configured. We are sharing the screen and I have these floating controls here. So, I can just switch over to amazon.com and then when I'm ready, click start recording and explain what I want the AI automation to do. Let's give this a try.
I want you to go to amazon.com and search for the RTX5070 graphics card. Then I want you to note all the prices for this graphics card by the different manufacturers. So for example here we have Gigabyte, we have MSI, PNY and so on. Ignore duplicates. If we have, for example, Gigabyte two times, I want you to only note them once.
Also ignore any product that's not a graphics card. If there are any other products in here or any combinations like complete PCs, I want you to ignore them. I only need the graphics card and the manufacturer and the price. Do this for the first page and also the second page. So down here you can navigate to page two and again repeat the same process.
Note the prices for all the graphics cards by the different manufacturers. Then I want you to go into this Google Sheets file called RTX 5070 prices. The Google Sheets integration is already set up. So you should have access to this. Open it and then you will find a template in here.
Here in the first line, enter the current date and then the prices for the graphics card for all these different manufacturers. So for example, today is the 10th March and then I want you to enter PNY and whatever price the graphics card had on that day. Enter it here in dollar amounts. Do this for all graphics cards that you found, but without duplicates. Do this every day at 8:00 a.m.
And also send me an email to florian@codinginflow.com with the scraped data. If you need any other information from me, just tell me in the chat. When I'm done, I click stop. The max duration for these recordings is 5 minutes at the moment, but this is more than plenty. I didn't even use half of it.
So now this session recording is attached to the chat. And then I just tell the AI to follow the instructions in the attached screen recording and send the summary email to florian@codinginflow.com and send this to the agent. Now the agent analyzes the screen recording and then builds the automation. Now in the instructions I mentioned the Google Sheets integration. We will take a closer look at these integrations and how to set them up later.
Okay, so now the bot summarized what we are trying to do and what it saw in the screen recording. But I wasn't clear enough in my instructions. The AI thinks I want a summary of the screen recording sent to this email address. So I'm going to clarify this.
Build the automation, run it every day at 8:00 a.m. and send the scraped data to florian@codinginflow.com. Now we can see that the agent is building the automation and we have to wait a few minutes. It actually took only 1 minute. And now we have our automation here and we can open it.
Here we have a summary of what it does. And then we can take a look at the actual workflow. This is what we would have to build ourselves in the past, but the AI built it for us. Of course, this is a rather simple workflow. We will look at more complex ones later.
And here when we open this node, we can take a closer look at the instructions. This was all generated by the AI from my screen recording. So it explains that it has to go to amazon.com and search for RTX 5070. It scrapes all the data with the unique manufacturer names. It explains how it navigates to different pages, deduplicates manufacturer names, and stores the data in Google Sheets, in what format, and so on.
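The deduplication rule described here (one price per manufacturer, duplicates ignored) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Komos implements it: the function name, the choice to keep the first price seen, and the sample rows are all assumptions.

```python
from typing import Iterable


def dedupe_by_manufacturer(listings: Iterable[tuple[str, float]]) -> dict[str, float]:
    """Keep the first price seen for each manufacturer, ignoring case and spacing."""
    prices: dict[str, float] = {}
    seen: set[str] = set()
    for manufacturer, price in listings:
        key = manufacturer.strip().lower()  # "gigabyte " and "Gigabyte" count as one
        if key in seen:
            continue  # duplicate manufacturer: note it only once
        seen.add(key)
        prices[manufacturer.strip()] = price
    return prices


# Made-up sample rows, not real Amazon data.
sample = [("Gigabyte", 599.99), ("MSI", 629.00), ("gigabyte ", 649.99), ("PNY", 579.99)]
print(dedupe_by_manufacturer(sample))
```

Keeping the first occurrence is a design choice made for this sketch. Keeping the lowest price per manufacturer would be an equally reasonable rule, and the recording does not say which one the agent applies.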
Down here we can see the integrations it uses. It does web search and file operations. It uses Gmail and Google Sheets to store the data and send the email to me. We have output variables, but we don't need to care about this because this is taken care of by the AI for us. We don't want to have to handle variables ourselves, right?
So, let's not worry our beautiful heads about it and give this a test run. There is just one little mistake in here. It goes to Amazon.de instead of Amazon.com. Maybe I opened the .de site. I don't remember.
But the cool thing is we can now tell the AI to change this. Actually, I want to scrape Amazon.com and not Amazon.de. I use Wispr Flow for the voice input. I have a video on that as well. And now the agent will change our workflow to Amazon.com.
It's done. We have to confirm the changes. Let's take a look into this node. And indeed, it's updated to Amazon.com. And now we can give this a test run up here via the run draft button.
And now we can actually watch the AI execute this workflow in the web browser. And we can observe all the actions it takes here in this agent action tab. So, it starts by opening amazon.com and searching for RTX 5070. And now the AI will use the browser to extract all the data. This takes a while, but of course, we don't have to keep watching this.
We can just go about our day. But it's still interesting to see what exactly it does. So, we have these different tool cards here and we can see the details about them. We can see the exact reasoning for all the steps it takes and watch it run the automation live. Now, of course, some websites require you to log in before you can use them.
And this is also possible. We can set up credentials and I will show you an example later. Now, we are on page three. So, the agent knows how to navigate between pages and then it goes ahead and extracts the data in all these pages. You can see a summary of everything it did.
And then let's wait until it's finished. So, now we can go into Google Sheets and we find the inserted data right here. Now, it didn't delete the template data, which is fine. I didn't tell it to do it. But if we remove this and move these up here, then we have our price data for today.
And there are no duplicates in here. And now we could run this automatically every day. And we will get a new column for each of these different days. And we can observe the prices. We also got two emails.
One contains the summary that I told the bot to send me. And we also get a summary of the run itself, so we know whether it succeeded or if the bot ran into any problems. I will show you even cooler examples in just a few minutes. So, make sure to watch the whole video.
In the workflow, we can take a look at the run settings and the different options we have available. Right now, we run this every day at 8:00 a.m. We could also trigger this via an email. So, we get a special email address and whenever we send an email there, this automation will run. Or we could trigger it via an API request.
This is useful if you are developing an app and you want to trigger these automations from within your app programmatically. When the automation is finished, you can also publish it, which basically creates a version history. But as you saw, this also works if we just run it as a draft. Let's take a look at some other automations I've prepared. For example, this invoice processing automation.
Of course, I created this via the agent. I didn't build this myself. And all these instructions were generated by Moss. So this one searches my Gmail account for unread emails that contain an invoice. So it checks the PDF file name and it also analyzes the email content itself to decide if it's actually an invoice or something else.
It actually opens the PDF, extracts all the invoice data and puts it into Google Sheets, and it also stores the complete invoice in Google Drive. Then it marks the email as read and gives it the invoices label so everything is organized. Let's give this a test run. And in my inbox, I have three emails that contain invoices. And these invoices also have a different structure. We also have this one email that has a PDF attachment, which is not an invoice, to see if the agent can properly distinguish them.
Let's give this a test run. Again, it opens a browser, but this time it will actually not browse any websites because everything can be handled directly via integrations. What's also interesting is that the AI actually learns from past runs and it stores memories and learnings about the past. So for example here earlier it ran into a 404 error, but it learned from it and stored this information so it doesn't make this mistake again. So these agents learn and they can figure out workarounds for problems autonomously.
So now after it's finished we should see three things. One, in our email inbox, the three invoice emails are marked as read and organized into the invoices label. The email that does not contain an invoice was ignored. Perfect.
Next, in our Google Drive folder, all the invoices should be stored in this invoices folder. And there they are, extracted from our email. And in Google Sheets, we should have entries in this costs sheet. And there we go. It extracted the company name, the invoice date, and the amount we have to pay from the invoices.
But we can get way more advanced than that. This Medium automation scrapes the top 10 articles in the AI niche from medium.com. It summarizes them and then sends me a digest email. What makes this so tricky is that one, we have to log into our account to even access these articles. But the login process is also quite complicated because it doesn't have a dedicated login page.
Instead, it opens this dialogue. We have to click on sign in with email. And even worse, we can't enter a password here. Instead, we get a confirmation email sent to our inbox with a code. And then we have to enter the code here in order to get into our account.
Can the bot handle this? Spoiler alert, it can. For this, I had to set up credentials which we can find in our sidebar here. So, I set up this medium.com credential with the website and the login URL. But again, the login URL is actually the root URL because it's a login dialogue and not a dedicated page.
I entered my account's email address here. And I also entered the password, but this is actually a fake password because remember Medium doesn't use password login. It uses email verification. Luckily, Komos has a feature for that. So down here under two factor authentication, I set up email one-time password.
Then you get the special address where you have to forward the verification email to. You can find the instructions for Gmail and Outlook here. This is just email forwarding. You have to set this up. It's not very complicated.
Just follow the step-by-step instructions. And I set this up to forward all these medium.com verification emails to this address. The bot can then automatically read and extract the code and use it for login. Down here, I also entered some usage notes for the AI and the bot actually reads these instructions. So, I told it the login form is inside a dialogue.
You have to click the sign-in button which opens the dialogue and then you click on the email option and you will find the input field. Ignore the password I gave you because this site uses an email code for every login. No password. The code will be sent to the connected OTP email address. So those are instructions for the AI.
This workflow is a bit more involved, but again the agent built this for me. I didn't set this up manually. Let's give this a test run. Again we can watch the bot execute all the steps live. So now it's on the medium.com landing page.
Now it has to log in. So the bot opened the dialogue. Now it has to click on the sign in with email option. We can see it moving the mouse live. It clicked the correct button.
Now it should enter the account's email address. And I'm not doing anything. All of this happens automatically. There we go. It enters it.
Click on continue. And now comes the tricky part. We receive a one-time password via email. It gets forwarded to Komos's inbox and then it should enter the code automatically. There we go.
This is so impressive to watch. And boom, we are inside our account. Would you look at that? Now it will open each of the top 10 articles, summarize them, and then send me an email. By the way, if you are worried about inserting your login credentials here, they are encrypted, which means that no one can actually see them and they can't be leaked.
So the Komos team cannot see your raw password. Here in the actions, we can see that it extracts the article URLs. So it can then open them one by one and summarize them. The automation has finished successfully and the result is this email. Super cool.
So we have the link to each article that we can open directly on medium.com, and then we have an AI-generated summary in bullet points of each of these articles. This is the kind of stuff you can do with these automations, which can make you so much more productive. Let's take a look at one more automation, and this will be the peak of this video because this automation contains the most steps and uses the most tools. This one scrapes Product Hunt for the most popular AI tools of yesterday. For this, it has to enter a dynamic URL. As you can see here, that's the Product Hunt URL.
And here it contains the year, the month, and the day. And I told the AI to fill this out automatically with yesterday's date. Then open the first five products. Actually open the website of each product and then find an email address for each of them. For this, it will actually search through the website on different pages like contact, about, team, or the footer for an email address.
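Filling a URL with yesterday's date, as described above, might look like the following Python sketch. The URL template is a placeholder: the transcript only says the address contains the year, the month, and the day, so the host, the path, and the unpadded number format are assumptions.

```python
from datetime import date, timedelta

# Hypothetical template; the real Product Hunt path is not given in the transcript.
URL_TEMPLATE = "https://example.com/leaderboard/daily/{year}/{month}/{day}"


def yesterday_url(today: date, template: str = URL_TEMPLATE) -> str:
    """Fill the template with the date one day before `today`."""
    target = today - timedelta(days=1)  # handles month and year boundaries
    return template.format(year=target.year, month=target.month, day=target.day)


# Passing the date in explicitly keeps the function easy to test.
print(yesterday_url(date(2025, 3, 1)))
print(yesterday_url(date.today()))
```

Subtracting a `timedelta` instead of decrementing the day number by hand avoids errors on the first day of a month or year.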
Then if it can't find an email address, it will search Hunter.io. And Hunter.io is a tool where you can find email addresses by the company URL. For example, if I enter openai.com, I can find the email addresses here. If our automation doesn't find the email address on the website, it will make a request to Hunter.io and try to get it from there. From the website's content, it will also create a summary of what the product does.
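The lookup order described above (check the site's own pages first, fall back to an email-finder service only if nothing turns up) can be sketched as follows. Everything here is illustrative: `find_contact_email`, `fake_lookup`, and the page names are hypothetical, the regular expression is a simple approximation of an email address, and no real Hunter.io API is called.

```python
import re
from typing import Callable, Optional

# Simple approximation of an email address; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_contact_email(
    pages: dict[str, str],
    fallback: Callable[[str], Optional[str]],
    domain: str,
) -> Optional[str]:
    """Return the first email found in the page texts, else ask the fallback."""
    for page_name in ("contact", "about", "team", "footer"):
        match = EMAIL_RE.search(pages.get(page_name, ""))
        if match:
            return match.group(0)
    return fallback(domain)  # only reached when the site itself shows no address


def fake_lookup(domain: str) -> Optional[str]:
    # Placeholder standing in for an email-finder service; no network request is made.
    return f"hello@{domain}" if domain == "example.com" else None


pages = {"about": "We build things.", "contact": "Write to us at team@example.org."}
print(find_contact_email(pages, fake_lookup, "example.org"))
print(find_contact_email({"about": "no address here"}, fake_lookup, "example.com"))
```

Passing the fallback in as a function keeps the page search independent of whichever lookup service is used.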
Then it will enter all of this information into a Google sheet. And finally, if an email exists for this tool, it will draft a personalized email to the founders with a collaboration invite. We will run this and then I will show you the result. And because this contains multiple steps and even a loop, it has to work with different variables. But again, we don't have to understand this.
The bot set all of this up and I don't even have a clue how this works. I only know that it works. So, let's give this a test run. And Product Hunt actually has a CAPTCHA that the bot has to resolve. Let's see if it can do that.
And voila, we are in. Now, heads up, it doesn't work that well on all websites. Some websites, especially these very large social media platforms like LinkedIn and X, have very strong bot protection and I wasn't able to get through that with Komos AI. So ideally, you want to have an integration that you can use because then it will always work. We can see it opening and navigating through each of these products' websites and it actually searches for the email address.
It will look in the footer and on contact pages. Right now it's in the FAQ to find an email address. So the agent is done and the result is this. In this leads sheet, we now have each of these tools. It's only five because I actually changed it to only scrape the first five elements.
We have a short summary that was generated from the content of the website by running it through an LLM. And we have a contact email if it could find one. For Entropic, it didn't find one. And here is the mind-blowing part. In my Gmail drafts, I now have four drafts for these four tools where I found an email address with a personalized introduction email.
For example, I want to talk about Fish Audio. I've been checking out Fish Audio S2 and really like the focus on high-quality AI text-to-speech, blah blah blah. I would like to have a call and talk about ways to collaborate. The same one for this tool and for this one, all with a personalized first line. This is so cool.
I will give you my full summary of Komos AI in just a minute. But first, let's take a look at the available integrations. So throughout this video, we used the Gmail, Google Drive, and Google Sheets integrations, but there are many other ones available that you can find here in this integrations list. And they keep adding more and more. To connect an integration, just click on connect, which will then forward you to an authentication screen where you just have to log in with your account.
It's extremely simple. Then just tell the AI agent to use the integration like I did in the beginning. If there is no integration available but you have an API key, then you can also give this to the agent directly and tell it to use the API. Let's take a look at these other features here in the sidebar. So we can set up schedules which we did for our daily price tracker and medium digest.
This will run every day at 8:00 a.m. or 9:00 a.m. And you can update or delete this. You can add new schedules. In the skills tab, we can give our AI additional instructions which are basically just prompts that we can then reference in our automations.
You can set this up here, give this a name, and then just tell the agent to use this skill. But I haven't needed this yet. As I mentioned earlier, Komos also offers an API and webhooks. If you want to trigger an automation from within your app, you can set this up here, but this is too technical for this video. You can find the instructions in the documentation.
Everything is explained here. Lastly, you can set a self-hosted runner here under environments. So, by default, our automations run in this managed sandbox which is provided by Komos. And this is all you need to run these browser automations. A self-hosted runner is useful if you use Komos within your company and you need to access an intranet (an internal corporate network).
Then you can install a self-hosted runner directly on your machine and it will have access to that intranet. And this way it can also reuse existing login sessions. But I didn't set this up here. I used the managed sandbox throughout this video. Okay, here is my final summary of Komos AI and who should use this tool and who shouldn't.
The screen recording feature worked extremely well. I just explained to the bot verbally what I needed to do and I didn't need to edit any of these workflows manually. Any changes that I needed were also done by the Moss agent. The agent understands tasks really well and it can even handle complex multi-step workflows with lots of instructions. It can handle logins with CAPTCHAs and email-based one-time passwords.
However, this will usually not work for a big website like LinkedIn because they have really sophisticated bot protection. That's why you should use integrations whenever possible. The bot can make decisions autonomously, summarize information, and you can trigger these workflows from a schedule, via an email trigger, or even via the API. And if you need to access a local network, for example, on your company's intranet, you can set up a self-hosted runner. The only limitation is that these automations only run within a web browser.
So, this will not have access to your computer. If you need to manage any files directly on your machine, Komos can't do that. But the benefit is that this is more secure because this runs in a sandbox, which means that it can't do any damage on your machine. I will put a link to Komos AI into the video description below. They have a 7-day free trial, so you can try this out yourself.
Also, if they ask you where you found this tool, it would be nice if you mention my name, Florian, from the AI Tool Corner YouTube channel. And if you want to see a comparison to other AI workflow builders, I have a full playlist that I will link here in the end card. Check this out next. Then I wish you a nice rest of the day. Take care.
What it's great for
- Scrape product prices from websites and store daily results in Google Sheets
- Process invoice emails by extracting PDF data, saving files to Google Drive, and updating a cost tracker
- Create a daily AI news briefing from articles behind a login and send the summary by email
- Find leads from Product Hunt, search websites for contact emails, and draft personalized outreach messages
- Trigger repetitive browser-based workflows from schedules, incoming emails, or API requests
Verdict
Komos AI is a strong fit when the painful part of automation is translating a real browser task into workflow-builder logic. It is not a universal desktop automation tool, and bot-protected sites can still be a problem, but for repetitive SaaS and browser workflows it offers a much easier starting point than building every step manually.
FAQ
What is Komos AI used for?
Komos AI is used to automate repetitive browser and SaaS workflows. You can record a task, explain what should happen, and let the AI create automations for scraping data, processing emails and documents, updating spreadsheets, sending summaries, or drafting outreach.
Do I have to build Komos workflows manually?
No. You can build workflows manually, but the main advantage is the Moss agent, which can watch a screen recording, understand your spoken instructions, generate the workflow, and update it later through chat.
Can Komos AI handle website logins?
Yes, for many sites. Komos can use encrypted credentials and can handle email one-time passwords through forwarded verification emails. Very large platforms with strong bot protection may still block automation, so native integrations are preferable when available.
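To illustrate what reading a one-time password out of a forwarded verification email involves, here is a minimal Python sketch. It is not Komos's implementation. It assumes the code is a standalone run of exactly six digits, which is common but not confirmed for any particular site, and `extract_otp` is a hypothetical helper name.

```python
import re
from typing import Optional

# Assumption: the code is exactly six digits with no digit directly before or after.
CODE_RE = re.compile(r"(?<!\d)\d{6}(?!\d)")


def extract_otp(email_body: str) -> Optional[str]:
    """Return the first standalone six-digit code in the text, or None."""
    match = CODE_RE.search(email_body)
    return match.group(0) if match else None


# Made-up email text for illustration.
print(extract_otp("Your sign-in code is 482913. It expires in 10 minutes."))
print(extract_otp("Order 1234567 has shipped."))
```

The lookarounds keep the pattern from picking six digits out of a longer number, such as an order ID.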
How can Komos automations be triggered?
Automations can be run manually, scheduled for a specific time, triggered by sending an email to a generated address, or started programmatically through the API and webhook setup.
Can Komos control files on my computer?
No. Komos runs browser automations in a sandbox, which is safer but means it cannot directly manage local desktop files. For company intranet access or existing login sessions, a self-hosted runner can be configured.
