AI Agents & Automation
AI tools for General-Purpose Agents in AI Agents & Automation, sorted by confidence score (our independent quality rating).
desk-emoji
Desk-Emoji is an open-source AI desktop robot with an industrial-style design that doubles as a desk decoration. It aims to deliver the performance of more expensive desktop robots at a fraction of the price. A 2-degree-of-freedom gimbal drives its head movements, and tuned emoji animations and motion algorithms give it smooth, lively emotional expressions. The robot acts out the emotional tone of its replies, supports gesture recognition for interaction, and supports voice chat backed by large language models (LLMs).
esp-who
ESP-WHO is an image processing development platform built upon Espressif chips, offering a robust framework for AI-powered vision applications. It includes development examples for key functionalities such as human face detection, human face recognition, and pedestrian detection, enabling developers to create a wide range of practical applications. The platform is based on ESP-DL and supports various peripherals, allowing for interesting integrations. Recent updates include full refactoring, support for the new ESP-DL and ESP32-P4 chip, asynchronous camera and deep learning model operation for higher FPS, and integration with lvgl for graphical applications. It also features a new pedestrian detection model, making it a comprehensive solution for embedded vision projects.
susi_gassistantbot
susi_gassistantbot is an open-source project designed to integrate SUSI AI with Google Assistant, enabling developers to create custom voice-controlled applications and AI agents. The project provides a framework for building functionalities on Google Assistant using the SUSI AI platform. It requires setting up a project on Google's Actions console, configuring API.AI (now Dialogflow) with intents and webhooks, and deploying the application to a platform like Heroku. This tool is ideal for developers looking to extend Google Assistant's capabilities with custom AI logic from SUSI, offering a flexible way to build interactive voice experiences.
Phronetic AI
Phronetic AI is a platform designed for building and deploying AI agents, particularly focusing on high-stakes environments like financial systems and national security. It emphasizes zero-trust architecture and air-gapped deployment for enhanced security. The platform offers specialized agents like ClipGen for video creation, Talkument for document interaction, and Codeshwar for AI development. Phronetic AI provides solutions tailored for the BFSI ecosystem, including banking, lending, financial services, insurance, and payments, with agents trained on domain-specific workflows and regulatory requirements. It also supports air-gapped environments for classified document processing and secure communications analysis, ensuring 100% offline operation.
JobHire.AI
JobHire.AI is an AI-powered career assistant designed to streamline the job search process. It automates job applications, allowing users to apply to hundreds of jobs matching their criteria without manual effort. The platform includes an AI resume builder and cover letter generator to optimize applications, bypass ATS filters, and increase interview chances. Users can track their application activity through a built-in dashboard, saving significant time. JobHire.AI aims to make job searching more efficient and effective, offering features like resume matching and score checks to boost career growth.
tensorforce
Tensorforce is an open-source deep reinforcement learning framework built on TensorFlow, designed for both research and practical applications. It stands out for its modular, component-based design, allowing for highly configurable feature implementations. A key differentiator is the separation of the RL algorithm from the application, making algorithms agnostic to input and output structures. The entire reinforcement learning logic, including control flow, is implemented in TensorFlow, enabling portable computation graphs. It supports a wide range of features including various network layers, memory types, policy distributions, reward estimation, training objectives, and optimization algorithms. Tensorforce also offers extensive exploration techniques, preprocessing options, and regularization methods, making it a versatile tool for developing and training reinforcement learning agents.
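The separation of algorithm and application described above can be illustrated with a minimal sketch in plain Python. It does not use the Tensorforce library; `TabularQAgent`, `Corridor`, and `train` are hypothetical names, and a tabular Q-learning agent stands in for a deep RL algorithm. The agent only sees state and action indices through an act/observe loop, so any environment with the same `reset`/`step` shape could be swapped in.

```python
import random


class TabularQAgent:
    """Hypothetical agent: knows only the number of states and actions."""

    def __init__(self, num_states, num_actions, lr=0.5, discount=0.9,
                 epsilon=0.2, seed=0):
        self.q = [[0.0] * num_actions for _ in range(num_states)]
        self.lr, self.discount, self.epsilon = lr, discount, epsilon
        self.rng = random.Random(seed)
        self._last = None  # (state, action) still waiting for feedback

    def act(self, state, explore=True):
        row = self.q[state]
        if explore and self.rng.random() < self.epsilon:
            action = self.rng.randrange(len(row))
        else:
            best = max(row)
            # Break ties randomly so an untrained agent still explores.
            action = self.rng.choice([a for a, v in enumerate(row) if v == best])
        self._last = (state, action)
        return action

    def observe(self, reward, terminal, next_state):
        state, action = self._last
        future = 0.0 if terminal else max(self.q[next_state])
        target = reward + self.discount * future
        self.q[state][action] += self.lr * (target - self.q[state][action])


class Corridor:
    """Hypothetical application: walk right along a corridor to reach the goal."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        move = 1 if action == 1 else -1  # action 1 = right, anything else = left
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        terminal = self.pos == self.length - 1
        return self.pos, (1.0 if terminal else 0.0), terminal


def train(agent, env, episodes=200, max_steps=100):
    """Generic loop: it never inspects what the states or actions mean."""
    for _ in range(episodes):
        state = env.reset()
        for _ in range(max_steps):
            action = agent.act(state)
            next_state, reward, terminal = env.step(action)
            agent.observe(reward, terminal, next_state)
            state = next_state
            if terminal:
                break
    return agent


agent = train(TabularQAgent(num_states=5, num_actions=2), Corridor(length=5))
greedy = [agent.act(s, explore=False) for s in range(4)]
print(greedy)  # greedy action for each non-terminal state
```

Because the training loop depends only on the `act`/`observe` and `reset`/`step` calls, replacing `Corridor` with another environment requires no change to the agent.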
SeeAct
SeeAct is a system designed for generalist web agents, allowing them to autonomously execute tasks across various websites. It primarily utilizes large multimodal models (LMMs) such as GPT-4V(ision) to power its capabilities. The system features a robust code execution environment and a sophisticated grounding mechanism, ensuring effective and reliable interactions with web interfaces. SeeAct is particularly well-suited for researchers and developers who are focused on advancing the field of web automation and creating intelligent agents that can navigate and operate within complex online environments. Its focus on LMMs provides a cutting-edge approach to web agent development.
IntellibizzAI
IntellibizzAI specializes in building intelligent identity systems for modern brands, leveraging AI precision with a human touch. The platform offers a comprehensive suite of services including brand positioning, premium website design, content intelligence, and visibility architecture. It caters to founders, creators, and boutique brands looking to enhance their online presence, convert visitors, and grow on social media. Key offerings range from identity system clarity and narrative development to high ROI visibility systems and AI visual narratives, all designed to deliver tangible results and sustained growth.
Game-Bot
Game-Bot is an open-source project designed to teach artificial intelligence how to play video games by observing human interaction. The system works by recording a user's keyboard and mouse movements during gameplay, creating a dataset that is then used to train a deep learning model. Once trained, the AI can replicate the human player's actions and play the game autonomously. This tool provides a foundational framework for AI-driven game automation and research, leveraging deep learning techniques with neural networks. It is tested with Python 3.6.0 and requires specific module installations, making it suitable for developers and researchers interested in AI and gaming.
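The record-then-imitate workflow described above can be sketched in a few lines of plain Python. This is an illustration only, not Game-Bot's code: `DemoRecorder` and `NearestNeighbourPolicy` are hypothetical names, frames are reduced to short feature vectors, and a 1-nearest-neighbour lookup stands in for the deep learning model.

```python
from dataclasses import dataclass, field


@dataclass
class DemoRecorder:
    """Collects (frame features, pressed keys) pairs while a human plays."""
    samples: list = field(default_factory=list)

    def record(self, frame, keys):
        self.samples.append((tuple(frame), frozenset(keys)))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestNeighbourPolicy:
    """Stand-in for the trained model: replay the keys of the closest frame."""

    def __init__(self, samples):
        if not samples:
            raise ValueError("need at least one recorded sample")
        self.samples = list(samples)

    def predict(self, frame):
        frame = tuple(frame)
        _, keys = min(self.samples, key=lambda s: squared_distance(s[0], frame))
        return keys


# Hypothetical two-number "frames" paired with the keys the player pressed.
recorder = DemoRecorder()
recorder.record([0.9, 0.1], {"left"})
recorder.record([0.1, 0.9], {"right"})
recorder.record([0.0, 0.0], {"up"})

policy = NearestNeighbourPolicy(recorder.samples)
predicted = policy.predict([0.8, 0.2])
print(sorted(predicted))
```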
Ensemble-Pytorch
Ensemble-Pytorch is an open-source, unified ensemble framework designed for PyTorch to enhance the performance and robustness of deep learning models. It allows users to easily integrate various ensemble strategies, such as Voting, Bagging, Gradient Boosting, and Snapshot Ensemble, into their existing PyTorch workflows. The framework supports both classification and regression problems and provides a straightforward API for defining, optimizing, training, and evaluating ensembles. It is part of the PyTorch ecosystem, ensuring good maintenance and compatibility. With Ensemble-Pytorch, developers can leverage advanced ensemble techniques to achieve more reliable and accurate AI models.
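As a rough illustration of the Voting strategy mentioned above, the sketch below averages the class probabilities produced by several base models and picks the most likely class (soft voting). It is plain Python rather than the library's API; `soft_vote`, `predict_class`, and the sample probabilities are assumptions made for the example.

```python
def soft_vote(probabilities):
    """Average per-class probabilities across the base estimators."""
    if not probabilities:
        raise ValueError("need at least one estimator output")
    n_classes = len(probabilities[0])
    if any(len(p) != n_classes for p in probabilities):
        raise ValueError("all estimators must predict the same classes")
    n = len(probabilities)
    return [sum(p[c] for p in probabilities) / n for c in range(n_classes)]


def predict_class(probabilities):
    """Return the index of the class with the highest averaged probability."""
    averaged = soft_vote(probabilities)
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical outputs of three base models for one input, two classes.
member_outputs = [
    [0.60, 0.40],
    [0.20, 0.80],
    [0.45, 0.55],
]
averaged = soft_vote(member_outputs)
label = predict_class(member_outputs)
print(averaged, label)
```

Here the first model alone would pick class 0, but the averaged vote picks class 1, which is the kind of disagreement an ensemble is meant to smooth out.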
LMDrive
LMDrive is an open-source, closed-loop, end-to-end autonomous driving framework that leverages large language models (LLMs). It is designed to interact with dynamic environments by processing multi-modal and multi-view sensor data, alongside natural language instructions. This framework facilitates the development and research of advanced autonomous driving systems. Key features include vision encoder pre-training to generate visual tokens from sensor inputs and an instruction finetuning stage to align language instructions with control signals. The project provides a comprehensive dataset collected in the CARLA simulator, including sensor data, navigation instructions, and human notice instructions, making it a robust platform for researchers and developers in the autonomous driving domain.
mistral-inference
mistral-inference is an official open-source inference library developed by Mistral AI, designed to provide minimal code for running their large language models. This library supports a wide range of Mistral models, including Mistral 7B, Mixtral 8x7B, Mixtral 8x22B, Codestral 22B, Codestral Mamba 7B, and Mathstral 7B, as well as newer models like Mistral Large 2 and Mistral Small 3.1. It facilitates local installation via PyPI or direct cloning from GitHub, with model weights available for direct download or from the Hugging Face Hub. Users can interact with these models through a command-line interface for demos and interactive chat, supporting both single and multi-GPU setups. The library also provides Python APIs for instruction following, multimodal instruction following, and function calling, making it a versatile tool for developers working with Mistral's AI models.
Self-Driving-Car-in-Video-Games
Self-Driving-Car-in-Video-Games is an open-source project featuring a supervised deep neural network designed to learn autonomous driving within video games, specifically Grand Theft Auto V. The model, named T.E.D.D. 1104, is trained using extensive human-labeled data, recording gameplay and key inputs to teach it how to navigate various vehicles under different weather conditions. It approaches the task as a classification problem, taking a sequence of five images as input and predicting the correct keyboard or Xbox controller inputs. The project provides pretrained models of varying sizes (XXL, M, S) and includes all necessary files for data generation, training, and real-time inference, primarily supporting Windows 10/11 for gameplay interaction.
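Framing driving as classification over five-frame sequences can be sketched as a data-preparation step. The code below is an assumption-laden illustration, not the project's code: the nine-class key set, `keys_to_class`, and `make_examples` are hypothetical, and frames are represented by placeholder strings.

```python
from collections import deque

# Hypothetical label set: each supported key combination is one class.
KEY_CLASSES = [
    frozenset(), frozenset("W"), frozenset("A"), frozenset("S"), frozenset("D"),
    frozenset("WA"), frozenset("WD"), frozenset("SA"), frozenset("SD"),
]
CLASS_INDEX = {keys: i for i, keys in enumerate(KEY_CLASSES)}
SEQUENCE_LENGTH = 5


def keys_to_class(pressed):
    """Map a set of pressed keys to its class index (KeyError if unsupported)."""
    return CLASS_INDEX[frozenset(pressed)]


def make_examples(frames, key_states, seq_len=SEQUENCE_LENGTH):
    """Pair each window of seq_len frames with the keys held at its last frame."""
    window = deque(maxlen=seq_len)
    examples = []
    for frame, keys in zip(frames, key_states):
        window.append(frame)
        if len(window) == seq_len:
            examples.append((tuple(window), keys_to_class(keys)))
    return examples


frames = [f"frame{i}" for i in range(7)]
key_states = [{"W"}, {"W"}, {"W", "A"}, {"W", "A"}, {"W"}, {"W", "D"}, set()]
examples = make_examples(frames, key_states)
for window, label in examples:
    print(window[0], "to", window[-1], "-> class", label)
```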
TPVFormer
TPVFormer is an academic project offering a Tri-Perspective View (TPV) representation for vision-based 3D semantic occupancy prediction, serving as an alternative to Tesla's Occupancy Network for autonomous driving research. It addresses the limitations of traditional bird's-eye-view (BEV) representations by incorporating two additional perpendicular planes, allowing for a more fine-grained description of 3D scenes. The tool features a transformer-based TPV encoder (TPVFormer) to effectively obtain TPV features by aggregating image features. It demonstrates that camera inputs alone can achieve performance comparable to LiDAR-based methods on LiDAR segmentation tasks. The project also includes resources for semantic scene completion and comparisons with Tesla's Occupancy Network.
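The idea of describing a 3D point with three perpendicular planes can be sketched as a lookup. This minimal example assumes that a point's feature is the sum of the features at its projections onto the top, side, and front planes, and it uses nearest-cell indexing in place of any interpolation; `make_plane` and `tpv_point_feature` are hypothetical helpers, not the project's code.

```python
def make_plane(rows, cols, fn):
    """Build a rows x cols grid whose cells hold small feature vectors."""
    return [[fn(r, c) for c in range(cols)] for r in range(rows)]


def tpv_point_feature(planes, x, y, z):
    """Sum the features found at the point's projection on each plane."""
    top, side, front = planes  # indexed as top[x][y], side[y][z], front[z][x]
    return [a + b + c for a, b, c in zip(top[x][y], side[y][z], front[z][x])]


# Hypothetical scene of 4 x 3 x 2 cells with 2-channel features per plane cell.
X, Y, Z = 4, 3, 2
top = make_plane(X, Y, lambda x, y: [float(x), float(y)])
side = make_plane(Y, Z, lambda y, z: [10.0 * y, 10.0 * z])
front = make_plane(Z, X, lambda z, x: [100.0 * z, 100.0 * x])

feature = tpv_point_feature((top, side, front), x=2, y=1, z=1)
print(feature)
```

A plain BEV representation would keep only the `top` plane, so two points that differ only in height would receive the same feature; the two extra planes are what let the lookup distinguish them.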
Tetris-deep-Q-learning-pytorch
Tetris-deep-Q-learning-pytorch is an open-source Python project that demonstrates the application of Deep Q-learning for training an AI agent to play the classic game Tetris. Developed with PyTorch, this tool serves as a foundational example of reinforcement learning in action. Users can leverage the provided source code to train their own Tetris-playing models from scratch or test pre-trained models. The project includes all necessary scripts for training and testing, making it accessible for those interested in understanding and experimenting with AI agents and deep learning techniques in a practical gaming context. It's an excellent resource for students and developers exploring the basics of reinforcement learning.
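Two building blocks commonly used in Deep Q-learning, a replay memory and the temporal-difference target, can be sketched without PyTorch. This is a generic illustration rather than the project's code: `ReplayMemory`, `td_targets`, and `fake_q_values` (a stand-in for the neural network) are hypothetical names, and states are plain integers.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")


class ReplayMemory:
    """Fixed-size buffer of past transitions; old entries are evicted first."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, *args):
        self.buffer.append(Transition(*args))

    def sample(self, batch_size):
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, q_values, gamma=0.99):
    """Compute reward + gamma * max_a Q(next_state, a), with no bootstrap at episode end."""
    targets = []
    for t in batch:
        bootstrap = 0.0 if t.done else gamma * max(q_values(t.next_state))
        targets.append(t.reward + bootstrap)
    return targets


def fake_q_values(state):
    # Stand-in for the Q-network: one value per action (two actions here).
    return [0.1 * state, 0.2 * state]


memory = ReplayMemory(capacity=3)
memory.push(0, 1, 1.0, 1, False)
memory.push(1, 0, 0.0, 2, False)
memory.push(2, 1, -1.0, 3, True)
memory.push(3, 0, 2.0, 4, False)  # capacity is 3, so the oldest transition is dropped

targets = td_targets(list(memory.buffer), fake_q_values, gamma=0.5)
print(targets)
```

In a full training loop, the network would be fitted so that its prediction for each sampled `(state, action)` moves toward the corresponding target.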
Interloom Technologies
Interloom Technologies provides an AI-driven platform designed to supercharge back office operations by leveraging AI that learns how businesses actually run. Unlike most automation platforms built for developers, Interloom empowers subject matter experts to design workflows, set governance, and build automations without needing to write code. The platform utilizes a 'context graph' that accumulates knowledge from every case resolved, enriching a living record of decisions and outcomes. AI agents then act on this context to handle tasks like extraction, triage, and follow-ups, ensuring actions are grounded in business-specific knowledge rather than generic data. Interloom integrates with existing tools like SharePoint, Salesforce, SAP, Microsoft Teams, Confluence, Google Workspace, ServiceNow, Jira Service Management, and Slack, reading, writing, and syncing data across the tech stack.
Genie-TTS
Genie-TTS is an open-source, lightweight inference engine and model converter specifically designed for GPT-SoVITS ONNX models. It excels in providing near-instantaneous speech synthesis on CPUs, making it highly efficient for various applications. The tool integrates essential functionalities such as TTS inference, ONNX model conversion, and an API server, all aimed at delivering ultimate performance and convenience. It supports GPT-SoVITS V2 and V2ProPlus models, with planned support for V3 and V4, and handles Japanese, English, Chinese, and Korean languages. Genie-TTS also offers significant performance advantages over official PyTorch models, particularly in first inference latency and runtime size, making it an ideal solution for developers and content creators seeking high-performance, CPU-based speech synthesis.
ResnetGPT
ResnetGPT is an open-source project that combines ResNet-101 and GPT to create an AI capable of playing the mobile game Honor of Kings. Developed with the PyTorch framework, it uses a pre-trained ResNet-101 model and a Transformer-based decoder to produce game actions. The project provides code for training the AI with gameplay data, including scripts for data capture and preprocessing. Although the project is no longer actively updated, it serves as a foundational example for developing AI agents for complex game environments. Running it requires a dedicated NVIDIA graphics card and an Android device.
Poker
Poker is a fully functional poker bot designed to automate gameplay on popular platforms like PartyPoker, PokerStars, and GGPoker. It scrapes table information using image recognition based on either OpenCV or neural networks. Decisions are then made using a combination of genetic algorithms and Monte Carlo simulations for poker equity calculation. The bot can operate for extended periods, moving the mouse automatically based on a large number of adjustable parameters. Users can download binaries for direct execution and can run the bot within a virtual machine to prevent interference with their main computer. It also features a strategy analyzer and editor, allowing for customization and optimization of playing strategies.
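Monte Carlo equity calculation can be sketched as follows: deal the unknown cards at random many times and count how often the hero's hand wins. This is a minimal illustration for Texas Hold'em before the flop, not the bot's implementation; `score_five`, `best_hand`, and `estimate_equity` are hypothetical names, and ties count as half a win.

```python
import random
from collections import Counter
from itertools import combinations

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]


def score_five(cards):
    """Return a tuple that compares higher for stronger 5-card hands."""
    ranks = sorted((RANKS.index(c[0]) for c in cards), reverse=True)
    counts = Counter(ranks)
    # Ranks ordered by multiplicity, then by rank (e.g. trips before kickers).
    ordered = sorted(counts, key=lambda r: (counts[r], r), reverse=True)
    shape = sorted(counts.values(), reverse=True)
    flush = len({c[1] for c in cards}) == 1
    unique = sorted(set(ranks), reverse=True)
    straight_high = None
    if len(unique) == 5:
        if unique[0] - unique[4] == 4:
            straight_high = unique[0]
        elif unique == [12, 3, 2, 1, 0]:  # wheel: A-2-3-4-5 plays as 5-high
            straight_high = 3
    if straight_high is not None and flush:
        return (8, [straight_high])
    if shape == [4, 1]:
        return (7, ordered)
    if shape == [3, 2]:
        return (6, ordered)
    if flush:
        return (5, ranks)
    if straight_high is not None:
        return (4, [straight_high])
    if shape == [3, 1, 1]:
        return (3, ordered)
    if shape == [2, 2, 1]:
        return (2, ordered)
    if shape == [2, 1, 1, 1]:
        return (1, ordered)
    return (0, ranks)


def best_hand(seven_cards):
    """Best 5-card score among the 21 ways to choose 5 of 7 cards."""
    return max(score_five(combo) for combo in combinations(seven_cards, 5))


def estimate_equity(hero, villain=None, trials=2000, seed=0):
    """Estimate hero's equity; an unknown villain hand is dealt at random."""
    rng = random.Random(seed)
    known = set(hero) | set(villain or [])
    remaining = [c for c in DECK if c not in known]
    score = 0.0
    for _ in range(trials):
        drawn = rng.sample(remaining, 5 if villain else 7)
        board = drawn[:5]
        opponent = list(villain) if villain else drawn[5:]
        h, v = best_hand(list(hero) + board), best_hand(opponent + board)
        if h > v:
            score += 1.0
        elif h == v:
            score += 0.5
    return score / trials


aces_vs_kings = estimate_equity(["As", "Ah"], ["Ks", "Kh"])
print(round(aces_vs_kings, 3))
```

More trials narrow the estimate at the cost of run time, which is the usual trade-off when a bot has to decide within a time limit.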
opencontrol
OpenControl enables users to manage their infrastructure using AI, offering a self-hosted solution that integrates directly with internal resources and codebase. It generates a single HTTP endpoint, acting as a unified gateway that can be chatted with or registered with any AI client, exposing all your connected tools. The platform is universal, supporting tool calling with models from Anthropic, OpenAI, or Google, and ensures security through authentication via any OAuth provider. It can be deployed to AWS Lambda, Cloudflare Workers, or containers, and provides examples for integrating with AWS, Stripe, and SQL databases, making it a flexible solution for developers looking to automate infrastructure management.
notion_widgets
notion_widgets is an open-source collection of HTML widgets designed to enhance Notion.so pages. Users can embed these widgets to add various interactive and functional elements to their Notion workspaces, customizing their experience beyond Notion's native capabilities. The project includes a diverse range of widgets such as calendars, countdown timers, currency converters, weather displays, and more, providing practical tools for organization and productivity. By offering these HTML-based solutions, notion_widgets empowers users to create more dynamic and personalized Notion environments, making it a valuable resource for those looking to extend their Notion functionality.
Pony.ai
Pony.ai is a leading global autonomous driving technology company founded in 2016, focused on bringing safe, sustainable, and accessible autonomous mobility to the world. The company develops a full-stack autonomous driving technology, leveraging its core "virtual driver" system. Pony.ai has accumulated millions of kilometers in autonomous road testing in complex scenarios, including challenging weather and road conditions, and has secured licenses to test and operate autonomous vehicles globally. Its business units include Robotaxi for everyday travel, Robotruck for commercial logistics, and solutions for Personally Owned Vehicles (POV), aiming to deliver superb autonomous driving solutions across various industries and markets.
AiDASH
AiDASH is an enterprise SaaS company offering satellite-first AI applications for the remote inspection and monitoring of critical infrastructure. The platform provides solutions like the Intelligent Vegetation Management System (IVMS™) to prevent utility-related wildfires, the Wildfire Mitigation Planning Services (WMPS), and the Climate Risk Intelligence System (CRIS™) for wildfire and storm resilience. It also includes an Asset Inspection and Monitoring System (AIMS) and a Biodiversity Net Gain Management System (BNG AI™). AiDASH helps industries such as electric utilities, gas utilities, water and wastewater, energy, mining, and transportation to improve reliability, reduce costs, and ensure compliance by providing accessible, actionable, and compliant data from space.
AiAlly
AiAlly offers AI employees designed to revolutionize business operations by boosting productivity and streamlining workflows. These self-learning AI agents continuously adapt to a company's unique needs and integrate seamlessly with existing tools. Users can customize AI personalities to fit their company culture, fostering natural and engaging interactions. AiAlly's AI employees autonomously tackle complex tasks, make advanced decisions, and collaborate effectively with both human and other AI team members. The platform emphasizes enterprise-grade security with end-to-end encryption and compliance with global data protection regulations, ensuring data privacy and integrity.