🤖

AI Agents & Automation

Browsing page 87 of AI tools for General-Purpose Agents in AI Agents & Automation. Sorted by confidence score — our independent quality rating.

All AI Frameworks & Infra Browser & Web Agents Chatbots & Conversational AI General-Purpose Agents Multi-Agent Systems Personal Assistants RAG & Document AI RPA Scheduling & Task Agents Voice Agents Workflow Agents

Adept

58%

Adept is an enterprise AI tool designed to significantly enhance workforce productivity by automating manual and repetitive workflows across an organization's existing software stack. Leveraging proprietary agent training data, multimodal models, and custom actuation software, Adept's agentic AI capabilities translate user intents directly into actions. Key features include accurately locating items on web pages or applications (Adept Locate), reasoning and answering questions about various documents (Adept Web VQA), and planning and executing complex end-to-end enterprise workflows. It is built to be accurate, reliable, and future-proof, requiring minimal maintenance and allowing for quick setup of new workflows using natural language instructions.

LearningHumanoidWalking

58%

LearningHumanoidWalking is an open-source project dedicated to advancing humanoid robot locomotion through deep reinforcement learning. The repository provides comprehensive code implementations for various research papers, focusing on robust walking capabilities on challenging terrains and incorporating current feedback for bipedal control. It supports different humanoid robot environments, including JVRC and Unitree H1, and offers task definitions, reinforcement learning components, and robot abstractions. Developers can easily add new robot models and configure environment behaviors via YAML files. The project includes examples for basic standing, walking, stepping, and even a Cartpole swing-up task for testing the RL pipeline, making it a valuable resource for researchers and developers in robotics and AI.

Memory-Cache

58%

Memory-Cache is an experimental open-source project designed to transform a local desktop environment into an on-device AI agent. It functions by allowing users to save webpages as PDFs directly from Firefox. These saved PDFs are then synchronized to a specific folder, which can be integrated with privateGPT to augment a local language model. This setup enables users to leverage their browsing history and saved content to enhance the capabilities of their local AI agent. The project requires setting up privateGPT, creating symlinks for content synchronization, and applying a patch to Firefox for silent PDF saving. It provides a unique way to build a personalized knowledge base for an AI agent from everyday web browsing.

Medical-SAM2

58%

Medical-SAM2, or MedSAM-2, is an advanced segmentation model built upon the Segment Anything Model 2 (SAM 2) framework. This tool is specifically designed to address both 2D and 3D medical image segmentation tasks, including the analysis of medical images as video. It provides a robust solution for precise image segmentation, which is crucial for AI-driven diagnostics and medical research. The project offers pre-trained weights and detailed instructions for setting up the environment and running example cases, such as REFUGE Optic-cup Segmentation from Fundus Images and Abdominal Multiple Organs Segmentation. Its capabilities are elaborated in the paper "Medical SAM 2: Segment Medical Images As Video Via Segment Anything Model 2."

maml_rl

58%

maml_rl is a code repository designed for researchers and developers working with reinforcement learning. It implements the experiments described in the paper "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks" (Finn et al., ICML 2017). The tool specifically supports few-shot reinforcement learning, enabling deep networks to adapt quickly to new tasks. Built upon the rllab framework, maml_rl requires TensorFlow v1.0+ and is compatible with OpenAI Gym. While powerful, the current implementation is noted for being slow, and contributions to improve parallelization and speed are welcomed by the developers. It's an essential resource for those exploring meta-learning in the context of deep reinforcement learning.

mAP

58%

mAP (mean Average Precision) is an open-source Python code designed to evaluate the performance of neural networks in object recognition tasks. It calculates the mAP value, a crucial metric in computer vision, by first determining the Average Precision (AP) for each class present in the ground-truth data. The tool then computes the mean of all APs to provide an overall performance score, ranging from 0 to 100%. This evaluation method is based on the PASCAL VOC 2012 competition criteria, ensuring a standardized and robust assessment of object detection models. Users can easily integrate their ground-truth and detection-results files to run the evaluation, with optional features for plotting results and animation.

MovieChat

58%

MovieChat is an open-source AI tool designed for long video understanding, capable of processing videos with more than 10,000 frames while maintaining low GPU memory usage. Published at CVPR 2024, it introduces a novel approach from dense token to sparse memory, offering a significant advantage over other methods in terms of memory efficiency. The tool provides capabilities for video question answering, benchmark evaluation, and supports various video analysis tasks. It is available on GitHub and can be easily installed via pip, making it accessible for researchers and developers working with extensive video datasets. MovieChat also includes a MovieChat-1K benchmark for evaluating long video understanding models.

navsim

58%

navsim is an open-source platform designed for autonomous driving simulation and benchmarking. It introduces Pseudo-Simulation, a novel evaluation methodology that merges the efficiency of open-loop evaluation with the robustness of closed-loop evaluation. By augmenting real data with synthetic observations, navsim achieves strong correlation with traditional closed-loop simulations while significantly reducing computational resources. This tool is ideal for researchers and developers in the autonomous driving field, providing a faster and more scalable approach to validate AV algorithms and behaviors. It supports data-driven, non-reactive autonomous vehicle simulation and benchmarking, making it a valuable resource for large-scale, rapid validation.

OccWorld

58%

OccWorld is an open-source 3D world model specifically designed for autonomous driving applications, presented at ECCV 2024. This tool allows for the joint modeling of 3D scene evolutions and ego movements, crucial for developing advanced autonomous systems. It can forecast the movements of surrounding agents and future map elements like drivable areas, demonstrating an understanding of the scene beyond mere memorization. OccWorld integrates with various 3D occupancy models such as SelfOcc, TPVFormer, and SurroundOcc, offering a scalable solution for large-scale training and paving the way for interpretable end-to-end large driving models. The project provides code for visualization, training logs, and a pretrained model, making it a valuable resource for researchers and developers in the autonomous driving domain.

odas

58%

ODAS, which stands for Open embeddeD Audition System, is a robust open-source library designed for advanced audio processing tasks. It specializes in sound source localization, tracking, separation, and post-filtering. Developed entirely in C, ODAS prioritizes portability and is highly optimized to run efficiently on low-cost embedded hardware, making it suitable for a wide range of embedded systems and robotics projects. The project provides comprehensive documentation on its wiki for building and running the software, and also links to related projects like `odas_ros` for ROS integration and `odas_web` for a graphical user interface for data visualization. Additionally, IntRoLab offers open-source hardware, such as the 8SoundsUSB and 16SoundsUSB configurable microphone arrays, to complement the system.

openDAW

58%

openDAW is a next-generation web-based Digital Audio Workstation (DAW) committed to making music production accessible to everyone. It emphasizes education and data privacy, operating under an open-source AGPL v3 license. The platform boasts a 'Built on Trust and Transparency' philosophy, featuring no sign-ups, tracking, cookie banners, user profiling, terms & conditions, ads, paywalls, or data mining. Key features include a variety of built-in devices like Vaporisateur (subtractive synth), Playfield (sample drum computer), and Dattorro Reverb, alongside MIDI effects. The project actively seeks contributors for areas such as offline app development (e.g., with Tauri), PWA implementation, and timeline track management. It also offers a commercial license option for those wishing to integrate openDAW into closed-source projects.

OpenCat-Old

58%

OpenCat-Old is an open-source project providing a programmable and highly maneuverable robotic cat platform. It is designed for STEM education and AI-enhanced services, targeting skilled makers interested in quadruped robots. The platform facilitates collaboration among talents to develop this cute robot. While this specific repository is noted as obsolete and redundant with large image files, it served as the foundation for the OpenCat project, aiming to make complex robotic systems accessible through mass production and cost reduction. Users can find resources and updates on the official Petoi website and social media channels.

PRMLT

58%

PRMLT is a Matlab package designed to implement machine learning algorithms as detailed in C. Bishop's influential textbook, "Pattern Recognition and Machine Learning" (PRML). This open-source package is entirely written in Matlab, ensuring it is self-contained with no external dependencies beyond Matlab itself. It requires Matlab R2016b or a later version, specifically utilizing features like implicit expansion (broadcasting) and requiring the Statistics and Image Processing Toolboxes for certain functionalities. The design prioritizes succinctness, efficiency through vectorization and matrix factorization, numerical robustness, and readability with heavy commenting and formula annotations. It aims to be practical for ML research, with many functions already widely used.

Chat with Tess

58%

Chat with Tess provides an interactive platform for engaging with advanced AI assistants, specifically showcasing the capabilities of Tess-R1 models. These models are designed to produce Chain-of-Thought (CoT) reasoning, enabling them to process complex queries and deliver detailed, structured responses. Users can customize various settings, including the AI model itself and the system message, to tailor their conversational experience. The platform highlights models such as migtissera/Tess-R1-Limerick-Llama-3.1-70B and migtissera/Tess-v2.5.2-Qwen2-72B, offering a hands-on opportunity to explore the Tess-R1 series' advanced reasoning abilities. This tool is ideal for those interested in experimenting with and understanding the nuances of sophisticated AI conversational agents.

Call Annie

58%

Call Annie is an innovative language learning application that leverages AI avatar tutors to provide an immersive and effective learning experience. Users can engage in video calls with these AI tutors to practice and improve their proficiency in English, Spanish, French, German, Japanese, Mandarin, and Korean. The platform focuses on enhancing conversation skills, refining pronunciation with live checks, and expanding vocabulary through interactive play. It also offers personalized learning plans tailored to individual interests and progress. Backed by scientific studies, Call Annie aims to build speaking confidence and accelerate language acquisition for over a million learners worldwide.

CameraTraps

58%

CameraTraps, also known as PyTorch Wildlife, is a collaborative deep learning framework built on PyTorch, specifically designed for conservation efforts. It offers a growing model zoo with pre-trained models like MegaDetector, DeepFaune, and HerdNet for animal detection and classification. The platform is expanding into bioacoustics and overhead animal localization. Recently, the project introduced Sparrow Studio, a unified graphical interface for data management, processing, AI inference, analysis, and annotation, making the powerful AI tools accessible to non-coders. The underlying PW-Engine, written in Rust, provides a model-agnostic inference core with various consumption surfaces, including HTTP REST API, CLI, Python bindings, and a native C library, enabling flexible integration and future MLOps functionality.

rex-gym

58%

rex-gym is an open-source OpenAI Gym environment designed for training quadruped robots, specifically focusing on models like SpotMicro. This tool enables developers and researchers to train robots within simulated environments, aiming to transfer the learned knowledge to physical robots without requiring extensive manual tuning. The project is inspired by the advanced robotics work of Boston Dynamics, providing a platform for experimenting with advanced locomotion and control algorithms. It's ideal for those looking to develop and test AI agents for robotic control in a flexible and accessible simulation setting, fostering innovation in robotics and AI.

CamerAwesome

58%

CamerAwesome is a Flutter plugin designed to simplify the integration of camera functionalities into both Android and iOS applications. It provides a comprehensive and customizable camera experience, allowing developers to embed advanced camera features without extensive native development. Key capabilities include video recording, multi-camera support (beta), live photo filters, exposure level control, and image analysis for tasks like barcode scanning and facial recognition. The plugin offers both a built-in, awesome interface and a custom builder for complete UI control. It supports various camera states and events, enabling dynamic management of the camera flow and media capture events. CamerAwesome is also available as a template in ApparenceKit, making it easier for developers to get started with robust camera features in their Flutter projects.

river

58%

River is an open-source Python library specifically designed for online machine learning, enabling users to process and learn from data streams incrementally. It was formed from the merger of the `creme` and `scikit-multiflow` projects, combining their strengths to offer a comprehensive toolkit for real-time analytics. The library provides a user-friendly interface for implementing various online learning algorithms, making it suitable for applications where data arrives continuously and models need to adapt over time. Key capabilities include handling concept drift, performing incremental model updates, and supporting a wide range of machine learning tasks such as classification, regression, and clustering in a streaming context. River is ideal for developers and data scientists working with dynamic datasets.

bareiron

58%

bareiron is a minimalist Minecraft server specifically engineered for memory-restrictive embedded systems, such as the ESP32. The project's core priorities are memory usage, performance, and features, in that specific order. This focus means that while it provides a functional Minecraft server, compliance with vanilla Minecraft is not guaranteed, nor is it a primary goal. Users can host Minecraft servers on devices that would typically be unable to handle the resource demands of a standard server. The tool supports Minecraft version 1.21.8 and protocol version 772, though it officially supports only the vanilla client. Configuration requires compiling from source, offering options to optimize for performance and memory, including disabling features like chests or fluid flow to prevent instability on resource-constrained hardware. It also supports non-volatile storage for world data persistence on ESP variants.

ScaleCUA

58%

ScaleCUA is an open-source project that introduces a new approach to scaling computer use agents (CUAs) with cross-platform data. It offers a large-scale dataset spanning six operating systems (Windows, macOS, Ubuntu, Android) and three GUI-centric task domains, collected via a closed-loop pipeline combining automated agents and human experts. The project includes ScaleCUA-Models, a general-purpose agent capable of GUI-centric task completion across various environments, and a comprehensive SFT Codebase for training agents based on Qwen2.5-VL and InternVL. Additionally, ScaleCUA provides an interactive playground with realistic environments for Ubuntu, Android, and Web, and an online evaluation suite to benchmark agent capabilities.

Scene-Graph-Benchmark.pytorch

58%

Scene-Graph-Benchmark.pytorch is a comprehensive open-source codebase for Scene Graph Generation (SGG) methods, built on top of the well-known maskrcnn-benchmark project. It serves as a PyTorch implementation of the paper "Unbiased Scene Graph Generation from Biased Training CVPR 2020." The tool allows users to visualize and extract scene graphs from custom images and datasets, supporting various SGG protocols including Predicate Classification (PredCls), Scene Graph Classification (SGCls), and Scene Graph Detection (SGDet). It integrates predefined models like Neural-MOTIFS, IMP, VCTree, and Transformer, and offers options for customizing models. The project also introduces and clarifies SGG metrics, achieving state-of-the-art Recall@k on SGCls & SGGen. It's designed to be novice-friendly and easier to modify than previous frameworks.

cursor-deepseek

58%

cursor-deepseek is a high-performance HTTP/2-enabled proxy server designed to bridge the gap between Cursor IDE's Composer and various powerful language models like DeepSeek, OpenRouter, and Ollama. This proxy translates OpenAI-compatible API requests into the specific API formats required by DeepSeek, OpenRouter, or Ollama, allowing Cursor's Composer and other OpenAI API-compatible tools to seamlessly integrate and utilize these models. Key features include HTTP/2 support for improved performance, full CORS support, streaming responses, function calling/tools, automatic message format conversion, and compression support. It also offers API key validation for secure access and Docker container support for easy deployment.

SiamTrackers

58%

SiamTrackers is a comprehensive collection of PyTorch implementations for deep learning-based visual object tracking algorithms. It encompasses a wide range of models from 2020-2022, including SiamFC, SiamRPN, DaSiamRPN, UpdateNet, SiamDW, SiamRPN++, SiamMask, SiamFC++, SiamCAR, SiamBAN, Ocean, LightTrack, TrTr, and NanoTrack. A key highlight is NanoTrack, designed for lightweight and high-speed performance, suitable for deployment on embedded or mobile devices, capable of running at over 200FPS on Apple M1 CPU. The repository provides PyTorch code for training with lower GPU memory cost and includes Android and MacOS demos based on the ncnn inference framework. It also offers access to various datasets and toolkits for testing and training.

EXPLORE OTHER CATEGORIES

🎨 Content & Design 📊 Productivity & Business 💻 Coding & Development 📚 Research & Education 🧘 Wellness & Lifestyle 💼 Career Development 📈 Marketing & Growth 📉 Data & Analytics 💬 Customer Support & CX 💰 Finance 🛒 E-commerce