AI Agents & Automation
graphrag-local-ollama
GraphRAG Local Ollama is an open-source adaptation of Microsoft's GraphRAG, designed to leverage local models via Ollama for LLM and embedding extraction. This tool eliminates the dependency on costly OpenAI models, offering a cost-effective solution for knowledge graph implementations. It supports a variety of local models such as Llama3, Mistral, Gemma2, and Phi3, and integrates with Ollama for both language models and embedding models like nomic-embed-text. The setup process is straightforward, involving conda environment creation, Ollama installation, repository cloning, and specific `pip install` commands. Users can easily configure models and run indexing and querying operations, with options to visualize generated graphs using tools like Gephi or a provided Python script.
guidellm
GuideLLM is an open-source platform designed for evaluating and enhancing Large Language Model (LLM) deployments, focusing on real-world inference needs. It simulates end-to-end interactions with OpenAI-compatible and vLLM-native servers, generating workload patterns that reflect production usage. The platform produces detailed reports to help teams understand system behavior, resource needs, and operational limits. GuideLLM supports both real and synthetic multimodal datasets, including text, image, audio, and video inputs, and offers flexible execution profiles. It provides SLO-aware benchmarking, capturing complete latency and token-level statistics for metrics like TTFT (time to first token), ITL (inter-token latency), and end-to-end behavior, which supports consistent assessment of model performance, deployment tuning, and capacity planning.
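To make the latency metrics above concrete, here is a minimal sketch of how TTFT (time to first token), ITL (inter-token latency), and end-to-end latency can be derived from per-token arrival times for a single request. The function name `latency_metrics` and its inputs are hypothetical and chosen for illustration; this is not the tool's API or its reporting code.

```python
from statistics import mean


def latency_metrics(request_start, token_times):
    """Compute TTFT, mean ITL, and end-to-end latency (seconds) for one request.

    request_start: time at which the request was sent.
    token_times: non-empty, ascending list of arrival times, one per output token.
    (Hypothetical helper for illustration only.)
    """
    if not token_times:
        raise ValueError("at least one token timestamp is required")
    # Time to first token: delay between sending the request and the first token.
    ttft = token_times[0] - request_start
    # Inter-token latency: average gap between consecutive output tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0
    # End-to-end latency: delay between sending the request and the last token.
    e2e = token_times[-1] - request_start
    return {"ttft": ttft, "itl": itl, "e2e": e2e}


metrics = latency_metrics(10.0, [10.5, 10.75, 11.0, 11.25])
print(metrics)
```

In a benchmark these per-request values would typically be aggregated into percentiles across many requests and compared against the latency targets of interest.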
gym-pybullet-drones
gym-pybullet-drones offers PyBullet Gymnasium environments specifically designed for single and multi-agent reinforcement learning in quadcopter control. This tool is a minimalist refactoring of its original repository, ensuring compatibility with Gymnasium, Stable-Baselines3 2.0, and Betaflight/Crazyflie-firmware SITL. It provides examples for PID control, downwash effect simulation, and reinforcement learning using SB3's PPO algorithm. Researchers and developers can use this environment to train and test control policies for drones, facilitating advancements in robotics and autonomous systems. The project also includes examples for integrating with Betaflight SITL and pycffirmware Python bindings.
infiAgent
infiAgent, also known as MLA (Multi-Level Agent), is an open-source agent framework designed for long-running, complex tasks. It aims to avoid the tool-calling chaos and system crashes that accumulated task resources and conversation history can cause. It enables users to build powerful general-purpose and semi-specialized agents by simply editing configuration files. Key features include support for days-long complex tasks with full recovery from interruptions, compatibility with the Agent Skills open standard for dynamic skill loading, and a flexible architecture supporting both multi-level hierarchy and flat designs. The framework utilizes a file-directory-based memory system for persistent memory across sessions, eliminating the need for external databases. It also offers a Docker-based Web UI for multi-user registration and account management, and supports multi-provider model configurations for fine-grained cost control.
jvector
JVector is an advanced embedded vector search engine. Exact nearest neighbor search becomes impractical in high-dimensional spaces, a problem known as the “curse of dimensionality,” so JVector focuses on approximate nearest neighbor (ANN) search, which is more efficient for large datasets. JVector is a graph-based index that combines the hierarchical structure of HNSW with the Vamana algorithm (from DiskANN) within each layer. Its architecture supports multi-layer graphs with nonblocking concurrency, allowing linear scaling with the number of cores. It also features a two-pass search design that uses lossily compressed representations for the first pass (PQ, BQ, Fused PQ) and more accurate representations for the second (full-resolution float32, NVQ), reducing memory usage and latency while preserving accuracy. JVector also uniquely allows for building larger-than-memory indexes using two-pass searches.
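The two-pass idea described above can be sketched in a few lines. This is an illustration under simplifying assumptions, not JVector's API or implementation: it scans all vectors by brute force instead of traversing a graph, and it uses sign bits as a crude stand-in for the compressed representations. All function names are hypothetical.

```python
def squared_distance(a, b):
    """Exact squared Euclidean distance on full-resolution vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def binarize(vector):
    """Crude lossy compression: keep only the sign of each component."""
    return tuple(1 if x >= 0 else 0 for x in vector)


def hamming(a, b):
    """Cheap distance between two binary codes."""
    return sum(x != y for x, y in zip(a, b))


def two_pass_search(query, vectors, k=1, shortlist=3):
    """Return indices of the k best matches using a coarse pass, then a rerank."""
    codes = [binarize(v) for v in vectors]  # would be precomputed in a real index
    query_code = binarize(query)
    # Pass 1: shortlist candidates using only the compressed codes.
    by_code = sorted(range(len(vectors)), key=lambda i: hamming(query_code, codes[i]))
    candidates = by_code[:shortlist]
    # Pass 2: rerank the shortlist using the full-resolution vectors.
    return sorted(candidates, key=lambda i: squared_distance(query, vectors[i]))[:k]


vectors = [(0.9, 0.8), (0.1, 0.2), (-0.5, 0.7), (-0.9, -0.8), (0.4, -0.6)]
result = two_pass_search((0.2, 0.1), vectors, k=1, shortlist=2)
print(result)
```

The first pass only touches the small codes, so the full-resolution vectors need to be read for the shortlist alone, which is where the memory and latency savings come from.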
KAG
KAG is an open-source logical form-guided reasoning and retrieval framework built upon the OpenSPG engine and large language models (LLMs). It specializes in creating logical reasoning and factual Q&A solutions for professional domain knowledge bases, effectively addressing the limitations of traditional RAG vector similarity calculations and GraphRAG noise. KAG supports logical reasoning and multi-hop factual Q&A, offering superior performance compared to current state-of-the-art methods. Its core features include knowledge and chunk mutual indexing, conceptual semantic reasoning for knowledge alignment, schema-constrained knowledge construction, and logical form-guided hybrid reasoning and retrieval.
jido
Jido is an autonomous agent framework specifically designed for Elixir, facilitating the development of distributed, autonomous behavior and dynamic workflows. It allows users to define agents, connect them to actions, signals, and directives, and run them with built-in supervision and fault tolerance. The framework supports building agent systems as ordinary Elixir and OTP software, where agents hold state and implement commands, actions transform state, signals route events, and directives describe effects for the runtime. Jido is particularly useful for software that needs to inspect context, choose among multiple steps, coordinate with other agents, and maintain reliable operation over time. While AI integration is optional, companion packages like `jido_ai` provide model integration when needed, making it a flexible solution for complex multi-agent orchestrations.
ir-sim
ir-sim is an open-source, Python-based lightweight robot simulator specifically designed for navigation, control, and learning applications. It offers a simple and user-friendly framework that includes built-in collision detection, making it ideal for academic and educational use. The simulator allows for rapid prototyping of robotics and learning algorithms in custom scenarios with minimal coding and hardware requirements. Key features include the ability to simulate various robot platforms with diverse kinematics and sensors, quick scenario configuration using straightforward YAML files, and visualization of simulation outcomes with a naive visualizer for immediate debugging. It also supports multi-agent/robot learning projects.
IsaacLab
Isaac Lab is a GPU-accelerated, open-source framework designed to unify and simplify robotics research workflows, including reinforcement learning, imitation learning, and motion planning. Built on NVIDIA Isaac Sim, it combines fast and accurate physics and sensor simulation, making it an ideal choice for sim-to-real transfer in robotics. The framework provides developers with essential features for accurate sensor simulation, such as RTX-based cameras, LIDAR, and contact sensors. Its GPU acceleration enables faster complex simulations and computations, crucial for iterative processes like reinforcement learning. Isaac Lab supports over 16 robot models and more than 30 ready-to-train environments, compatible with popular reinforcement learning frameworks like RSL RL, SKRL, RL Games, and Stable Baselines. It can run locally or be distributed across the cloud, offering flexibility for large-scale deployments.
ix
ix is an autonomous GPT-4 agent platform designed for building and deploying AI-powered agents and workflows. It offers a flexible and scalable solution for delegating tasks to AI agents, enabling them to automate a wide variety of tasks, run in parallel, and communicate with each other. Key features include a no-code agent editor for creating and testing agents with a visual graph interface, a multi-agent chat interface for interacting with teams of agents, and smart input with auto-completion. The platform supports various models like OpenAI, Google PaLM, Anthropic, and Llama. Its backend is dockerized and uses a Celery message queue for horizontal scaling of agent workers, making it suitable for complex and demanding AI applications.
instill-core
Instill Core is a full-stack, open-source AI infrastructure tool designed for comprehensive data, model, and pipeline orchestration. It simplifies the complexities of building AI-first applications by offering ETL processing, AI-readiness, and capabilities for hosting open-source LLMs and RAG. The platform features a Pipeline builder for creating AI-first APIs and automated workflows, Components for connecting essential building blocks, and Artifact management to transform unstructured data into AI-ready formats. Instill Core also supports deploying and monitoring AI models without requiring extensive GPU infrastructure, making it accessible for various AI development needs. It provides client access via Console, CLI, and SDKs (Python, TypeScript).
LlamaGym
LlamaGym is an open-source framework designed to simplify the fine-tuning of Large Language Model (LLM) agents using online reinforcement learning. Unlike many current LLM-based agents that do not learn continuously in real-time, LlamaGym enables agents to interact with an environment and receive immediate reward signals for ongoing learning. It addresses common challenges such as managing LLM conversation context, handling episode batches, assigning rewards, and setting up Proximal Policy Optimization (PPO). By providing a single abstract Agent class, LlamaGym allows developers to quickly iterate and experiment with agent prompting and hyperparameters across various Gym environments, making the process of integrating LLMs with RL more accessible. While currently a work in progress, it aims to streamline the development of adaptive LLM agents.
LabelLLM
LabelLLM is an innovative, open-source platform dedicated to optimizing the data annotation process crucial for Large Language Model (LLM) development. It is engineered to be a powerful tool for independent developers and small to medium-sized research teams, significantly improving annotation efficiency. The platform provides comprehensive task management solutions, offering real-time monitoring of annotation progress and quality control to ensure data integrity. LabelLLM supports a wide range of data modalities, including audio, images, and video, allowing for complex annotation projects on a single unified platform. Its flexible framework includes customizable task-specific tools and AI-assisted annotation features like pre-annotation loading, which users can refine for enhanced accuracy and efficiency.
LLaVA
LLaVA (Large Language and Vision Assistant) is an open-source project focused on visual instruction tuning to develop large language and vision models with capabilities comparable to GPT-4. It offers improved baselines and supports community contributions, making it a robust platform for multimodal AI research and development. Recent releases include LLaVA-NeXT models with support for LLaMA-3 and Qwen-1.5, LLaVA-NeXT (Video) for zero-shot modality transfer, and LMMs-Eval for efficient evaluation of Large Multimodal Models. The project also provides LLaVA-Plus for multimodal agents and LLaVA-Interactive for human-AI multimodal interaction, including image chat, segmentation, generation, and editing. LLaVA supports LoRA finetuning for reduced GPU RAM and offers various model checkpoints through its Model Zoo.
machinelearning-samples
machinelearning-samples is a GitHub repository offering a comprehensive collection of samples for ML.NET, an open-source and cross-platform machine learning framework designed for .NET developers. The repository aims to make machine learning accessible by providing practical examples for various ML tasks, including binary classification, multi-class classification, recommendation, regression, anomaly detection, clustering, ranking, and computer vision. It features both getting started code-focused samples and end-to-end applications, such as web and desktop apps infused with ML.NET models. Additionally, it includes samples for automating ML.NET model generation through CLI and AutoML APIs, simplifying the process of creating high-quality models without extensive manual coding.
Long-Context
Long-Context is an open-source repository from Abacus.AI designed to provide code and tooling for Large Language Model (LLM) context expansion. It offers a comprehensive suite of evaluation scripts and benchmark tasks specifically tailored to assess a model’s information retrieval capabilities within expanded contexts. The repository details various experimental results, including different positional encoding schemes like linear scaling and fine-tuning approaches, and provides instructions for reproducing and building upon these findings. It also shares weights for best-performing models, such as the scale 16 model, which is expected to perform well up to 16k context lengths. The project includes novel evaluation datasets like an extended LMSys dataset and WikiQA (Free Form QA and Altered Numeric QA) to rigorously test models across varying context lengths and answer locations, addressing potential issues like models answering from pre-trained knowledge rather than provided context.
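As a rough illustration of linear scaling, the sketch below divides the position index by a scale factor before computing positional angles, so that positions in a longer context map back into the range seen during pretraining. It assumes rotary-style position embeddings, which the description above does not state; the function name, dimension, and base are hypothetical and are not taken from the repository.

```python
def rotary_angles(position, dim=8, base=10000.0, scale=1.0):
    """Rotary-style angles for one position, with the index divided by `scale`.

    Hypothetical helper for illustration; `dim` and `base` are assumed values.
    """
    scaled_position = position / scale  # linear scaling of the position index
    return [scaled_position / (base ** (2 * i / dim)) for i in range(dim // 2)]


# With scale 16, position 16000 produces the same angles as position 1000 unscaled.
plain = rotary_angles(1000)
stretched = rotary_angles(16000, scale=16.0)
print(plain == stretched)
```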
MiniGPT4-video
MiniGPT4-video offers official code for the Goldfish model, designed for understanding arbitrarily long videos, and MiniGPT4-video itself, tailored for short video understanding. This tool advances multimodal Large Language Models (LLMs) by integrating visual and textual tokens for comprehensive video analysis. Goldfish addresses challenges in long video processing through an efficient retrieval mechanism that identifies relevant video clips, making it suitable for applications like movies or TV series. MiniGPT4-video generates detailed descriptions for video clips, facilitating the retrieval process for Goldfish. The project also introduces the TVQA-long benchmark for evaluating long video comprehension and demonstrates significant performance improvements over existing state-of-the-art methods in both long and short video understanding.
ml-cvnets
ml-cvnets is a comprehensive computer vision toolkit developed by Apple, designed for researchers and engineers to efficiently train a wide array of computer vision models. It supports both standard and novel mobile- and non-mobile architectures for tasks such as object classification, object detection, semantic segmentation, and foundation models like CLIP. The library is built on Python 3.10+ and PyTorch, offering features like automatic data augmentation (RangeAugment, AutoAugment, RandAugment) and enhanced distillation support. It includes a model zoo with various CNNs (MobileNet, EfficientNet, ResNet) and Transformers (Vision Transformer, MobileViT, SwinTransformer), making it a versatile platform for advanced computer vision research and development.
nerve
Nerve is a powerful Agent Development Kit (ADK) designed for technical users to build, run, evaluate, and orchestrate LLM-based agents. It simplifies agent creation through a declarative YAML format, allowing definition of system prompts, tasks, tools, and variables in a single file. The kit supports various tools, including shell commands, Python functions, and remote tools, all fully typed and annotated for extensibility. A key differentiator is its native Model Context Protocol (MCP) support, enabling the definition of MCP servers in YAML and acting as both client and server for agent teams and deep orchestration. Nerve also includes an evaluation mode for benchmarking agents with reproducible tests and an LLM-agnostic architecture built on LiteLLM, supporting numerous model providers such as OpenAI, Anthropic, and Ollama.
ncnn
ncnn is a high-performance neural network inference computing framework specifically optimized for mobile platforms. Designed from the ground up with mobile deployment in mind, it boasts no third-party dependencies, ensuring cross-platform compatibility and superior speed on mobile CPU compared to other known open-source frameworks. Developers can leverage ncnn to easily port deep learning algorithms to mobile devices, facilitating the creation of intelligent applications and bringing AI capabilities to users' fingertips. It supports a wide array of convolutional neural networks, including classical, practical, and light-weight architectures, as well as models for detection, segmentation, and pose estimation. ncnn also features ARM NEON assembly-level optimization, sophisticated memory management, multi-core parallel computing, and GPU acceleration via Vulkan API, making it a robust solution for mobile AI.
AI Singapore
AI Singapore is a national program launched in May 2017, dedicated to fostering advanced AI capabilities within Singapore. It serves as a nexus for Singapore-based research institutions, AI startups, and established companies, facilitating collaborative efforts in use-inspired research, knowledge creation, tool development, and talent cultivation. The initiative focuses on key areas such as AI Research, Governance, Technology, Innovation, and Products, aiming to generate significant social and economic impact. It also offers various talent development programs, including the AI Apprenticeship Programme (AIAP) and LearnAI, to equip professionals and students with essential AI skills.
pezzo
Pezzo is an open-source, developer-first LLMOps platform that provides comprehensive tools for managing and optimizing AI operations. It streamlines prompt design, offering version management and instant delivery capabilities. The platform facilitates collaboration among developers and includes robust features for troubleshooting and observability, allowing users to monitor their AI operations effectively. Pezzo aims to significantly reduce costs and latency associated with AI deployments, making it an ideal solution for developers looking to enhance their LLM workflows. It supports various clients including Node.js, Python, and LangChain, and integrates with open-source technologies like PostgreSQL, ClickHouse, Redis, and Supertokens.
pytorch-pruning
pytorch-pruning is an open-source PyTorch implementation of the paper "Pruning Convolutional Neural Networks for Resource Efficient Inference." This tool is designed to optimize deep learning models by reducing their size and improving inference speed. It achieves this by systematically removing filters from convolutional layers. The project demonstrates its effectiveness by pruning a VGG16-based classifier on a small dog/cat dataset, resulting in a 3x reduction in CPU runtime and a 4x reduction in model size. Filters are currently pruned sequentially, and the project notes that future improvements could include a single-pass pruning mechanism for greater efficiency. It also aims to support additional architectures, such as VGG with batch normalization.
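Filter pruning can be sketched as "score each filter, then drop the lowest-scoring ones." The example below is a simplified illustration in plain Python: it ranks filters by the L1 norm of their weights, which is a simple stand-in and not the ranking criterion used by the paper or the repository. The function names and the toy weights are hypothetical.

```python
def l1_norm(filter_weights):
    """Importance score used in this sketch: sum of absolute weights."""
    return sum(abs(w) for w in filter_weights)


def prune_filters(filters, n_remove):
    """Return the indices of the filters kept after dropping the n_remove weakest."""
    ranked = sorted(range(len(filters)), key=lambda i: l1_norm(filters[i]))
    dropped = set(ranked[:n_remove])
    return [i for i in range(len(filters)) if i not in dropped]


# Four toy "filters", each flattened to a short list of weights.
filters = [
    [0.5, -0.4, 0.3],
    [0.01, 0.02, -0.01],
    [1.0, 0.9, -0.8],
    [0.05, 0.0, 0.1],
]
kept = prune_filters(filters, n_remove=2)
print(kept)
```

In a real network, removing a filter also means removing the matching input channel in the following layer, and the model is usually fine-tuned after pruning to recover accuracy.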
ppl.nn
PPLNN, short for "Primitive Library for Neural Network," is a high-performance deep-learning inference engine designed for efficient AI inferencing. It supports running various ONNX models and offers enhanced compatibility with OpenMMLab. Key features include a new LLM Engine with Flash Attention, Group-query Attention, and Dynamic Batching, alongside Tensor Parallelism and Graph Optimization. It also supports INT8 groupwise KV Cache and INT8 per token per channel Quantization for improved performance and accuracy. The library provides comprehensive documentation for building from source, integrating APIs, and developing new engines and operations across X86, CUDA, RISCV, and ARM platforms. It is an open-source project, welcoming contributions and providing resources for developers.