AI Agents & Automation

AI Frameworks & Infra in AI Agents & Automation, sorted by confidence score (our independent quality rating).

resin

60%

Resin is a reboot of an older search engine project, now with a saner architecture. It functions as a vector space search engine, a vector database, and a key/value store, designed for efficient string processing, vector operations, and custom storage primitives. The tool can produce large language models from strings and large 'anything' models from byte arrays. Key features include fast key/value storage with page/column readers and writers, practical text analysis utilities for various data types, and command-line tools for building and validating lexicons. Its design is clean, dependency-light, and easy to extend, making it suitable for developers working on search and machine learning applications.

reasoning-from-scratch

60%

reasoning-from-scratch is the official code repository for the book *Build a Reasoning Model (From Scratch)*, offering a hands-on approach to understanding and implementing reasoning large language models (LLMs) in PyTorch. Users start with a pre-trained base LLM and progressively add reasoning capabilities, mirroring approaches used in large-scale models like DeepSeek R1 and GPT-5 Thinking. The repository includes code for generating text, evaluating reasoning models, improving reasoning with inference-time scaling and self-refinement, and training models with reinforcement learning. It also covers distilling reasoning models for efficiency and provides bonus materials on topics like GPU optimization, advanced evaluation methods, and building chat interfaces. The code is designed to run on consumer hardware, with GPU utilization if available, making it accessible for a wide audience.

smolGPT

60%

smolGPT offers a minimal PyTorch implementation for training small Large Language Models (LLMs) from scratch, designed primarily for educational purposes and simplicity. It boasts a pure PyTorch codebase with no abstraction overhead, incorporating modern architectural elements like Flash Attention (when available), RMSNorm, SwiGLU, and optional Rotary embeddings (RoPE). The tool supports efficient training features including mixed precision (bfloat16/float16), gradient accumulation, learning rate decay with warmup, weight decay, and gradient clipping. It also includes built-in TinyStories dataset processing and SentencePiece tokenizer training integration, making it a comprehensive yet accessible platform for learning LLM development.
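The "learning rate decay with warmup" feature mentioned above can be illustrated with a short sketch. This is not smolGPT's own code: the function name `lr_at_step`, the cosine decay shape, and every default value are assumptions chosen for illustration.

```python
import math


def lr_at_step(step, max_lr=3e-4, min_lr=3e-5, warmup_steps=100, total_steps=1000):
    """Linear warmup followed by cosine decay (hypothetical defaults, not smolGPT's)."""
    if step < warmup_steps:
        # Ramp linearly from max_lr / warmup_steps up to max_lr.
        return max_lr * (step + 1) / warmup_steps
    if step >= total_steps:
        # After the schedule ends, hold the floor value.
        return min_lr
    # Fraction of the decay phase completed, in [0, 1).
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + cosine * (max_lr - min_lr)
```

In a training loop, a schedule like this would typically be evaluated once per optimizer step and written into the optimizer's learning rate before the update.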

Dalton

60%

DaltonTx redefines drug discovery by providing an AI-enabled platform that serves as an intelligence backbone for modern R&D. It offers an adaptive intelligent system that evolves with scientific advancements, integrates seamlessly into existing workflows, and empowers users with lasting capabilities. The platform learns from every scientist, model, and experiment, continuously improving and guiding better decisions. DaltonTx's technology covers the full discovery lifecycle, including data ingestion, model training, molecule generation, and experiment prioritization. It is built by scientists for scientists, combining software engineering, machine learning, and deep drug discovery expertise to tackle complex problems in both small molecules and biologics.

Search-R1

60%

Search-R1 is an open-source reinforcement learning framework for training large language models (LLMs) to interleave reasoning with tool calls, specifically calls to search engines. Built upon the veRL framework, it extends the concepts of DeepSeek-R1(-Zero) by integrating interleaved search engine access and offering a comprehensive RL training pipeline. The framework serves as an alternative to OpenAI DeepResearch, fostering research and development in tool-augmented LLM reasoning. It supports various RL methods such as PPO, GRPO, and REINFORCE, accommodates different LLMs such as Llama3 and Qwen2.5, and integrates with diverse search engines, including local sparse/dense retrievers and online search engines like Google and Bing.

server

60%

Triton Inference Server is open-source inference serving software designed to streamline AI inferencing across various environments, including cloud, data centers, edge, and embedded devices. It supports a wide array of deep learning and machine learning frameworks such as TensorRT, PyTorch, ONNX, OpenVINO, and Python. Triton optimizes performance for different query types, including real-time, batched, ensemble, and audio/video streaming queries. Key features include concurrent model execution, dynamic batching, sequence batching for stateful models, and a Backend API for custom operations. It also provides HTTP/REST and gRPC inference protocols, C and Java APIs for in-process use cases, and metrics for GPU utilization and server latency. Triton is part of NVIDIA AI Enterprise, which offers enterprise support.

SPO

60%

SPO (Self-Supervised Prompt Optimization) is an AI tool hosted on Hugging Face Spaces designed to enhance the performance of language models by optimizing user prompts. It allows users to create or select templates, configure various settings, and initiate an optimization process to achieve better responses from AI models. This application is particularly useful for prompt engineers and researchers looking to fine-tune their interactions with large language models, ensuring more accurate and relevant outputs through a self-supervised learning approach. The tool aims to streamline the prompt engineering workflow, making it easier to experiment with and improve prompt effectiveness.

Simd

60%

Simd is a free, open-source C++ image processing and machine learning library designed for C and C++ programmers. It offers a wide array of high-performance algorithms, including pixel format conversion, image scaling and filtration, statistical information extraction, motion detection, object detection, classification, and neural network functionalities. The library is highly optimized, utilizing various SIMD CPU extensions such as SSE, AVX, AVX-512, and AMX for x86/x64, NEON for ARM, and HVX for Hexagon architectures. Simd provides both a C API and C++ classes for ease of access, supporting dynamic and static linking across Windows and Linux with MSVS, G++, and Clang compilers. It also includes a Python wrapper for broader accessibility.

Seed1.5-VL

60%

Seed1.5-VL is a powerful and efficient vision-language foundation model developed by the ByteDance Seed Team. It is engineered to advance general-purpose multimodal understanding and reasoning, demonstrating state-of-the-art performance across numerous public benchmarks. The model features a relatively modest architecture, comprising a 532M vision encoder and a 20B active parameter MoE LLM, yet it excels in complex reasoning tasks, OCR, diagram understanding, visual grounding, 3D spatial understanding, and video comprehension. Seed1.5-VL also shows strong capabilities in interactive agent tasks like GUI control and gameplay, making it versatile for various applications. The project provides a usage cookbook with diverse code samples to help developers effectively leverage its API.

StyleGAN-Human Interpolation

60%

StyleGAN-Human Interpolation is a web-based tool hosted on Hugging Face Spaces, designed for generating and manipulating human faces using AI. It leverages StyleGAN models to create realistic synthetic faces, offering users the ability to explore the capabilities of this advanced generative adversarial network. The primary function of the tool is to produce a series of images that smoothly transition between two distinct, randomly generated human images. Users can control this interpolation process by adjusting parameters such as seed values and truncation psi, which influence the randomness and realism of the generated faces. This makes it a valuable resource for researchers, artists, and enthusiasts interested in AI-driven image synthesis and the nuances of facial generation.

SWE-agent

60%

SWE-agent is an advanced agentic framework designed to enable language models (LMs) like GPT-4o or Claude Sonnet 4 to autonomously identify and fix issues within real GitHub repositories. Beyond software engineering tasks, it can be employed for offensive cybersecurity challenges, such as capture the flag, and competitive coding. The tool is highly configurable, governed by a single YAML file, and offers maximal agency to the LM, making it free-flowing and generalizable. Developed by researchers from Princeton University and Stanford University, SWE-agent has achieved state-of-the-art results on the SWE-bench benchmark. Users can try SWE-agent in their browser or explore its capabilities for offensive cybersecurity through its EnIGMA mode.

swe-rl

60%

SWE-RL is the official codebase for "Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution," designed to scale reinforcement learning-based LLM reasoning for real-world software engineering tasks. It leverages open-source software evolution data and rule-based rewards to improve LLM performance. The codebase includes prompt templates and a flexible reward function API that supports various editing formats, including sequence similarity for search/replace changes and unified diffs. Additionally, SWE-RL features an Agentless Mini component for fast asynchronous inference, code refactoring, file-level localization, and repair, supporting OpenAI-compatible endpoints and Hugging Face models like Llama-3.3-70B-Instruct.
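A rule-based reward built on sequence similarity, as described above, might look roughly like the following sketch. It is not the SWE-RL reward API: the function name `patch_reward` and the -1.0 penalty for an unusable prediction are assumptions, and Python's standard `difflib.SequenceMatcher` stands in for whatever similarity measure the codebase actually uses.

```python
import difflib


def patch_reward(predicted_patch, oracle_patch):
    """Hypothetical reward: -1.0 for a missing or empty prediction, else similarity in [0, 1]."""
    if predicted_patch is None or not predicted_patch.strip():
        # Assumed penalty for output that cannot be scored at all.
        return -1.0
    # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge.
    return difflib.SequenceMatcher(None, predicted_patch, oracle_patch).ratio()
```

A reward of this shape needs no test execution, only a reference patch to compare against, which is what makes it cheap to compute at scale.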

Deix S.r.l.

60%

Deix S.r.l. specializes in developing innovative algorithms and applications by leveraging expertise in mathematical modeling, artificial intelligence, and optimization. They provide solutions that enable companies to make informed decisions and identify new business opportunities. Deix offers both ready-to-use products and tailor-made solutions designed to meet specific business needs. Their approach integrates internal knowledge and data to deliver high-quality, efficient results, as evidenced by client testimonials highlighting speed, technical expertise, and proactivity in solving complex challenges.

sqlite-vss

60%

sqlite-vss is a SQLite extension designed to bring vector search capabilities directly into SQLite databases, leveraging the Faiss library for efficiency. It enables developers to build semantic search engines, recommendation systems, and question-and-answering tools by storing and querying vector embeddings. While not actively developed, with efforts now focused on sqlite-vec, it offers a robust solution for integrating vector search into applications using SQLite. Users can create virtual tables to store high-dimensional embeddings and perform k-nearest neighbor searches. It supports various languages through bindings like Python, Node.js, Deno, Ruby, Elixir, Go, and Rust, making it accessible to a wide range of developers.
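The idea of storing embeddings in SQLite and running a k-nearest neighbor search can be sketched with only the Python standard library. The example below deliberately does not use sqlite-vss, its virtual tables, or Faiss: it keeps vectors as JSON text in an ordinary table and ranks them by brute force in Python. The table name `items` and the helper names are hypothetical.

```python
import json
import math
import sqlite3


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(conn, query, k):
    """Brute-force k-nearest neighbors over every stored embedding."""
    rows = conn.execute("SELECT id, embedding FROM items").fetchall()
    scored = [(l2_distance(json.loads(emb), query), item_id) for item_id, emb in rows]
    scored.sort()
    return [item_id for _, item_id in scored[:k]]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, embedding TEXT)")
vectors = {1: [0.0, 0.0], 2: [1.0, 0.0], 3: [5.0, 5.0]}
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(item_id, json.dumps(vec)) for item_id, vec in vectors.items()],
)
nearest = knn(conn, [0.9, 0.1], k=2)  # ids of the two closest vectors
```

Brute force scans every row on each query, so it only suits small tables; a vector index such as the one an extension provides exists to avoid that full scan.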

Falcondale

60%

Falcondale specializes in developing applied quantum machine learning and optimization solutions designed to deliver real-world impact. The company focuses on leveraging quantum intelligence to solve complex problems across various industries. Falcondale aims to provide a competitive edge through its advanced quantum technologies, offering solutions that go beyond traditional computational methods. Their expertise lies in translating cutting-edge quantum research into practical, deployable applications for businesses and organizations seeking innovative data analysis and optimization capabilities.

streaming-llm

60%

StreamingLLM is an innovative open-source framework designed to address the challenges of deploying Large Language Models (LLMs) in streaming applications that require processing infinite-length inputs. It introduces the concept of "attention sinks" to efficiently manage Key and Value (KV) states, allowing LLMs to generalize to infinite sequence lengths without fine-tuning. This approach prevents the performance degradation seen in traditional window attention methods when text length exceeds cache size. StreamingLLM enables models like Llama-2, MPT, Falcon, and Pythia to perform stable and efficient language modeling with millions of tokens, offering up to a 22.2x speedup over sliding window recomputation baselines. It is particularly optimized for scenarios such as multi-round dialogues where continuous operation without extensive memory or dependency on past data is crucial.
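The cache policy behind attention sinks can be sketched in a few lines: keep the KV entries of the first few tokens (the sinks) plus a rolling window of the most recent tokens, and drop everything in between. This is a simplified illustration over a list of token positions, not the StreamingLLM implementation; the function name `evict` and the sizes `num_sinks=4` and `window=8` are assumptions.

```python
def evict(cache, num_sinks=4, window=8):
    """Keep the first num_sinks entries and the most recent window entries."""
    if len(cache) <= num_sinks + window:
        return list(cache)
    # Slice from an explicit index so that window=0 yields an empty tail.
    return list(cache[:num_sinks]) + list(cache[len(cache) - window:])


cache = []
for token_position in range(20):
    cache.append(token_position)  # stand-in for one token's key/value state
    cache = evict(cache)
```

The cache size stays bounded at `num_sinks + window` entries no matter how many tokens are streamed, which is the property that allows continuous operation on very long inputs.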

streaming-vlm

60%

StreamingVLM is an AI tool designed for real-time understanding of effectively infinite video streams. Developed by mit-han-lab, it addresses common challenges in long-video analysis by maintaining a compact KV cache and aligning training directly with streaming inference. This approach avoids the quadratic cost associated with traditional methods and mitigates the pitfalls of sliding-window techniques. The system can run at up to 8 frames per second (FPS) on a single H100 GPU, offering stable and efficient video processing. It achieves a 66.18% win rate against GPT-4o mini on a new long-video benchmark, and it also improves general Video Question Answering (VQA) capabilities without requiring task-specific fine-tuning. The project provides scripts for environment setup, inference, supervised fine-tuning (SFT), and various evaluations, including OVOBench and VQA tasks.

TheAgentCompany

60%

TheAgentCompany is an open-source benchmark designed to evaluate the performance of LLM agents on consequential, real-world tasks within a simulated software company environment. It allows for assessing how well AI agents can accelerate or autonomously perform work-related tasks by interacting with the web, writing code, running programs, and communicating. The platform offers diverse task roles, data types, and a comprehensive scoring system with multiple evaluation methods, including deterministic and LLM-based evaluators. It features simple one-command operations for environment setup and quick system resets, making it an extensible framework for adding new tasks and evaluators. The benchmark is available on GitHub and supports integration with platforms like OpenHands.

textgenrnn

60%

textgenrnn is a Python 3 module built on Keras/TensorFlow designed for creating character-level recurrent neural networks (char-RNNs). It enables users to easily train text-generating neural networks of any size and complexity on any text dataset. The tool incorporates modern neural network architectures, including attention-weighting and skip-embedding, to accelerate training and enhance model quality. Users can train and generate text at either the character or word level, configure RNN size, layer count, and use bidirectional RNNs. It supports training on generic input text files, including large ones, and allows for GPU-trained models to generate text on a CPU. Additionally, textgenrnn offers a powerful CuDNN implementation for faster GPU training and supports contextual labels for improved learning and results.

TileRT

60%

TileRT is an open-source, tile-based runtime engineered for ultra-low-latency Large Language Model (LLM) inference. It aims to push the boundaries of LLM latency without compromising model size or quality, allowing models with hundreds of billions of parameters to achieve millisecond-level time per output token (TPOT). Unlike traditional inference systems optimized for high-throughput batch processing, TileRT prioritizes responsiveness, making it ideal for applications like high-frequency trading, interactive AI, real-time decision-making, and AI-assisted coding. It achieves this by decomposing LLM operators into fine-grained tile-level tasks and dynamically rescheduling computation, I/O, and communication across multiple devices to minimize idle time and improve hardware utilization. TileRT currently supports models like GLM-5 and DeepSeek-V3.2 and offers Multi-Token Prediction (MTP) for efficient longer output generation.

tokenizers

60%

tokenizers is an open-source library developed by Hugging Face, offering highly optimized and versatile tokenizers for natural language processing tasks. Implemented primarily in Rust, it boasts exceptional performance, capable of tokenizing a gigabyte of text on a server's CPU in less than 20 seconds. The library supports training new vocabularies and tokenizing text using popular models like Byte-Pair Encoding, WordPiece, and Unigram. It includes features such as alignment tracking during normalization, ensuring that the original sentence segments corresponding to tokens can always be retrieved. Additionally, it handles pre-processing steps like truncation, padding, and adding special tokens required by various models, making it suitable for both research and production environments.
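Alignment tracking, truncation, and padding can be illustrated with a small standard-library sketch. It does not use the tokenizers library or its API: the functions `tokenize_with_offsets` and `pad_or_truncate`, the regex-based splitting, and the `[PAD]` token are hypothetical stand-ins for the concepts described above.

```python
import re


def tokenize_with_offsets(text):
    """Return (normalized_token, (start, end)) pairs; offsets index into the original text."""
    # Split into word runs and single punctuation marks, lowercasing each token
    # while remembering where it came from in the unnormalized input.
    return [(m.group().lower(), m.span()) for m in re.finditer(r"\w+|[^\w\s]", text)]


def pad_or_truncate(tokens, length, pad_token="[PAD]"):
    """Force a token list to exactly `length` items."""
    tokens = tokens[:length]
    return tokens + [pad_token] * (length - len(tokens))


text = "Hello, World!"
encoded = tokenize_with_offsets(text)
# Each offset pair recovers the original, un-normalized segment.
originals = [text[start:end] for _, (start, end) in encoded]
```

Because the offsets refer to the original string, the source segment for any token can be recovered even though the token itself was lowercased.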

trajectory-transformer

60%

Trajectory Transformer is an open-source code release that implements offline reinforcement learning as a sequence modeling problem. Based on the paper "Offline Reinforcement Learning as One Big Sequence Modeling Problem," this tool provides a framework for training models to predict trajectories. It includes scripts for training transformers on various datasets and for planning with these models. The project also offers pretrained models for multiple datasets, allowing users to quickly experiment and reproduce results. It supports installation via conda or Docker, and provides utilities for running jobs on Azure, making it suitable for researchers and engineers in reinforcement learning and robotics.

TASO

60%

TASO, the Tensor Algebra SuperOptimizer for Deep Learning, significantly enhances the performance of deep neural network models. It achieves this by automatically generating and verifying graph transformations to build a vast search space of computation graphs equivalent to the original DNN model. Employing a cost-based search algorithm, TASO discovers highly optimized computation graphs, leading to up to a 3x performance improvement over graph optimizers in current deep learning frameworks. It supports optimizing pre-trained models in ONNX, TensorFlow, and PyTorch formats, and offers a Python interface for arbitrary DNN architectures. Optimized graphs can be exported to ONNX for use in existing deep learning frameworks, maintaining original model accuracy.

texar

60%

Texar is a comprehensive toolkit designed to support a broad range of machine learning tasks, with a particular focus on natural language processing and text generation. Built on TensorFlow, it offers a rich library of modular and easy-to-use ML components and functionalities, enabling both researchers and practitioners to rapidly prototype and experiment with models. Key features include support for pre-trained models like BERT, GPT2, and XLNet, and full customizability at multiple abstraction levels. Texar is versatile, supporting various tasks, models, algorithms, data processing, and evaluation methods, from encoder-decoder architectures to reinforcement learning and adversarial learning. It emphasizes modularity for maximum re-use and clean APIs, based on a principled decomposition of learning, inference, and model architecture. The toolkit also supports distributed model training with multiple GPUs and provides extensive documentation and examples.
