Shypd.ai

Coding & Development

Browsing page 56 of AI tools for Open Source & Models in Coding & Development. Sorted by confidence score — our independent quality rating.

ktransformers

61%

KTransformers is an open-source research project focused on efficient inference and fine-tuning of large language models (LLMs) through CPU-GPU heterogeneous computing. It comprises two core modules: kt-kernel, which provides high-performance inference kernels, and kt-sft, a fine-tuning framework. kt-kernel offers CPU-optimized operations with AMX/AVX acceleration, MoE optimization, and quantization support (INT4/INT8 on CPU, GPTQ on GPU), and integrates through a Python API. kt-sft integrates with LLaMA-Factory for resource-efficient fine-tuning of ultra-large MoE models, supporting LoRA and production-ready features like chat and batch inference. The framework is designed for researchers and engineers working to optimize LLM performance on diverse hardware configurations.

Agent-First-Organization

61%

Agent-First-Organization is the official Python library for the Arklex framework, designed for building, deploying, and scaling intelligent AI agents with enterprise-grade reliability. It features an agent-first design purpose-built for multi-agent orchestration and is model agnostic, supporting OpenAI, Anthropic, Gemini, and more. The framework includes built-in evaluation capabilities, enterprise security features like authentication and rate limiting, and is production-ready with monitoring, logging, and auto-scaling. Key components include a declarative Task Graph, an Orchestrator for runtime and state management, and various Workers (RAG, database, web automation) and Tools (Shopify, HubSpot, Google Calendar integrations).

LLMRouter

61%

LLMRouter is an intelligent open-source library designed to optimize Large Language Model (LLM) inference by dynamically selecting the most suitable model for each query. It achieves smart routing based on task complexity, cost, and performance requirements. The library supports over 16 routing models, categorized into single-round, multi-round, agentic, and personalized routers, covering diverse strategies like KNN, SVM, MLP, and graph-based routing. It provides a unified command-line interface (CLI) for training, inference, and interactive chat with a Gradio-based UI. Additionally, LLMRouter includes a comprehensive data generation pipeline for creating training data from 11 benchmark datasets, complete with automatic API calling and evaluation. It also supports multimodal understanding (image/audio/video) and integration with OpenAI-compatible servers like OpenClaw for production deployment.
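
To make the KNN routing strategy mentioned above concrete, here is a minimal sketch in plain Python. It is not LLMRouter's API: the function `knn_route`, the toy two-dimensional embeddings, and the model names are all hypothetical. It assumes each past query has an embedding and a label naming the model that handled it best, and routes a new query by majority vote among its nearest neighbors.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_route(query_vec, history, k=3):
    """Return the model label that wins the vote among the k most similar past queries."""
    ranked = sorted(history, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    votes = Counter(model for _, model in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical routing history: (query embedding, best-performing model).
history = [
    ([1.0, 0.0], "small-model"),
    ([0.9, 0.1], "small-model"),
    ([0.0, 1.0], "large-model"),
    ([0.1, 0.9], "large-model"),
    ([0.2, 0.8], "large-model"),
]

print(knn_route([0.95, 0.05], history))
print(knn_route([0.05, 0.95], history))
```

A real router would use learned query embeddings and could weigh cost against quality when labeling the history; this sketch only shows the nearest-neighbor vote.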

long_llama

61%

LongLLaMA is a large language model specifically designed to manage and process exceptionally long contexts, up to 256k tokens or more. Built upon the OpenLLaMA foundation and enhanced with the innovative Focused Transformer (FoT) method, it allows language models to handle extensive inputs while training on shorter sequences. The FoT method uses contrastive learning to enable attention layers to access a memory cache, significantly extending the effective context length. LongLLaMA is available in several variants, including a 3B base model under an Apache 2.0 license, and instruction-tuned versions like LongLLaMA-Instruct-3Bv1.1. A LongLLaMA Code 7B model, based on Code Llama, is also provided for code-related tasks. The project offers inference code, instruction tuning, and FoT continued pretraining code, making it a valuable resource for researchers and developers working with large language models and context scaling.

llumnix

61%

Llumnix is an open-source project designed for efficient and easy multi-instance Large Language Model (LLM) serving. It acts as a cross-instance request scheduling layer built on top of LLM inference engines like vLLM, aiming to optimize multi-instance serving performance. Key benefits include low latency through reduced time-to-first-token (TTFT) and queuing delays, high throughput via integration with state-of-the-art inference engines, and support for techniques like prefill-decode disaggregation. Llumnix achieves this through dynamic, fine-grained, KV-cache-aware scheduling and continuous rescheduling across instances, enabled by a near-zero overhead KV cache migration mechanism. It is easy to use, requiring minimal code changes for vanilla vLLM deployments, and offers seamless integration with existing multi-instance deployment platforms, fault tolerance, elasticity, and high service availability.

local-ai-stack

61%

local-ai-stack is a starter kit for developers who want to build and deploy local-only AI applications, eliminating the need for cloud services and their associated costs. It focuses on privacy and offline capabilities, starting with document Q&A. The stack combines Ollama for inference, Supabase pgvector as the vector database, and LangChain.js for LLM orchestration. The application logic is built with Next.js, and embeddings are generated using Transformers.js and all-MiniLM-L6-v2. This kit suits those looking to develop AI solutions that run entirely on local infrastructure, offering a cost-effective and privacy-focused approach to AI development.

miniDiffusion

61%

miniDiffusion is a reimplementation of the Stable Diffusion 3.5 model, built entirely in pure PyTorch with a focus on minimal dependencies. This tool is specifically designed for educational, experimental, and hacking purposes, aiming to recreate Stable Diffusion 3.5 from scratch with the least amount of code necessary. The project encompasses approximately 2800 lines of code, covering components from VAE to DiT, as well as training and dataset scripts. Key features include implementations of VAE, CLIP, and T5 Text Encoders, Byte-Pair & Unigram tokenizers, the Multi-Modal Diffusion Transformer Model, Flow-Matching Euler Scheduler, Logit-Normal Sampling, and Joint Attention. It also provides scripts for training and inference for SD3.
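
Two of the components listed above, Logit-Normal Sampling and the Flow-Matching Euler Scheduler, are compact enough to sketch in plain Python. This is an illustration of the general techniques, not miniDiffusion's code: the function names are hypothetical, and lists of floats stand in for tensors. It assumes the common flow-matching convention in which the model predicts a velocity and the sampler moves from noise level `sigma` to `sigma_next`.

```python
import math
import random

def sample_logit_normal(rng, mean=0.0, std=1.0):
    """Draw a training timestep in (0, 1) by passing a Gaussian sample through a sigmoid."""
    return 1.0 / (1.0 + math.exp(-rng.gauss(mean, std)))

def euler_step(x, velocity, sigma, sigma_next):
    """One Euler update: x_next = x + (sigma_next - sigma) * velocity, element-wise."""
    dt = sigma_next - sigma
    return [xi + dt * vi for xi, vi in zip(x, velocity)]

rng = random.Random(0)
t = sample_logit_normal(rng)          # a timestep biased toward the middle of (0, 1)
x_next = euler_step([1.0, 2.0], [0.5, -1.0], sigma=1.0, sigma_next=0.5)
print(t, x_next)
```

In a full sampler the velocity would come from the diffusion transformer at each step, and the loop would run over a decreasing schedule of sigma values.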

mflux

61%

mflux is an open-source tool for running state-of-the-art generative image models natively on Apple Silicon Macs using the MLX framework. It offers line-by-line MLX ports of models from the Hugging Face Diffusers and Transformers libraries, focusing on a minimal and explicit implementation. Users can generate images via a command-line interface or Python API, with features like quantization, local model loading, and LoRA support. The tool supports various models including Z-Image, FLUX.2, FIBO, SeedVR2, Qwen Image, and Depth Pro, each with its own strengths in areas like speed, quality, prompt understanding, and upscaling. Supported workflows include text-to-image, image-to-image, LoRA fine-tuning, in-context editing, ControlNet, depth conditioning, and inpainting.

Multimodal-Toolkit

61%

Multimodal-Toolkit is an open-source toolkit for combining multimodal data, specifically text and tabular data, in classification and regression tasks. It uses Hugging Face Transformers models as the base for processing text features. The toolkit introduces a combining module that merges the transformer's outputs with categorical and numerical features, producing multimodal features for downstream machine learning layers. Both the combining module and the transformer parameters can be trained on the supervised task. It supports various Hugging Face Transformers models such as BERT, ALBERT, DistilBERT, and RoBERTa, and includes methods for combining features such as concatenation, MLPs, and attention mechanisms. The toolkit also provides example datasets and working examples for quick implementation.
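
The simplest of the combining methods named above, concatenation, can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the toolkit's API: `combine_concat`, `one_hot`, and the example column names are hypothetical, and a short list of floats stands in for the transformer's text embedding.

```python
def one_hot(value, vocabulary):
    """Encode a categorical value as a one-hot vector over a fixed vocabulary."""
    return [1.0 if value == v else 0.0 for v in vocabulary]

def combine_concat(text_embedding, categorical, numerical, vocabularies):
    """Concatenate text, one-hot categorical, and numerical features into one vector."""
    features = list(text_embedding)
    for column, value in categorical.items():
        features.extend(one_hot(value, vocabularies[column]))
    features.extend(float(x) for x in numerical)
    return features

combined = combine_concat(
    text_embedding=[0.1, 0.2],              # stand-in for a transformer output
    categorical={"color": "red"},
    numerical=[3],
    vocabularies={"color": ["red", "blue"]},
)
print(combined)
```

The resulting vector would then feed a classification or regression head; the MLP and attention-based combiners replace this fixed concatenation with learned layers.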

pgvectorscale

61%

pgvectorscale is a PostgreSQL extension designed to significantly boost vector search performance and provide cost-efficient storage for AI applications, building upon the capabilities of pgvector. It introduces key innovations such as StreamingDiskANN, an index type inspired by Microsoft's research, and Statistical Binary Quantization developed by Timescale for improved data compression. The tool also supports label-based filtered vector search, allowing for more precise and efficient results by combining vector similarity with label filtering. Benchmarks show pgvectorscale achieving substantially lower latency and higher query throughput compared to other solutions, all at a reduced cost when self-hosted. Developed in Rust using the PGRX framework, it offers a new avenue for community contributions to PostgreSQL's vector support.
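
As a rough illustration of how binary quantization compresses vectors, the sketch below thresholds each dimension at its mean across the dataset and compares the resulting bit codes by Hamming distance. It is a simplified, hypothetical example of the general idea, not pgvectorscale's implementation of Statistical Binary Quantization; the function names and data are made up for the example.

```python
def fit_means(vectors):
    """Compute the per-dimension mean over a collection of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def quantize(vector, means):
    """Map each dimension to 1 if it exceeds that dimension's mean, else 0."""
    return [1 if x > m else 0 for x, m in zip(vector, means)]

def hamming(a, b):
    """Count the positions where two bit codes differ."""
    return sum(x != y for x, y in zip(a, b))

vectors = [[0.0, 10.0], [2.0, 12.0], [4.0, 14.0]]
means = fit_means(vectors)
codes = [quantize(v, means) for v in vectors]
query_code = quantize([3.0, 11.0], means)
distances = [hamming(query_code, code) for code in codes]
print(means, codes, distances)
```

Each float becomes a single bit, so the codes are far smaller than the original vectors, and an index can use the cheap Hamming comparison to shortlist candidates before rescoring them with full-precision distances.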

AI71

61%

AI71 is an applied research team that creates AI solutions tailored for enterprises and governments globally. Their offerings include a suite of products such as Ask, which provides superhuman capabilities for tasks like finding answers in documents and automating HR, and SuperHive, an intelligence platform for construction with features like CAD/BIM validation and delay forecasting. They also offer Health, an automated revenue cycle solution for healthcare. Beyond products, AI71 provides QBrain advisory, combining strategic insight with technical expertise to ensure successful AI transformation and measurable impact for their partners.

physicsnemo

61%

NVIDIA PhysicsNeMo is an open-source deep-learning framework for building, training, fine-tuning, and running inference with Physics AI models using state-of-the-art SciML methods. It provides Python modules to compose scalable and optimized training and inference pipelines, enabling real-time predictions by combining physics knowledge with data. The framework supports various model architectures like neural operators, GNNs, and transformers, and is optimized for NVIDIA GPUs, scaling from a single GPU to multi-node GPU clusters. PhysicsNeMo is built on PyTorch, so the experience is familiar to PyTorch users, and it is extensible for customization and integration into existing workflows. It includes modules for models, data pipelines, distributed computing, data curation, and symbolic geometry/PDEs.

Prophecis

61%

Prophecis is a comprehensive, one-stop cloud-native machine learning platform developed by WeBank. It integrates various open-source machine learning frameworks and offers robust multi-tenant management capabilities for machine learning compute clusters. The platform provides full-stack container deployment and management services for production environments, supporting the entire machine learning lifecycle from data preprocessing and feature engineering to model training, evaluation, release, and deployment. Key components include Prophecis Machine Learning Flow for distributed modeling, MLLabis for development and exploration with Jupyter Lab integration, Model Factory for model storage and deployment, Data Factory for feature engineering, and Application Factory for CI/CD and DevOps tools.

pixeltable

61%

pixeltable is an open-source Python library designed to provide declarative, transactional data infrastructure for building multimodal AI applications. It offers incremental storage, transformation, indexing, retrieval, and orchestration of data, ensuring full operational integrity. The tool bundles its own transactional database, orchestration engine, and a local dashboard, requiring only a `pip install` for setup without external services like Docker. It supports various media types including images, video, audio, and documents, and integrates with over 30 AI providers like OpenAI, Anthropic, and Gemini. Key features include declarative computed columns for automated processing, built-in vector search for embedding indexes, and robust version control for data persistence and time travel, making it suitable for both prototyping and production AI workflows.

pipeshub-ai

61%

PipesHub is a fully extensible and explainable workplace AI platform designed for enterprise search and workflow automation. It addresses the challenge of scattered work data across various applications like Google Workspace, Microsoft 365, Slack, Jira, and Confluence by providing a natural language search interface. Users can quickly find information, get answers, and gain insights, with results properly cited using Knowledge Graphs and Page Ranking. Beyond search, PipesHub offers a No-Code interface for enterprises to build custom applications and AI agents. It supports flexible model integration, real-time or scheduled indexing, access-driven visibility, and secure deployments both on-premise and in the cloud.

RLinf

61%

RLinf is a flexible and scalable open-source reinforcement learning (RL) infrastructure designed for Embodied and Agentic AI. It is intended as a backbone for next-generation training, supporting open-ended learning and continuous generalization. The platform accommodates diverse RL training workflows, including PPO, GRPO, and SAC, while abstracting the complexities of distributed programming. Users can scale RL training across numerous GPU nodes without code modification. RLinf integrates with multiple backends such as FSDP, Hugging Face, SGLang, vLLM, and Megatron, catering to both rapid prototyping and large-scale, efficient training. It supports a wide array of embodied AI simulators, VLA models, world models, and real-world robotics data collection for RL research and development.

Qwen3-VL

61%

Qwen3-VL is a multimodal large language model series developed by the Qwen team at Alibaba Cloud. This advanced model offers significant enhancements in text understanding and generation, visual perception and reasoning, extended context length, and improved spatial and video dynamics comprehension. It also features stronger agent interaction capabilities, including operating PC/mobile GUIs and generating code from images/videos. Available in Dense and MoE architectures, Qwen3-VL supports flexible deployment from edge to cloud, with Instruct and reasoning-enhanced Thinking editions. Key features include advanced spatial perception, long context and video understanding, enhanced multimodal reasoning for STEM/Math, upgraded visual recognition, and expanded OCR supporting 32 languages.

RoboticsDiffusionTransformer

61%

RoboticsDiffusionTransformer (RDT-1B) is a 1-billion-parameter diffusion foundation model designed for bimanual robotic manipulation. It is pre-trained on over 1 million multi-robot episodes, described as the largest such pre-training collection to date. RDT-1B can predict the next 64 robot actions based on language instructions and RGB images from up to three views. The model is compatible with various modern mobile manipulators, supporting single-arm to dual-arm configurations, joint to EEF control, and position to velocity commands, including wheeled locomotion. This repository provides the official PyTorch implementation, including model checkpoints, training and sampling scripts, and an example for real-robot deployment on the ALOHA dual-arm robot, where it has achieved state-of-the-art performance in dexterity, zero-shot generalizability, and few-shot learning.

SINQ

61%

SINQ (Sinkhorn-Normalized Quantization) is a novel, fast, and high-quality quantization method designed to make any Large Language Model smaller while preserving accuracy. It allows users to deploy models that would otherwise be too large, drastically reducing memory usage. SINQ offers both calibration-free (SINQ) and calibrated (A-SINQ) versions, providing state-of-the-art performance. It is integrated into Hugging Face Transformers for simplified use and supports saving and reloading quantized models. SINQ boasts significantly faster quantization speeds compared to alternatives like HQQ and AWQ, making it an efficient solution for LLM optimization.

solace-agent-mesh

61%

Solace Agent Mesh is an open-source, event-driven framework designed to build and orchestrate multi-agent AI systems. It allows developers to create teams of specialized AI agents, each with distinct skills and access to specific tools, such as database agents or multimodal agents. The framework handles communication between agents automatically, leveraging the Solace Platform for true scalability and reliability. Built on the Solace AI Connector (SAC) and Google's Agent Development Kit (ADK), it provides a fully asynchronous, event-driven, and decoupled AI agent architecture ready for production deployment. Key features include multi-agent event-driven architecture, agent orchestration, flexible interfaces, and dynamic embeds for context-dependent information resolution.

chart-gpt

61%

chart-gpt is an open-source AI tool designed to build charts quickly and efficiently from text input. Users can clone the repository, set up their PaLM API key, and start generating visualizations. The project supports full functionality with additional setup for a credit system, requiring integration with Supabase, Stripe, and NextAuth with Google. This makes it a flexible solution for developers and data enthusiasts looking to integrate AI-powered chart generation into their workflows or projects. The tool is built with TypeScript, CSS, and JavaScript, indicating a modern web-based application.

serena

61%

Serena is an advanced toolkit designed to function as an IDE for AI coding agents, offering semantic retrieval, editing, refactoring, and debugging capabilities. It integrates with any client/LLM via the Model Context Protocol (MCP), enabling agents to operate faster and more reliably, especially in large and complex codebases. Serena supports over 40 programming languages through its language server backend and leverages JetBrains IDEs' powerful code analysis via a paid plugin. Its agent-first tool design uses robust high-level abstractions, distinguishing it from approaches relying on low-level concepts. Serena also includes basic utilities like file search, shell command execution, and a memory management system for long-lived agent workflows.

rust-bert

61%

rust-bert is a Rust-native library offering ready-to-use Natural Language Processing (NLP) pipelines and transformer-based models. It serves as a port of Hugging Face's Transformers library, leveraging `tch-rs` for Libtorch bindings or `onnxruntime` for ONNX support, and `rust-tokenizers` for preprocessing. The library supports a wide array of NLP tasks including question answering, named entity recognition, translation, summarization, text generation, conversational agents, and more. It features multi-threaded tokenization and GPU inference for efficient processing. Users can get started with tasks like question answering with just a few lines of code, making it a powerful tool for integrating advanced NLP capabilities into Rust applications.

Step1X-Edit

61%

Step1X-Edit is a state-of-the-art open-source image editing model designed to rival the performance of proprietary models such as GPT-4o and Gemini 2 Flash. It uses a multimodal LLM to process reference images and user instructions, then passes the resulting latent embedding to a diffusion image decoder that generates the target image. The model supports advanced features like native reasoning edit, which combines instruction reasoning with reflective correction for complex edits. It also offers improved image editing quality and better instruction-following performance. Step1X-Edit supports text-to-image generation, LoRA fine-tuning, and various optimizations for GPU memory usage and multi-GPU inference.