Coding & Development
Browsing page 188 of AI tools for Coding & Development. Sorted by confidence score — our independent quality rating.
Score your GitHub repo for AI coding agents
Twill's Scorecard offers a free tool to assess how ready a GitHub repository is for AI coding agents. It provides a score across seven dimensions, drawing insights from OpenAI's Agentic Legibility framework and the Assistants & Agents Build Hour framework. Users can analyze public GitHub repository URLs or sign in to connect and analyze private repositories. The scoring process runs within a hosted shell via the Responses API, offering a detailed evaluation to help developers optimize their codebases for AI-assisted development workflows. This tool is part of Twill's broader platform, which aims to automate bug fixes, features, and maintenance using autonomous coding agents.
OSWorld
OSWorld is an open-source benchmark designed to evaluate multimodal AI agents performing open-ended tasks in real computer environments. It offers a robust framework for researchers and developers to test and compare the capabilities of their AI agents. The platform supports various virtualization technologies like VMware, VirtualBox, and Docker, with ongoing support for cloud platforms such as AWS. Key features include parallel execution of experiments, detailed result logging with screenshots and video recordings, and tools for manual task examination. OSWorld aims to standardize the benchmarking process for AI agents, providing clear metrics for success rates across different domains like Office, Daily, and Professional tasks.
filter-pruning-geometric-median
filter-pruning-geometric-median is an open-source implementation of the Filter Pruning via Geometric Median method for accelerating deep convolutional neural networks. Developed in PyTorch, this tool enables researchers and developers to reduce the computational cost and memory footprint of their models without significant loss in accuracy. It supports both network-level and layer-level sparsity configurations, offering flexibility in how pruning is applied. The repository provides detailed usage instructions for integration with PyTorch and NNI, along with scripts for reproducing results on datasets like ImageNet and CIFAR-10, making it a valuable resource for model compression research and application.
ext-apps
ext-apps is the official repository for the specification and SDK of the Model Context Protocol (MCP) Apps protocol, offering a standardized way to deliver interactive UIs from MCP servers directly within AI chatbots. This tool allows developers to create dynamic user interfaces such as charts, forms, and dashboards that render inline in compliant chat clients like Claude and ChatGPT. It extends the core MCP specification by enabling tools to declare UI resources, which the host then fetches and displays in a sandboxed iframe, facilitating bidirectional communication. The SDK supports app developers building interactive Views, host developers embedding these Views, and MCP server authors registering tools with UI metadata. It includes agent skills to scaffold new apps, migrate existing OpenAI apps, and convert web apps to hybrid web + MCP Apps.
Otter
Otter is an open-source multi-modal model developed by EvolvingLMMs-Lab, built upon the OpenFlamingo architecture. It excels in instruction-following and in-context learning, trained extensively on the MIMIC-IT dataset, which comprises 2.8 million interleaved image-text/video instruction-response pairs. Otter supports various tasks, including scene comprehension, reasoning, and multi-round conversations, and can process both image and video inputs. The project also introduces OtterHD for fine-grained interpretations of high-resolution visual input and MagnifierBench for evaluating tiny object recognition. It provides training scripts, pre-trained weights, and supports integration with Hugging Face models.
flux2
flux2 is the official inference repository for FLUX.2 models, offering state-of-the-art visual intelligence for image generation and editing. It provides minimal inference code to run these tasks with FLUX.2 open-weight models, including the FLUX.2 [klein] family, which boasts sub-second generation on consumer GPUs. The tool supports text-to-image generation, single-reference image editing, and multi-reference image editing. It is designed for developers and researchers, allowing for local installation and execution. The repository also details different model variants, their capabilities, licensing (Apache 2.0 for some, FLUX Non-Commercial License for others), and hardware requirements, making it suitable for various use cases from real-time applications to fine-tuning and research.
OpenSandbox
OpenSandbox is a robust, open-source sandbox platform designed for AI applications, offering a secure, fast, and extensible runtime environment for AI agents. It provides multi-language SDKs in Python, Java/Kotlin, JavaScript/TypeScript, C#/.NET, and Go, along with unified sandbox APIs. The platform supports both Docker and high-performance Kubernetes runtimes, enabling local execution and large-scale distributed scheduling. OpenSandbox is ideal for scenarios such as Coding Agents, GUI Agents, Agent Evaluation, AI Code Execution, and RL Training. It features strong isolation with secure container runtimes like gVisor and Firecracker microVM, and includes built-in Command, Filesystem, and Code Interpreter implementations.
opik
Opik, built by Comet, is an open-source platform designed to streamline the entire lifecycle of LLM applications, from prototype to production. It empowers developers to evaluate, test, monitor, and optimize their models and agentic systems with comprehensive tracing of LLM calls, conversation logging, and agent activity. Key features include advanced evaluation capabilities like LLM-as-a-judge for tasks such as hallucination detection and RAG assessment, experiment management, and integration into CI/CD pipelines. Opik also offers production-ready scalable monitoring dashboards, online evaluation rules, and dedicated SDKs for prompt and agent optimization, along with guardrails for safe AI practices. It supports a wide array of frameworks and offers client SDKs for Python, TypeScript, and Ruby.
onepanel
Onepanel is an open-source, end-to-end computer vision platform designed to streamline the entire computer vision lifecycle. It provides a unified environment for labeling datasets, building models, training, tuning hyperparameters, deploying, and automating computer vision workflows. The platform is built to be flexible, supporting deployment on any cloud infrastructure as well as on-premises environments. By integrating various open-source projects like Argo, Couler, CVAT, JupyterLab, and NNI, Onepanel offers a comprehensive solution for machine learning and deep learning practitioners. It aims to simplify complex computer vision tasks from data preparation to model deployment and automation.
Visidon
Visidon specializes in AI-powered image and video enhancement software, delivering solutions that improve camera performance with high efficiency, low power consumption, and accuracy. Their technology includes intelligent image and video enhancement algorithms, revolutionary computer vision, and face recognition capabilities. Visidon's solutions are optimized for embedded AI technologies, making them suitable for devices with limited resources. The company's algorithms enhance embedded camera capabilities, particularly in challenging conditions like low-light, high dynamics, and zooming, and are applied across industries such as smart monitors, video conferencing, drone cameras, and smartphones.
prometheus-eval
Prometheus-Eval is a comprehensive open-source repository designed for evaluating Large Language Models (LLMs) in various generation tasks. It leverages powerful models like Prometheus and GPT-4 to provide robust assessments. The tool supports multilingual meta-evaluation benchmarks, with recent iterations like M-Prometheus outperforming previous open LLM judges on multilingual meta-evaluation benchmarks such as MM-Eval and M-RewardBench. It also offers strong performance in English, surpassing Prometheus 2 7B and 8x7B on RewardBench. Prometheus-Eval facilitates both absolute grading, which assigns a score from 1 to 5, and relative grading, which compares two responses. It supports local inference via vllm and integration with LLM APIs through litellm, allowing users to utilize powerful evaluator LLMs like GPT-4.
Shift - Le Hackathon Gen AI
Shift - Le Hackathon Gen AI is a premier 48-hour hackathon based in Nantes, France, designed for designers, developers, and product enthusiasts passionate about generative AI. Participants are challenged to create innovative Gen AI products, with a special focus on hacking and customizing existing tools to develop new features. The event provides a comprehensive experience, including expert coaching, insightful conferences on Gen AI and product development, user testing rounds, and a vibrant, rock-and-roll atmosphere. It's an opportunity to collaborate, learn from industry leaders like Maxime Thoonsen and Pierre Renaudin, and rapidly prototype groundbreaking AI solutions. The hackathon aims to foster creativity and technical skill in a supportive and engaging environment, culminating in the development of impactful Gen AI products.
PhiFlow
PhiFlow is an open-source simulation toolkit designed for machine learning and optimization, primarily written in Python. It offers a differentiable PDE solving framework that seamlessly integrates with popular machine learning frameworks such as NumPy, PyTorch, Jax, and TensorFlow. This integration allows users to leverage automatic differentiation for building end-to-end differentiable functions that combine learning models with physics simulations. PhiFlow supports a wide range of applications, particularly in fluid dynamics, with features like built-in PDE operations, a flexible web interface for live visualizations, and object-oriented design for extensibility. It enables reusable simulation code across different backends and dimensionalities, making it a versatile tool for researchers and developers.
PyTorch-StudioGAN
PyTorch-StudioGAN is a Pytorch library designed for implementing and researching Generative Adversarial Networks (GANs) for both conditional and unconditional image generation. It serves as a unified playground for machine learning researchers to compare and analyze new GAN ideas, offering implementations of 7 GAN architectures, 9 conditioning methods, 4 adversarial losses, and various regularization and augmentation modules. The library also provides an unprecedented-scale benchmark for generative models, including results from GANs, auto-regressive models, and Diffusion models. It supports multiple acceleration methods like distributed data-parallel training and mixed-precision training, ensuring flexibility and reproducibility in GAN research.
getspecstory
getspecstory is a comprehensive tool designed to transform AI development conversations into a searchable and shareable knowledge base. It captures, indexes, and makes every interaction with AI coding assistants searchable across all projects and tools. The platform offers local-first extensions for popular AI IDEs like Cursor and Copilot, as well as CLI tools such as Claude Code, Cursor CLI, and Codex CLI. All sessions are automatically saved locally to a `.specstory/history/` folder. Users can optionally sync their conversations to the SpecStory Cloud, which provides a centralized knowledge system with full-text search across all projects. This helps solve the problem of lost context, lack of global search, and fragile sharing methods, creating a personal AI coding knowledge base for developers.
Python-ai-assistant
Python-ai-assistant, also known as Jarvis, is an open-source voice-commanding AI assistant built with Python 3.8. It offers a range of functionalities including speech recognition, text-to-speech interaction, and the execution of various commands. Users can interact with Jarvis via voice or text to perform tasks such as opening web pages, playing music, checking weather, setting alarms, and performing basic calculations. The assistant supports asynchronous command execution and allows for easy customization of voice commands and configurable assistant names. It also keeps a history of commands and learned skills in MongoDB, making it a versatile tool for personal automation.
CeLLife Technologies Ltd.
CeLLife Technologies Ltd. specializes in AI-powered diagnostics, measurement, and quality control for the battery industry. Its patented AI measurement technology, Electrical Fingerprint (EFP™), enables rapid analysis of battery cells, modules, and systems, performing diagnostics up to 900 times faster than traditional methods. This technology significantly reduces waste and costs while maximizing the potential of every battery throughout its lifecycle, from manufacturing to second life. CeLLife's solutions cater to industries such as manufacturing, Battery Energy Storage Systems (BESS), and recycling, helping businesses ensure 100% production quality, catch defects early, and improve traceability. The tool aims to build confidence in batteries, protect margins, and contribute to a world powered by sustainable energy by preventing premature degradation and failures.
pytorch-frame
PyTorch Frame is a modular deep learning framework built upon PyTorch, specifically designed for heterogeneous tabular data. It supports various column types including numerical, categorical, text, time, and images, enabling the creation of sophisticated neural network models. The library provides a flexible architecture for implementing existing and future deep learning methods, featuring state-of-the-art models, user-friendly mini-batch loaders, and benchmark datasets. It also facilitates integration with diverse model architectures, including Large Language Models, allowing users to encode text data with embeddings and train alongside other complex semantic types. PyTorch Frame aims to democratize deep learning research for tabular data, making it accessible for both novices and experts.
gpt-llama.cpp
gpt-llama.cpp acts as an API wrapper around llama.cpp, running a local API server that mimics OpenAI's GPT endpoints. This enables existing GPT-based applications to function with local llama-based models instead of relying on OpenAI's services. The primary goal is to provide a cost-free and private alternative for running AI models, making it a drop-in replacement for GPT-3.5 or GPT-4 applications. It supports interactive mode for speedy responses within chat contexts and is compatible with macOS, Windows, and Linux. Key features include automatic adoption of llama.cpp improvements and support for various GPT-powered applications like chatbot-ui and Langchain.
Clonable
Clonable is a comprehensive tool designed for cross-border website and webshop translation and cloning. It enables users to easily duplicate their existing websites, translate the content into various languages, and optimize them for specific regional search engines. The platform supports a wide range of website types, including e-commerce sites built on platforms like WordPress, WooCommerce, Magento, and Shopware. Clonable aims to save businesses significant time and money compared to traditional translation and development methods, allowing them to go live in new markets within a week. It also offers SEO optimization benefits by providing unique domains for cloned sites, improving local search rankings. The tool allows for customization of cloned sites, including changing keywords, logos, and specific translations, ensuring a localized experience for foreign customers.
TensorDock
TensorDock offers an affordable and easy-to-use cloud GPU infrastructure designed for machine learning, AI, rendering, and cloud gaming. It provides access to a global fleet of GPU servers, including high-end models like NVIDIA H100 and A100, as well as consumer GPUs like RTX 4090, at significantly lower costs than traditional cloud providers. The platform emphasizes on-demand access with no quotas or commitments, allowing users to deploy a server in just 30 seconds. TensorDock also provides CPU cloud services for scientific computing and HPC workloads, root access with KVM virtualization, and a robust API for server management. It caters to a wide range of needs, from individual researchers to AI startups, ensuring secure and reliable enterprise-grade hardware.
redis-inference-optimization
redis-inference-optimization is a Redis module designed for serving tensors and executing deep learning graphs. Previously known as RedisAI, this tool acts as a "workhorse" for model serving, offering support for popular Deep Learning and Machine Learning frameworks such as PyTorch, TensorFlow, TensorFlow Lite, and ONNXRuntime. It maximizes computation throughput and reduces latency by adhering to data locality principles, while simplifying the deployment and serving of graphs through Redis's robust infrastructure. Although the project is no longer actively maintained or supported, it provides a valuable reference for integrating AI inference capabilities directly within a Redis environment. Users are directed to the Redis website for current AI offerings.
Delta Bravo AI
Delta Bravo AI specializes in building agentic AI systems tailored for highly regulated industries, including water utilities, wastewater management, and environmental compliance. Their suite of products, Data Mentor, Aquaspec, and PermitPro, are designed to address critical bottlenecks in these sectors. Data Mentor acts as an AI data assistant, Aquaspec focuses on water treatment optimization, and PermitPro streamlines environmental permitting processes. By leveraging agentic AI, Delta Bravo aims to enhance operational efficiency, ensure regulatory adherence, and accelerate America's reindustrialization through intelligent automation of complex, regulated workflows. The company's solutions are trusted by industries requiring robust and auditable AI applications.
gpts-works
GPTs Works is an open-source, third-party store for custom GPTs, providing a centralized platform for users to discover and access a wide array of AI agents. The project is structured into three main components: a website for browsing, an index system for vector-based GPTs search, and a browser extension to integrate third-party GPTs alongside ChatGPT's Explore page. It leverages technologies like Vercel for deployment, Postgres for data storage, and Zilliz Cloud for vector storage and search, making it a robust solution for managing and exploring custom GPTs. Developers can easily set up and contribute to the platform, with clear instructions for local development and deployment.