Coding & Development
Browsing page 25 of AI tools for Testing & QA in Coding & Development. Sorted by confidence score — our independent quality rating.
gemma-3-270m
gemma-3-270m is an AI chatbot that leverages the Gemma 3 (270M) language model, running efficiently on Ollama with just a single-core CPU. This tool is designed for users who need to experiment with and deploy AI models even with limited computational resources. It supports both the google/gemma-3-270m and google/gemma-3-270m-it models, providing flexibility for different applications. Users can input text prompts and receive generated responses, with options to customize output parameters such as context length, temperature, and repetition penalty. The platform is hosted as a Hugging Face Space, making it accessible for testing and development.
Jules.Google
Jules is an autonomous coding agent designed to streamline development workflows by taking on tasks developers often don't want to do. It integrates directly with GitHub, allowing users to select repositories and branches, then provide detailed prompts for tasks such as bug fixing, version bumping, or feature building. Jules utilizes the latest Gemini 3 Pro model to develop plans, fetches repositories to a Cloud VM, and provides a diff of proposed changes for quick review and approval. Once approved, Jules creates a pull request, enabling developers to easily merge changes. This allows developers to focus on more complex or preferred coding tasks while Jules handles the routine or less desirable work.
screenclip.fast
screenclip.fast is a browser extension designed for retroactive screen capture, allowing users to instantly save the last 2 minutes of their screen activity. It runs silently in the background, continuously buffering your screen, so you never miss a moment worth sharing. The tool prioritizes privacy by keeping all recordings and processing local to your device, with no account required for core features. Clips are automatically saved as WebM files to your downloads folder, ready for immediate sharing. It offers features like speed control, frame-by-frame scrubbing, and is built to be lightweight, ensuring minimal impact on browser performance. The extension is currently available for Chrome, with Edge support in progress.
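The retroactive capture described above amounts to a rolling buffer that discards anything older than the window. The following is a minimal sketch of that idea only, not the extension's actual code: the `RollingBuffer` class, its method names, and the one-chunk-per-second simulation are all hypothetical.

```python
from collections import deque


class RollingBuffer:
    """Hypothetical helper: keep only chunks recorded within the last `window_seconds`."""

    def __init__(self, window_seconds=120.0):
        self.window_seconds = window_seconds
        self._chunks = deque()  # (timestamp, chunk) pairs, oldest first

    def push(self, timestamp, chunk):
        """Append a chunk and evict everything older than the window."""
        self._chunks.append((timestamp, chunk))
        cutoff = timestamp - self.window_seconds
        while self._chunks and self._chunks[0][0] < cutoff:
            self._chunks.popleft()

    def snapshot(self):
        """Return the buffered chunks, oldest first, e.g. to write out a clip."""
        return [chunk for _, chunk in self._chunks]


# Simulate five minutes of recording with one chunk per second (an assumption).
buffer = RollingBuffer(window_seconds=120.0)
for second in range(300):
    buffer.push(float(second), f"chunk-{second}")
clip = buffer.snapshot()  # only the final two minutes remain
```

Memory use stays bounded by the window length rather than by how long the recorder has been running, which is presumably what lets a tool like this buffer continuously in the background.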
Latta AI
Latta AI is an AI tool built to streamline debugging for developers. Its bug detection system lets users record sessions and track application interactions, providing a detailed view of code behavior. It works with popular Integrated Development Environments (IDEs) such as Visual Studio Code and the JetBrains IDEs, so it fits into existing developer workflows. Its core functionality is automatically generating actionable tasks for code adjustments, which reduces the manual effort and time typically spent on debugging.
arbigent
Arbigent is an AI agent testing framework designed for modern applications across Android, iOS, and web platforms. It addresses the limitations of traditional UI testing by using AI agents to break down complex tasks into smaller, manageable scenarios, improving predictability and scalability. The framework features an intuitive UI for non-programmers to design test scenarios and a code interface for developers to execute them programmatically. Arbigent supports cross-platform and device compatibility, including D-pad navigation for TV interfaces. It optimizes AI understanding through UI tree optimization and annotated screenshots, and offers cost savings as an open-source solution. Key features include robust reliability with stuck screen detection and image assertion, flexible customization via custom hooks and Maestro YAML integration, and support for Model Context Protocol (MCP) for external tool integration. It also allows app-provided AI hints for better screen comprehension.
can-ai-code
Can-Ai-Code is an open-source project designed to evaluate the coding capabilities of AI models. Initially created to determine if language models could generate syntactically valid code, it has evolved beyond simple pass/fail metrics. The tool now focuses on measuring AI's reasoning abilities through parametric difficulty scaling, exploring how models handle increasing complexity and working memory stress. It identifies different cognitive fingerprints across model families like OpenAI, Qwen, and Llama, assessing not just accuracy but also efficiency and constrained performance. The benchmark is designed to evolve, becoming harder as models improve, ensuring continuous discrimination power in an advancing field.
MLIP Playground
MLIP Playground is a Hugging Face Space designed for running, testing, and comparing over 17 state-of-the-art universal MLIPs (Machine Learning Interatomic Potentials). This web interface hosts Streamlit applications, enabling users to interact with them through a simple browser UI. Users can provide required inputs, such as text, numbers, or files, via the app’s widgets to evaluate and compare different models. The platform is ideal for developers and researchers who need to quickly assess the performance and characteristics of various MLIPs without complex setup, offering a streamlined environment for model experimentation and validation.
Leaderboard
Leaderboard serves as a robust and comprehensive benchmarking platform specifically designed for Automatic Speech Recognition (ASR). It addresses the critical need for measurable performance in ASR systems by offering three core components: a TestSet Zoo, a Model Zoo, and a Benchmarking Pipeline. The TestSet Zoo includes a wide range of academic and SpeechIO-curated datasets covering various speech recognition tasks and scenarios in both English and Chinese. The Model Zoo comprises a collection of commercial APIs and open-source models for comparison. The platform provides a simple and well-specified pipeline for data preparation, recognition, post-processing, and error rate evaluation, enabling researchers and developers to easily benchmark, reproduce, and examine ASR systems.
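The error rate evaluation step at the end of such a pipeline is commonly a word error rate (WER): the word-level edit distance between reference and hypothesis, divided by the reference length. Here is a minimal sketch, assuming whitespace tokenization and no text normalization; the function names are illustrative and not taken from the platform.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref_tokens = reference.split()
    hyp_tokens = hypothesis.split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref_tokens, hyp_tokens) / len(ref_tokens)


wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

For Chinese test sets a character-level rate is the more usual choice; the same `edit_distance` function works on `list(reference)` and `list(hypothesis)`.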
pytorch-grad-cam
pytorch-grad-cam is an advanced AI explainability package for computer vision, built on PyTorch. It offers a comprehensive collection of Pixel Attribution methods, including GradCAM, HiResCAM, ScoreCAM, and many others, to help diagnose model predictions and understand their decision-making process. The tool supports a wide range of architectures, from common CNNs to Vision Transformers, and can be applied to advanced use cases such as classification, object detection, semantic segmentation, and embedding-similarity. It includes smoothing methods like `aug_smooth` and `eigen_smooth` to produce clearer CAMs, and boasts high performance with full support for batches of images. Additionally, pytorch-grad-cam provides metrics for evaluating the trustworthiness and performance of explanations, making it valuable for both model development and research into new explainability methods.
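To make the basic Grad-CAM idea concrete, here is a plain-Python sketch of the core computation: each channel's weight is the spatial mean of its gradients, and the map is the ReLU of the weighted sum of activations. This illustrates the technique on nested lists and is not the package's implementation, which also handles hooks, batching, resizing, and normalization; the `grad_cam` function and its inputs are hypothetical.

```python
def grad_cam(activations, gradients):
    """Both inputs are [channels][height][width] nested lists (an assumption)."""
    height = len(activations[0])
    width = len(activations[0][0])
    # One weight per channel: the spatial mean of that channel's gradients.
    weights = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]
    cam = [[0.0] * width for _ in range(height)]
    for weight, channel in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * channel[i][j]
    # ReLU keeps only the regions that push the target score up.
    return [[max(0.0, value) for value in row] for row in cam]


activations = [[[1, 2], [3, 4]], [[1, 1], [1, 1]]]
gradients = [[[1, 1], [1, 1]], [[-2, -2], [-2, -2]]]
heatmap = grad_cam(activations, gradients)
```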
Transformer-SSL
Transformer-SSL is an open-source project offering the official implementation of "Self-Supervised Learning with Swin Transformers." The codebase includes Swin Transformer as one of its backbones, which makes it possible to evaluate how well the learned representations transfer to downstream tasks such as object detection and semantic segmentation. It features MoBY, a self-supervised learning approach that combines MoCo v2 and BYOL and achieves high accuracy on ImageNet-1K linear evaluation with significantly fewer tricks than previous works. The project provides models and code for self-supervised learning and linear evaluation.
timm Attention Visualization
timm Attention Visualization is an AI tool designed to help users understand how deep learning models, specifically those from the timm (PyTorch Image Models) library, process visual information. By uploading an image and selecting a timm model, users can generate detailed attention maps and rollout visualizations. These visualizations highlight the specific parts of an image that the model focuses on when making predictions, offering insights into its decision-making process. This tool is invaluable for researchers, developers, and data scientists working with computer vision models, aiding in debugging, improving model interpretability, and enhancing overall model performance. It is hosted on Hugging Face Spaces, making it easily accessible for experimentation.
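Assuming "rollout" here refers to the common attention rollout technique, the idea is to mix each layer's attention with the identity (to account for residual connections), renormalize the rows, and multiply the layers together. The sketch below works on plain nested lists and assumes attention has already been averaged over heads; it is an illustration, not the Space's code, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def attention_rollout(layers):
    """layers: list of [tokens][tokens] attention matrices, first layer first."""
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Add the identity for the residual path, then renormalise each row.
        mixed = [[0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
                 for i in range(n)]
        mixed = [[value / sum(row) for value in row] for row in mixed]
        rollout = matmul(mixed, rollout)
    return rollout


single_layer = attention_rollout([[[1.0, 0.0], [0.5, 0.5]]])
two_layers = attention_rollout([[[1.0, 0.0], [0.5, 0.5]],
                                [[0.5, 0.5], [0.0, 1.0]]])
```

Each row of the result still sums to 1, so a row can be read as how much each input token contributes to the corresponding output token.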
xplique
Xplique is a comprehensive Python toolkit designed to bring clarity to complex neural network models through state-of-the-art Explainable AI (XAI) techniques. Originally developed for TensorFlow models, it also offers partial compatibility with PyTorch. The library features modules for Attribution Methods, allowing users to compute explanations like Grad-CAM and Integrated Gradients across various tasks such as classification, regression, object detection, and semantic segmentation. It also includes Feature Visualization to understand how networks build their understanding, Concept Extraction to identify human concepts, and Metrics to evaluate the faithfulness and robustness of explanations. Xplique supports diverse data types including images, time series, and tabular data, making it a versatile tool for AI model analysis and debugging.
🐍💨 Data Contamination Database
The 🐍💨 Data Contamination Database is a Hugging Face Space designed to help users identify and manage data contamination within datasets and models. The application filters and displays contamination-related data: users can enter particular evaluation datasets and contaminated sources, then select options to exclude or analyze the matching entries. It is aimed at AI researchers and data scientists who need to check the integrity and reliability of their data before trusting evaluation results.
Synthetic Society
Synthetic Society is building synthetic users to automate end-to-end QA and UX testing, enabling rapid bug detection and superior user experiences. The platform integrates AI-driven user simulations directly into the development loop, allowing teams to ship products that are already tested and refined. Key features include real-time analytics to catch UX issues during development, AI-driven growth to close the developer feedback loop and eliminate manual testing, and precision user feedback from realistic synthetic user testing. It offers smart simulations where agents behave like real users to uncover bugs and design flaws, auto-generation of key user flows, and a friction finder to pinpoint where users get stuck. The tool provides full visibility into every step, click, and hesitation in the user journey.
IIIF Illustration Detector
The IIIF Illustration Detector is an AI-powered tool hosted on Hugging Face that helps users identify illustrated pages within digitized historical books. By simply entering a IIIF manifest URL or selecting a sample, the application scans every page directly in the user's browser. It leverages a small AI model to detect and categorize pages containing illustrations, photographs, maps, or diagrams. This tool is particularly useful for researchers, historians, and digital humanities professionals who need to quickly pinpoint visual content within large collections of digitized texts, streamlining the process of content discovery and analysis.
Metrics
Metrics is an open-source toolbox offering implementations of various supervised machine learning evaluation metrics across multiple programming languages. Developers and researchers can utilize this tool to assess model performance in Python, R, Haskell, and MATLAB/Octave environments. It includes a wide array of metrics such as Absolute Error, Area Under the ROC, F1 Score, Log Loss, Mean Absolute Error, Mean Squared Error, and Root Mean Squared Error. The project is currently in a beta release, focusing on ensuring compatibility and functionality across its supported language repositories. It aims to provide a comprehensive suite for evaluating machine learning models.
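For reference, several of the listed metrics fit in a few lines each. The sketch below uses the standard textbook definitions; the function names and the `eps` clipping value are illustrative assumptions, not the toolbox's own API.

```python
import math


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    return math.sqrt(mean_squared_error(actual, predicted))


def log_loss(actual, predicted, eps=1e-15):
    """Binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for a, p in zip(actual, predicted):
        p = min(max(p, eps), 1 - eps)
        total += a * math.log(p) + (1 - a) * math.log(1 - p)
    return -total / len(actual)


mae = mean_absolute_error([1, 2, 3], [1, 2, 5])
rmse = root_mean_squared_error([1, 2, 3], [1, 2, 5])
loss = log_loss([1, 0], [0.5, 0.5])
```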
mmrazor
mmrazor is a model compression toolkit and benchmark developed as part of the OpenMMLab project. It covers four mainstream techniques: Neural Architecture Search (NAS), Pruning, Knowledge Distillation (KD), and Quantization. Designed for flexibility and compatibility, mmrazor integrates with other OpenMMLab projects and allows plug-and-play incorporation of different algorithms. Its modular design lets developers implement new model compression algorithms with minimal code or by modifying configuration files. The toolbox supports a range of algorithms in each category, including DARTS, DetNAS, and SPOS for NAS; AutoSlim, L1-norm, Group Fisher, and DMCP for pruning; and methods such as CWD, WSLD, and ABLoss for KD. For quantization it includes PTQ, QAT, and LSQ.
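As an illustration of the simplest pruning criterion named above, L1-norm pruning ranks a layer's filters by the sum of their absolute weights and keeps the largest ones. This is a toy sketch on flat Python lists, not mmrazor's API; `select_filters_to_keep` and `keep_ratio` are hypothetical names.

```python
def l1_norm(filter_weights):
    """Sum of absolute values over a flat list of one filter's weights."""
    return sum(abs(w) for w in filter_weights)


def select_filters_to_keep(filters, keep_ratio):
    """Return indices of the filters with the largest L1 norms, in original order."""
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]),
                    reverse=True)
    return sorted(ranked[:n_keep])


# Four toy filters with L1 norms 0.2, 3.0, 0.05, and 1.0.
filters = [[0.1, -0.1], [1.0, -2.0], [0.0, 0.05], [0.5, 0.5]]
kept = select_filters_to_keep(filters, keep_ratio=0.5)
```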
Open Persian ASR Leaderboard
The Open Persian ASR Leaderboard is a platform designed for evaluating and ranking Automatic Speech Recognition (ASR) models specifically for the Persian language. It enables users to submit their own ASR models by providing the model name in the format "user_name/model_name" and have them assessed against a standardized benchmark. This tool facilitates comparison of different models, helping researchers and developers identify top-performing ASR systems for Persian. The leaderboard provides a transparent and accessible way to track advancements and performance metrics in Persian ASR, fostering competition and innovation within the field.
Ministry of Testing
Ministry of Testing, also known as MoTaverse, is a leading community for software testers, QA, and quality engineers, boasting over 100,000 professionals. For over 15 years, it has served as a central hub for career development through various offerings. Members can access a wide range of resources including online courses, certifications like the MoT Software Quality Engineering Certificate, and insights from industry experts. The platform hosts numerous events, both online and in-person, such as MoTaCon, local chapter meetups, and workshops covering topics from AI in QA to API testing. It fosters a vibrant community where professionals can connect, share knowledge, and contribute to the collective growth of the software testing field.
ROC
ROC is a leading Vision AI platform that provides advanced multimodal biometric capabilities, including top-ranked face, fingerprint, and iris recognition. The platform also excels in object detection, gun detection, license plate recognition, and tattoo recognition. Trusted by the U.S. military, law enforcement, and global FinTech brands, ROC offers a unified platform that fuses biometrics, video analytics, and mission intelligence. Its solutions are designed to be lightweight, optimized for mobile and edge deployment, and built on ethically sourced datasets to ensure high fidelity and near-zero bias. ROC's products include the ROC SDK for integrating biometric and Vision AI capabilities, ROC Watch for 24/7 proactive AI monitoring and threat detection, and ROC ABIS for supercharging biometric identification.
JDoodle
JDoodle is a comprehensive online compiler, editor, and Integrated Development Environment (IDE) designed to support over 110 programming languages. It offers a convenient platform for developers and students to write, compile, and execute code directly in their web browser without needing local installations. Users can run programs on the fly, save their code for future reference, and easily share their projects with others. The platform aims to provide a quick and easy way to compile and run programs online, making it an accessible tool for learning, testing, and collaborating on various coding tasks. It supports popular languages like Java, C, C++, PHP, Perl, Python, and Ruby, among many others.
Screen Url
Screen Url offers a simple REST API for developers to capture website screenshots quickly and efficiently. With a single API call, users can generate pixel-perfect images of any URL, making it ideal for social media previews, automated testing, website monitoring, documentation, and content aggregation. The service boasts lightning-fast screenshot rendering, typically under 2 seconds, and guarantees 99.9% uptime. It supports full-page capture, custom viewport dimensions up to 4K resolution, and allows for delays to ensure JavaScript rendering. Users can choose between PNG and JPEG formats, and the API also supports PDF export. A free tier is available, offering 100 screenshots per month without requiring a credit card.
awesome-computer-vision-models
awesome-computer-vision-models is a curated list of popular deep learning models for computer vision tasks. This open-source repository is a reference for researchers and engineers, covering classification, segmentation, and detection models. Each entry includes evaluation figures such as the number of parameters, FLOPs, and error or accuracy metrics (e.g., Top-1 Error, Top-5 Error, mIoU), along with the publication year. The repository helps users quickly compare models by performance and resource requirements when choosing one for a project.
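Top-1 and Top-5 Error are both instances of top-k error: the fraction of samples whose true label is not among the k highest-scoring classes. A minimal sketch with made-up scores follows; the function name and data are illustrative.

```python
def top_k_error(scores, labels, k=1):
    """Fraction of samples whose true label is not among the k highest scores."""
    misses = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            misses += 1
    return misses / len(labels)


scores = [
    [0.1, 0.7, 0.2],  # true class 1 is ranked first
    [0.5, 0.3, 0.2],  # true class 1 is ranked second
    [0.6, 0.3, 0.1],  # true class 2 is ranked third
]
labels = [1, 1, 2]
top1 = top_k_error(scores, labels, k=1)
top2 = top_k_error(scores, labels, k=2)
```

Top-k error can only stay the same or fall as k grows, which is why Top-5 figures in such tables are always at or below the Top-1 figures.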
godot-mcp
godot-mcp is an open-source Model Context Protocol (MCP) server specifically designed for seamless interaction with the Godot game engine. This tool empowers AI agents to directly control and monitor Godot projects, facilitating a robust feedback loop for development and debugging. Key functionalities include launching the Godot editor, executing projects in debug mode, capturing console output and error messages, and programmatically controlling project execution. It also offers features for retrieving Godot versions, listing projects, analyzing project structures, and managing scenes by creating new ones, adding nodes, loading sprites, and exporting 3D scenes. For Godot 4.4+, it supports UID management, allowing for the retrieval and updating of file UIDs.