Coding & Development

Browsing page 9 of AI tools for Testing & QA in Coding & Development. Sorted by confidence score — our independent quality rating.

DevoxxGenieIDEAPlugin

62%

DevoxxGenieIDEAPlugin is a comprehensive Java-based LLM Code Assistant plugin for IntelliJ IDEA, designed to enhance developer workflows. It seamlessly integrates with a wide array of local LLM providers such as Ollama, LMStudio, GPT4All, and Llama.cpp, as well as cloud-based LLMs including OpenAI, Anthropic, and Gemini. Key features include Security Scanning with Gitleaks, OpenGrep, and Trivy, which automatically creates prioritized tasks for findings. The plugin also supports Spec Driven Development, allowing users to define tasks in Backlog.md, browse them in a Spec Browser, and let the AI agent implement them autonomously. Additionally, it offers AI-powered inline code completion, ACP Runners for external agent communication, and CLI Runners for executing prompts via external tools, making it a versatile solution for AI-augmented programming.

SiteSnapshot.io

62%

SiteSnapshot.io offers automated visual health checks for business owners, agencies, and developers, ensuring website integrity beyond simple uptime monitoring. It renders your site in a real Chrome browser, capturing high-resolution screenshots to visually verify that your business is open and functioning correctly. The tool detects critical issues like broken layouts, missing elements, and blank screens that traditional ping monitors often miss. With features like Precision Diff, mobile-aware checks, and white-label reports, SiteSnapshot helps users justify retainer fees, manage multiple client sites, and proactively identify visual regressions before they impact users or sales. It's designed for a no-code setup, making visual monitoring accessible to non-developers.
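As a rough illustration of how a visual check can catch a blank screen that a ping monitor would miss, here is a minimal sketch of a pixel-level screenshot comparison. It is a generic technique, not SiteSnapshot's Precision Diff, whose internals are not described here. The function names, the grayscale grid representation, and the 5% threshold are all assumptions made for the example.

```python
from typing import List

Image = List[List[int]]  # hypothetical representation: rows of grayscale pixel values


def diff_ratio(baseline: Image, current: Image) -> float:
    """Return the fraction of pixels that differ between two same-sized images."""
    same_shape = len(baseline) == len(current) and all(
        len(row_b) == len(row_c) for row_b, row_c in zip(baseline, current)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")
    total = sum(len(row) for row in baseline)
    if total == 0:
        raise ValueError("images must contain at least one pixel")
    changed = sum(
        1
        for row_b, row_c in zip(baseline, current)
        for pixel_b, pixel_c in zip(row_b, row_c)
        if pixel_b != pixel_c
    )
    return changed / total


def is_regression(baseline: Image, current: Image, threshold: float = 0.05) -> bool:
    """Flag a visual regression when more than `threshold` of the pixels changed.

    The 5% default is an arbitrary value chosen for this example.
    """
    return diff_ratio(baseline, current) > threshold


# A tiny 4x4 "screenshot" with a striped layout, and a blank (all-white) render.
baseline_shot = [[0, 255, 0, 255] for _ in range(4)]
blank_shot = [[255, 255, 255, 255] for _ in range(4)]

print(diff_ratio(baseline_shot, blank_shot))
print(is_regression(baseline_shot, blank_shot))
```

A production tool would work on real screenshots and would likely tolerate small rendering differences such as anti-aliasing, which this exact-match comparison does not.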

Datagon AI

62%

Manex AI, powered by its AI-driven Manufacturing Optimization Agent Qualitatio, transforms industrial quality management by enabling autonomous factories. This platform optimizes production processes from development through production to the customer, addressing major challenges manufacturers face. Key benefits include finding up to 20% more defects, reducing test volumes by up to 70%, and saving up to 35% time on rework. Qualitatio also offers up to 90% faster problem focus detection across various process parameters. The system provides defect prediction, a comprehensive process overview with real-time insights, personalized analytics, and seamless integration with robotics for full-cycle process monitoring. Trusted by industry leaders like BMW, Audi, and Siemens, Manex AI is ISO 27001 & TISAX certified, ensuring stability and security for factories in over 10 countries.

Non finito

62%

Non finito is an AI tool designed for the comprehensive evaluation and comparison of multimodal machine learning models. It provides a structured environment for assessing model performance across diverse tasks, including entity tracking in language models, logical reasoning, and visual deductive reasoning. Users can create and manage custom evaluation sessions, input various prompts, and compare the outputs of different models side-by-side. The platform highlights examples such as RealWorldQA and counting cards, demonstrating its utility for detailed analysis of AI capabilities. Non finito aims to offer a robust solution for researchers and developers to benchmark and understand the strengths and weaknesses of various AI models.

Refact.ai

62%

Refact.ai is an open-source AI coding agent designed to enhance software development workflows. It acts as an autonomous AI agent, capable of planning, executing, and deploying coding tasks end-to-end, integrating with various tools like GitHub and databases. The platform offers context-aware chat, accurate autocompletions powered by Qwen2.5-Coder and RAG, and supports over 25 programming languages. A key differentiator is its on-premise deployment option, ensuring data privacy and control. Refact.ai learns and evolves with user interaction, adapting to specific workflows and codebases, making it a powerful and customizable solution for individual developers and enterprise teams alike.

Radium AI

62%

Radium AI provides a comprehensive monitoring and management solution for Robotic Process Automation (RPA) bots, leveraging AI to enhance operational efficiency. The platform aggregates bot-run details from various RPA instances, including UiPath, Automation Anywhere, and Blue Prism, into a central repository. Its powerful ML engine classifies bot errors, identifies root causes, and recommends appropriate actions, significantly reducing incident resolution times. Radium AI also features an in-built action library and a robust workflow engine for defining custom automated actions against bot failures. It integrates with ITSM systems like ServiceNow to auto-generate tickets, ensuring 24x7 digital worker support and providing a single pane of glass for unparalleled bot observability.

UVeye

62%

UVeye offers an AI-powered inspection platform that serves every step of the vehicle life cycle, from manufacturing and logistics to retail, warranty, and fleet management. Utilizing patented AI and 360° imaging, its systems scan the full exterior, undercarriage, and tires of vehicles in seconds. It detects issues like tire wear, misalignment, leaks, rust, underbody damage, cracks, and scratches, providing instant visual insights. The platform is trusted by industry leaders such as General Motors, Volvo, Toyota, and Amazon, and integrates seamlessly with existing systems to improve accuracy, simplify workflows, and create a unified, transparent experience. UVeye helps businesses achieve significant improvements in operational efficiency, risk reduction, and customer trust.

WFGY

62%

WFGY is an open-source AI Troubleshooting Atlas designed to help developers and engineers debug and optimize AI systems, particularly those involving RAG (Retrieval Augmented Generation) and AI agents. It provides a structured approach to identifying and resolving common AI workflow problems, featuring a '16-problem map' and a 'Global Debug Card'. The project has evolved through several versions, with WFGY 5.0 Avatar acting as a governed runtime for language and human-machine interaction, and Problem Map 3.0 offering a practical entry point for troubleshooting failing AI workflows. It emphasizes evaluation, governance, and reproducibility, offering tools like the Twin Atlas and Inverse Atlas for AI evaluation and problem reproduction.

Basin MCP

62%

Basin MCP is designed to enhance the reliability and accuracy of AI-generated code through comprehensive testing. This platform specifically targets the prevention of hallucinations, a common issue in AI code generation, by implementing robust testing methodologies. It ensures that the code produced by AI systems is dependable and performs as expected, reducing errors and improving overall software quality. By focusing on the integrity of AI-generated code, Basin MCP provides developers and QA professionals with a critical tool to maintain high standards in AI-driven development workflows. The platform's core objective is to deliver confidence in AI-powered coding solutions.

Tusk

62%

Tusk is an AI testing platform designed to help engineering teams prevent bugs and quality issues by generating automated tests and reviews. It leverages live traffic and business context to create test cases that catch real-world regressions, significantly boosting code coverage. Tusk supports API testing, unit testing, and integration testing, and features self-healing tests that automatically adapt to changes in business logic. The platform also offers PR review automation, regression detection, and code coverage analysis. It integrates with popular version control systems like GitHub and GitLab and supports various test frameworks such as Jest, Vitest, Mocha, RSpec, pytest, and JUnit. Tusk aims to accelerate the development cycle by enabling confident and faster shipping of code.

llm-security

62%

llm-security is a comprehensive resource and proof-of-concept repository dedicated to exploring novel vulnerabilities in application-integrated Large Language Models (LLMs). It specifically highlights the dangers of "indirect prompt injection," a new class of attack vectors that can lead to remote control of LLMs, data exfiltration, persistent compromises, and automated social engineering. The tool provides demonstrations across various LLMs, including GPT-4 and GPT-3, and shows how these attacks can affect code completion engines like Copilot. It serves as a critical resource for security researchers and developers to understand and mitigate significant roadblocks to the secure deployment of LLMs.

OpenCodeInterpreter

62%

OpenCodeInterpreter is a comprehensive suite of open-source code generation systems designed to significantly improve the capabilities of large language models (LLMs) in coding tasks. It achieves this by incorporating execution feedback and iterative refinement, allowing the LLM to dynamically adjust and improve generated code. The platform offers various models, including the OpenCodeInterpreter-DS, -CL, -GM, and -SC2 series, all open-sourced on Hugging Face. These models demonstrate enhanced performance on benchmarks like HumanEval and MBPP, particularly with the integration of execution feedback. The project also provides a local deployment demo, enabling users to generate and execute code, receive automated feedback, and engage in chat-based interactions for further refinement. It is supported by the Code-Feedback dataset, featuring 68K multi-turn interactions.
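The idea of execution feedback with iterative refinement can be sketched as a simple loop: generate code, run it against a check, and pass any error report back into the next generation attempt. The sketch below is an illustration under stated assumptions, not OpenCodeInterpreter's actual pipeline. `fake_model` is a hypothetical stand-in for an LLM call, and `run_candidate`, `refine_with_feedback`, and `check_add` are names invented for the example.

```python
import traceback
from typing import Callable, Optional, Tuple


def run_candidate(source: str, test: Callable[[dict], None]) -> Optional[str]:
    """Execute candidate code, then the test; return None on success or an error report."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # run the generated code in its own namespace
        test(namespace)          # the test raises if the behaviour is wrong
        return None
    except Exception:
        return traceback.format_exc(limit=1)


def refine_with_feedback(
    generate: Callable[[Optional[str]], str],
    test: Callable[[dict], None],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Ask the generator for code until the test passes or the rounds run out."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        source = generate(feedback)            # feedback is None on the first round
        feedback = run_candidate(source, test)
        if feedback is None:
            return source, round_no
    raise RuntimeError(f"no passing candidate after {max_rounds} rounds")


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical model: returns buggy code first, a fix once it sees an error."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def check_add(namespace: dict) -> None:
    assert namespace["add"](2, 3) == 5


final_source, rounds = refine_with_feedback(fake_model, check_add)
print(rounds)
print(final_source)
```

Calling `exec` on generated code is unsafe outside a sandbox; a real system would run candidates in an isolated process or container.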

chinese-llm-benchmark

62%

chinese-llm-benchmark, also known as ReLE (Really Reliable Live Evaluation for LLM), is a continuously updated platform for evaluating Chinese AI large language models. It currently covers 375 models, including commercial options like ChatGPT, Google Gemini, Claude, and Ernie, as well as open-source models such as Llama, GLM, and Mistral. The benchmark offers multi-dimensional capability assessments across 7 domains (education, healthcare, finance, law, reasoning, language, and agent/tool calling), with approximately 300 detailed sub-dimensions. Beyond providing rankings, it features a defect library containing over 2 million entries, facilitating research and improvement of large models. The platform also offers free evaluation services for private large models.

Citadel AI

62%

Citadel AI provides a unified platform for evaluating, monitoring, and governing AI systems across an organization. It helps engineering teams optimize AI quality, iterate rapidly, and deep-dive into model performance for LLM, vision, and tabular AI systems. The platform also minimizes AI risk by mitigating issues in AI safety, security, and compliance, offering features like out-of-the-box jailbreak testing for generative AI and ISO-compliant reports for predictive AI. Citadel AI aligns with internal policies and external standards, ensuring reliable and robust AI from experimentation to production. It offers products like Citadel Lens for AI system evaluation and LangCheck, an open-source library for multilingual LLM evaluation.

SuperCLUE

62%

SuperCLUE is a comprehensive benchmark designed for evaluating Chinese large language models (LLMs). It assesses LLMs across four primary capability quadrants: language understanding and generation, professional skills and knowledge, Agent intelligence, and safety, further breaking these down into 12 foundational abilities. The platform provides detailed rankings and reports, including a SuperCLUE total leaderboard, open multi-turn question leaderboard, and objective question leaderboard. It also features specific evaluations for AI Agent capabilities, focusing on tool use and task planning, which are critical for developing advanced AI assistants. SuperCLUE aims to provide a robust framework for researchers and developers to compare and improve Chinese LLMs.

Refiner

62%

Refiner is an AI-powered, open-source tool designed to assist developers with code refactoring and generation. It aims to enhance code quality and structure by providing intelligent suggestions and automated processes. The tool streamlines the refactoring workflow, allowing developers to maintain cleaner and more efficient codebases. Additionally, Refiner can generate new code snippets, accelerating development cycles and reducing manual coding efforts. Its open-source nature fosters community contributions and transparency, making it a flexible solution for various development environments.

Kritisi

62%

Kritisi is an AI-powered security audit explorer specifically designed for Solidity smart contracts. It provides a comprehensive solution for identifying vulnerabilities and potential security risks within blockchain applications. The tool supports multiple blockchains, offering developers and auditors a robust platform for in-depth security analysis. By leveraging artificial intelligence, Kritisi aims to streamline the auditing process, enhance the security posture of smart contracts, and mitigate risks associated with blockchain development. Its focus on Solidity contracts makes it a specialized tool for the Web3 ecosystem.

Automotive Visual Inspection AI

62%

Automotive Visual Inspection AI is a Chrome Extension designed to automate the visual inspection of vehicle damage and defects. This AI-powered tool analyzes images or videos of vehicles to detect and classify a wide range of visual imperfections, including scratches, dents, rust, paint issues, and missing parts. It aims to make inspections more accurate and efficient than manual review. Users simply upload visual data, and the AI generates a detailed report of identified issues. It is intended for vehicle inspections in insurance claims, pre-purchase assessments, and quality control in the automotive industry.

CodeWhizz

62%

CodeWhizz is an AI-powered platform designed to generate, debug, and tutor users in Python and JavaScript coding. It features an AI Autocoder capable of generating full scripts in seconds, an integrated CodeEngine for running and testing code directly within the browser, and a ScriptRepo to save and manage generated code. The platform aims to boost productivity, help users learn and improve programming skills, and debug existing code. With support for various packages like Matplotlib and Numpy, CodeWhizz provides a comprehensive environment for coders of all levels, from beginners to professionals, to streamline their development workflow.

Nova AI

62%

Nova AI is an AI agent designed to automate the generation and maintenance of end-to-end tests for QA teams. It uses generative AI to create tests based on real user behavior, so that coverage mirrors how users actually interact with an application. This approach helps eliminate redundant tests and reduces maintenance overhead by automatically updating test steps when UI elements or business logic change. Nova AI integrates with CI/CD pipelines, providing feedback directly in pull requests and supporting parallel execution across multiple environments. Its dashboards break down coverage by critical paths such as Checkout or Sign Up, allowing teams to focus their efforts effectively. The platform supports browser-based and API testing, and maintains custom Chrome images alongside support for other major browsers.

3LC.AI

62%

3LC.AI offers a comprehensive platform for AI data preparation and optimization, specifically designed for computer vision models. It illuminates the 'black box' of AI by combining labeling, debugging, and diagnosis capabilities. The tool integrates into existing model training processes with minimal code changes, supporting popular tools and frameworks such as Python, Hugging Face, Ultralytics YOLO, Detectron2, PyTorch, and Jupyter. Clients have reported significant improvements, including a 30x reduction in false positives, a 50% increase in true positive rates, and a 75% reduction in training time, leading to reduced costs and CO2 emissions. 3LC.AI also allows users to keep their data in its current location, supporting Azure, Amazon, Google Cloud, network storage, and local storage.

SQAI Suite

62%

SQAI Suite is an AI-native platform designed to revolutionize software testing by intelligently orchestrating testing activities, resources, and processes. It significantly accelerates the software development lifecycle by automating key QA tasks such as requirements analysis, test case generation, automation script creation, and test data management. The suite aims to transform quality from a bottleneck into a competitive advantage, enabling faster releases and budget savings. It integrates seamlessly with existing workflows, syncing documentation, code, and test plans in real-time, making it ideal for analysts, product owners, developers, testers, and leadership seeking predictable delivery and faster time-to-market.

フィーチャ株式会社／Ficha Inc.

62%

フィーチャ株式会社／Ficha Inc. specializes in AI technology, with a core focus on image recognition. They provide solutions for the automotive industry, including software development support for advanced driver-assistance systems (ADAS) and driver/occupant monitoring systems (DMS/OMS), which enhance safety and reduce accident risks by accurately perceiving vehicle surroundings and in-cabin conditions. Beyond mobility, Ficha Inc. offers DX-AI solutions that leverage generative AI and deep learning for high-precision recognition and analysis of unstructured data like documents and drawings. These solutions, such as Drawing-AI and AI-OCR, aim to improve business efficiency and support digital transformation by automating tasks typically reliant on human input, thereby reducing workload and increasing productivity. Their offerings are designed for flexible customization to address specific operational challenges.

Webo.AI

62%

Webo.AI is an AI-powered test automation platform designed for startups and fast-moving software teams. It leverages generative AI and patented AiHealing® technology to automatically create, execute, and self-maintain test cases, significantly reducing manual effort and improving software quality. The platform aims to reduce test time by 80%, cut production defects by 73%, and lower QA costs by 69%. Webo.AI supports AI-driven regression testing across both web and mobile applications and integrates with CI/CD pipelines. It offers a 14-day free trial, allowing users to generate AI-driven test cases and execute automation without requiring coding skills.
