Coding & Development
ollama-grid-search
ollama-grid-search is a multi-platform desktop application designed to evaluate and compare Large Language Models (LLMs). Written in Rust and React, it automates the process of selecting optimal models, prompts, or inference parameters for a given use case. Users can iterate over various combinations and visually inspect the results, making it an invaluable tool for prompt engineering and model selection. The application assumes Ollama is installed and serving endpoints, either locally or on a remote server. Key features include automatic fetching of models from Ollama servers, A/B testing of prompts, a fully functional prompt database, and the ability to list, inspect, and re-run past experiments.
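The kind of sweep described above can be pictured as a Cartesian product over models, prompts, and inference parameters. The following is only a minimal sketch of that idea, not ollama-grid-search's own code; the `build_grid` helper, the model names, and the prompt templates are hypothetical placeholders.

```python
from itertools import product


def build_grid(models, prompts, temperatures):
    """Return one experiment dict per (model, prompt, temperature) combination."""
    return [
        {"model": m, "prompt": p, "temperature": t}
        for m, p, t in product(models, prompts, temperatures)
    ]


# Hypothetical values, for illustration only.
grid = build_grid(
    models=["model-a", "model-b"],
    prompts=["Summarize: {text}", "TL;DR: {text}"],
    temperatures=[0.2, 0.8],
)

# 2 models x 2 prompts x 2 temperatures = 8 combinations to run and compare.
print(len(grid))
```

Each dict in `grid` would correspond to one request sent to a model server, with the responses collected side by side for inspection.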
PaddleViT
PaddleViT, or PPViT, is an open-source collection of state-of-the-art Visual Transformer and MLP Models specifically designed for PaddlePaddle 2.0+. It goes beyond traditional convolutional neural networks by offering a wide array of vision models based on Visual Transformers, Visual Attentions, and MLPs. The tool integrates popular layers, utilities, optimizers, schedulers, data augmentations, and training/validation scripts to facilitate the reproduction of cutting-edge ViT and MLP models. PaddleViT supports multiple vision tasks including image classification, object detection, semantic segmentation, and GANs, with each model architecture defined in a standalone Python module for easy modification and research. It also provides pretrained weights for fine-tuning on custom datasets and includes tools for customized datasets, data preprocessing, performance metrics, and DDP for high-performance training.
parameter_efficient_instruction_tuning
parameter_efficient_instruction_tuning is an open-source repository dedicated to the systematic comparison of various parameter-efficient fine-tuning (PEFT) methods for instruction tuning tasks. The project utilizes the SuperNI dataset as its primary benchmark for training and evaluation. Implementations of PEFT methods are adapted from well-known libraries such as adapter-transformers and peft. The repository includes bash scripts for running experiments, optimized for the hfai HPC platform, supporting features like experiment configuration, checkpoint management, and training state validation. It also addresses platform-specific considerations like PyTorch and CUDA compatibility, making it a valuable resource for researchers and developers working on efficient large language model fine-tuning.
Point-BERT
Point-BERT is a PyTorch implementation of a pre-training paradigm for 3D point cloud Transformers, introduced at CVPR 2022. Inspired by BERT, it uses a Masked Point Modeling (MPM) task: point clouds are divided into local patches, and a discrete Variational AutoEncoder (dVAE) tokenizes these patches. The pre-training objective is to recover the original point tokens at masked locations, supervised by the dVAE's output. The method supports downstream tasks such as classification on ModelNet40 and ScanObjectNN, few-shot learning, and part segmentation on ShapeNetPart. It is aimed at researchers and engineers working with 3D point cloud analysis.
pipelines
Kubeflow Pipelines is a core component of the Kubeflow platform, designed to simplify and scale machine learning (ML) workflows on Kubernetes. It provides end-to-end orchestration capabilities, making it easier to build, deploy, and manage complex ML pipelines. The service focuses on enabling easy experimentation, allowing users to quickly iterate on ideas and manage various trials. Furthermore, it promotes re-use of components and pipelines, accelerating the development of ML solutions without constant rebuilding. Kubeflow Pipelines leverages Argo Workflows for orchestrating Kubernetes resources and offers a Python SDK for defining pipelines, along with comprehensive API documentation.
rome
ROME (Rank-One Model Editing) is an open-source tool designed for researchers and developers to precisely locate and modify factual associations within large language models, specifically GPT-2 XL and GPT-J. This GPU-only implementation allows for targeted editing of model knowledge without extensive retraining. It provides functionalities for causal tracing to understand model behavior and a straightforward API for specifying rewrite requests. The repository includes evaluation suites for benchmarking editing methods against CounterFact, making it a valuable resource for advancing research in model interpretability and editability. Users can also integrate new editing methods for comparative analysis.
Causal Foundry
Causal Foundry offers Kenkai, an adaptive AI platform designed for real-time personalization, optimization, and scalable decision-making. Built on ClickHouse, Kenkai streams and queries high-resolution data instantly, enabling enterprise-scale interventions. It leverages reinforcement learning and contextual bandits to continuously optimize engagement strategies through experimentation and adaptation. The platform also includes embedded metrics and analytics, allowing users to define governed metrics once and explore them everywhere, integrating live dashboards directly into existing systems without black boxes. Causal Foundry aims to democratize reinforcement learning for organizations worldwide, adapting to individual preferences, environments, and behaviors.
SalesGPT
SalesGPT is an open-source AI Sales Agent designed to automate sales outreach with context-aware capabilities. It can understand various stages of a sales conversation, from introduction to closing, and act accordingly. The tool integrates with pre-defined product knowledge bases to significantly reduce AI hallucinations and can connect to any data system via Mindware. Key features include automated email communication, Calendly meeting scheduling, and the ability to generate Stripe payment links for closing sales. SalesGPT supports various LLMs through LiteLLM and is optimized for low-latency voice conversations, boasting sub-1-second response times. It also offers enterprise-grade security and human-in-the-loop supervision.
segmentation_models.pytorch
segmentation_models.pytorch is an open-source Python library for semantic image segmentation with PyTorch. It provides a high-level API that allows users to create neural networks with minimal code, supporting 12 encoder-decoder model architectures such as Unet, Unet++, Segformer, and DPT. The library offers over 800 pretrained convolutional and transformer-based encoders, including timm support, which helps achieve faster and more stable convergence during training. It also includes popular metrics and losses for training routines, such as Dice and Jaccard, and is compatible with ONNX export and torch script/trace/compile. This makes it a versatile tool for researchers and practitioners in computer vision.
seldon-server
Seldon-server is an open-source machine learning platform designed to help data science teams deploy models into production within a Kubernetes cluster. While this specific project is archived and superseded by Seldon Core, it laid the groundwork for serving a wide range of ML models, including those built with TensorFlow, Keras, Vowpal Wabbit, XGBoost, and Gensim. It features an API with Predict and Recommend endpoints for supervised machine learning models and high-performance recommendation engines, respectively. Other capabilities include dynamic algorithm configuration for A/B and Multivariate tests, a Command Line Interface (CLI), secure OAuth 2.0 REST and gRPC APIs, and a Grafana dashboard for real-time analytics. Seldon-server supports deployment on-premise or in the cloud (e.g., GCP, AWS, Azure).
Self-Driving Delivery Agent
Self-Driving Delivery Agent, also known as DriVLMe, is an open-source project providing the official implementation of the IROS 2024 paper: "Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experience." This tool is designed for researchers and developers working on autonomous driving systems, particularly those interested in integrating large language models (LLMs) with real-world driving experiences. It offers a framework for setting up a conda environment, preparing LLaVA weights, and training/finetuning models on datasets like bddx and SDN. The project includes scripts for pretraining, finetuning, and evaluating autonomous driving agents, making it a valuable resource for advancing the field of AI-driven autonomous vehicles.
Focoos AI
Focoos AI reshapes computer vision by offering ultra-efficient models designed to reduce costs, automate hardware integration, and ensure peak performance across various devices. The platform allows ML Engineers to train, deploy, and iterate models faster than ever, supporting both cloud and edge environments. Its models are engineered for speed, delivering up to 10x faster inference and being 4x lighter in compute and memory compared to mainstream alternatives. Focoos AI provides pre-trained, production-ready models that can be instantly deployed and easily fine-tuned. It features an all-in-one platform for managing, comparing, monitoring, and deploying models, alongside an open-source library for community collaboration and local use. The tool emphasizes security, control, and sustainability, making it suitable for applications in manufacturing, smart cities, and autonomous systems.
TensorLayer
TensorLayer is a powerful, open-source deep learning and reinforcement learning library built for scientists and engineers. It offers an extensive collection of customizable neural layers, enabling rapid development of advanced AI models. Inspired by PyTorch, TensorLayer provides transparent and flexible APIs, making it easier to build and train complex AI models compared to other TensorFlow wrappers. It supports multiple backends including TensorFlow, PyTorch, MindSpore, PaddlePaddle, OneFlow, and Jittor, allowing deployment on various hardware like Nvidia-GPU and Huawei-Ascend. The library is recognized for its simplicity, flexibility, and high performance, with comprehensive documentation and a large community.
tiefvision
tiefvision is an integrated end-to-end image-based search engine powered by deep learning. It offers comprehensive functionalities including image classification, image location (based on OverFeat), and image similarity (based on Deep Ranking). The system is built using Torch for its deep learning modules and the Play Framework (Scala version) for its tooling modules. It currently supports Linux operating systems with CUDA-enabled GPUs, indicating a focus on performance-intensive image processing tasks. Beyond its core deep learning capabilities, tiefvision also provides a suite of web tools designed to streamline dataset generation and enhance productivity, such as visual database editors and automated dataset generation for training and testing.
Top2Vec
Top2Vec is an open-source Python library designed for advanced topic modeling and semantic search. It automatically detects topics within text data and generates jointly embedded topic, document, and word vectors. The library offers a 'classic' version for general topic modeling and a newer 'contextual' version that leverages contextual token embeddings to identify multiple topics per document and even detect topic segments within documents. This contextual approach provides a more nuanced understanding of complex texts. Key features include automatic topic number detection, hierarchical topic generation, keyword-based topic search, and document search by topic or keywords. Top2Vec eliminates the need for stop word lists, stemming, or lemmatization, and works effectively on short texts. It also supports various embedding models like Doc2Vec, Universal Sentence Encoder, and BERT Sentence Transformer for flexible deployment.
torch-audiomentations
torch-audiomentations is a PyTorch library for efficient audio data augmentation in deep learning applications. It prioritizes speed by supporting both CPU and GPU (CUDA) processing, making it suitable for large-scale model training. The library handles batches of multichannel or mono audio, and its transforms extend `nn.Module`, allowing direct integration into PyTorch neural network models. Most transforms are differentiable, offering flexibility for advanced use cases. Augmentations can be applied in three modes (per_batch, per_example, and per_channel). The library is released under the permissive MIT license and is cross-platform. It includes a variety of waveform transforms such as Gain, PolarityInversion, AddBackgroundNoise, PitchShift, and various filters, and the project aims for high test coverage and continuous development.
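The three modes can be read as three granularities at which random augmentation parameters are drawn. The sketch below illustrates that reading in plain Python and does not use the torch-audiomentations API; the `sample_gains` helper and the gain range are hypothetical, and the interpretation of each mode is an assumption based on its name.

```python
import random


def sample_gains(batch_size, num_channels, mode, rng):
    """Return a batch_size x num_channels table of gains (dB) drawn per `mode`."""

    def draw():
        return rng.uniform(-6.0, 6.0)  # hypothetical gain range in dB

    if mode == "per_batch":
        # One draw shared by every example and channel in the batch.
        gain = draw()
        return [[gain] * num_channels for _ in range(batch_size)]
    if mode == "per_example":
        # One draw per example, shared across that example's channels.
        return [[gain] * num_channels for gain in (draw() for _ in range(batch_size))]
    if mode == "per_channel":
        # An independent draw for every channel of every example.
        return [[draw() for _ in range(num_channels)] for _ in range(batch_size)]
    raise ValueError(f"unknown mode: {mode}")


rng = random.Random(0)
distinct_counts = {}
for mode in ("per_batch", "per_example", "per_channel"):
    gains = sample_gains(batch_size=2, num_channels=2, mode=mode, rng=rng)
    # Count how many different gain values were drawn for this mode.
    distinct_counts[mode] = len({g for row in gains for g in row})

print(distinct_counts)
```

For a batch of two stereo examples, the number of distinct draws grows from one (per_batch) to one per example to one per channel.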
verl-tool
Verl-Tool is a comprehensive framework designed for training AI agents that can effectively use diverse tools. It offers a unified and easy-to-extend architecture, leveraging verl as a submodule to benefit from ongoing updates. Key features include a complete decoupling of actor rollout and environment interaction, a "tool-as-environment" paradigm where each tool interaction can modify and reload environment states, and native RL framework support for multi-turn interactive loops. The platform also provides a user-friendly evaluation suite, allowing users to launch trained models with OpenAI API alongside a tool server for seamless interaction and output generation. It supports the latest verl (0.6.0) and vllm (0.11.0) versions, ensuring modularity and maintainability.
ui-ux-pro-max-skill
UI UX Pro Max is a powerful AI skill designed to provide comprehensive design intelligence for building professional UI/UX across various platforms and frameworks. Its flagship feature is the Design System Generator, an AI-powered reasoning engine that analyzes project requirements and instantly creates a complete, tailored design system. This includes recommendations for patterns, styles, colors, typography, and key effects, along with anti-patterns to avoid and a pre-delivery checklist. The tool incorporates 161 industry-specific reasoning rules, 67 UI styles, 161 color palettes, 57 font pairings, and 99 UX guidelines, making it a robust solution for designers and developers seeking to streamline their UI/UX workflow and ensure design consistency.
VLA-Adapter
VLA-Adapter is an open-source implementation offering an effective paradigm for tiny-scale Vision-Language-Action (VLA) models. It provides a robust framework for training and deploying VLA models, particularly for robotic control and real-world system integration. The tool supports various GPU configurations, from extremely limited VRAM (10-12GB) to professional-grade GPUs (80GB+), making it accessible for diverse research and development environments. Key features include support for LIBERO and CALVIN benchmarks, an enhanced Pro version for improved performance, and compatibility with various foundation models and real-world robotic systems like ALOHA and Franka. It also offers detailed guidance on data preparation and training configurations.
yolo-9000
YOLO9000 is an open-source project providing a real-time object detection system capable of detecting 9000 object classes. The project is available on GitHub and is aimed at researchers and developers in the computer vision field. It offers instructions for setting up and running the detection system on Ubuntu/Linux/Mac OS and Windows. Users can configure it for CPU or GPU support, with GPU acceleration significantly improving inference speed. The repository also includes guidance on making videos from detected objects and converting weights to Keras.
Watcher
Watcher is an open-source AI-powered cyber threat intelligence and hunting platform developed with Django and React JS. It empowers security operations with comprehensive threat detection and monitoring capabilities, including AI-driven threat intelligence that transforms raw data into actionable insights with automated weekly digests and real-time breaking news alerts. The platform also features emerging threat detection via RSS feeds, legitimate domain management, information leak monitoring across various platforms, and malicious domain surveillance with automatic RDAP/WHOIS checks. Watcher can be deployed on web servers or quickly run via Docker, and integrates with tools like TheHive and MISP for collaborative threat intelligence sharing.
BigCode - Playground
BigCode - Playground is an AI tool for code experimentation and model testing, hosted on Hugging Face Spaces. It serves as a platform for developers and AI enthusiasts to interact with and test various code models. The live Space currently reports a runtime error, so it may not be operational; its intended purpose is to provide a space for exploring and validating code-related AI functionality. The tool is part of the BigCode initiative, which aims to foster community engagement in the development and application of large language models for code.
53AIHub
53AI Hub is an open-source AI portal designed to help developers and enterprises quickly build and operate production-grade AI agents, prompts, and tools. It offers seamless integration with popular development platforms such as Coze, Dify, FastGPT, RAGFlow, and 53AI Studio, as well as cloud platforms like Aliyun, Tencent Cloud, and Baidu Cloud. The platform simplifies the creation of AI portals, even for users without extensive technical backgrounds, significantly lowering the barrier to AI implementation. Key features include platform integration, comprehensive application management for AI assets, user operations management, and independent deployment options for both cloud and local environments.
flops-counter.pytorch
flops-counter.pytorch is an open-source tool that calculates the theoretical number of multiply-add operations (MACs, often loosely reported as FLOPs) and parameters in neural networks built with PyTorch. It offers two backends: 'pytorch' for legacy nn.Modules with better per-layer analytics for CNNs, and 'aten' for broader coverage of model architectures, including transformers, by considering aten operations. The tool can also print per-layer computational costs and allows for ignoring specific modules during counting. It supports various layers like Conv1d/2d/3d, BatchNorm, Activations, Linear, Upsample, and Poolings, with experimental support for RNNs, LSTMs, GRUs, and MultiheadAttention. Users can customize input tensors for complex models and view verbose output for unconsidered operations.
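To make "theoretical multiply-add count" concrete, the arithmetic for two common layer types can be written out by hand. This is a minimal sketch using the standard textbook formulas, not the tool's implementation; the `conv2d_macs` and `linear_macs` helpers and the layer sizes are hypothetical, and bias terms are ignored.

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out, groups=1):
    """Multiply-adds for one Conv2d forward pass on a single input (bias ignored)."""
    k_h, k_w = kernel
    # Each output element needs (c_in / groups) * k_h * k_w multiply-adds.
    return (c_in // groups) * k_h * k_w * c_out * h_out * w_out


def linear_macs(in_features, out_features):
    """Multiply-adds for one Linear forward pass on a single input vector."""
    return in_features * out_features


# Hypothetical layers: a 3x3 convolution from 3 to 64 channels producing a
# 224x224 output map, and a 512 -> 1000 fully connected layer.
conv = conv2d_macs(3, 64, (3, 3), 224, 224)
fc = linear_macs(512, 1000)

print(conv, fc)  # 86704128 512000
```

Summing such per-layer counts over a network gives the kind of total a counter reports; the convolution here dominates the fully connected layer by more than two orders of magnitude.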