Coding & Development
Browsing page 124 of AI tools for Open Source & Models in Coding & Development. Sorted by confidence score — our independent quality rating.
Uni-ControlNet
Uni-ControlNet is an advanced AI tool designed to offer comprehensive control over text-to-image diffusion models. It provides an all-in-one method for controllable image synthesis, allowing users to precisely guide the generation process. The tool unifies various control aspects, simplifying the creation of specific image outputs. Based on research presented at NeurIPS 2023, Uni-ControlNet aims to enhance the flexibility and accuracy of AI-driven image generation, making it a valuable resource for researchers and developers working with diffusion models.
UCR_Time_Series_Classification_Deep_Learning_Baseline
UCR_Time_Series_Classification_Deep_Learning_Baseline is an open-source repository designed to provide a foundational deep learning model for time series classification. It specifically utilizes fully convolutional neural networks (FCNs) to establish a robust baseline for research and application. The tool is tailored for univariate time series data, making it suitable for a wide array of domains including finance, industrial applications, and healthcare, where time-dependent data analysis is crucial. It supports both representation learning and classification tasks, offering a valuable resource for data scientists and researchers looking to explore or implement deep learning solutions for time series analysis.
UER-py
UER-py (Universal Encoder Representations) is an open-source framework designed for pre-training on general-domain corpora and fine-tuning on downstream NLP tasks using PyTorch. It emphasizes model modularity, allowing users to combine various embedding, encoder, decoder, and target modules to construct custom pre-training models. The toolkit supports CPU, single GPU, and distributed training modes, making it versatile for different computational environments. UER-py also provides a comprehensive model zoo with pre-trained models of diverse properties, facilitating their direct use in various applications. It has been tested for reproducibility against original implementations of models like BERT, GPT-2, ELMo, and T5, and offers solutions for numerous NLP competitions.
voicebox
voicebox is an open-source voice synthesis studio that leverages Qwen3-TTS to provide a private and customizable environment for voice generation. This tool enables users to clone existing voices, generate new speech, and develop various voice-powered applications directly on their local machines. By running locally, voicebox ensures privacy and offers extensive customization options, making it suitable for developers and content creators who require fine-grained control over their audio output. Its open-source nature fosters community contributions and allows for continuous improvement and adaptation to specific user needs, providing a flexible solution for advanced voice synthesis tasks.
Voice-Cloning-App
Voice-Cloning-App is an open-source Python/Pytorch application designed for easily synthesizing human voices. It offers key features such as automatic dataset generation, including support for subtitles and audiobooks, and additional language support. The tool facilitates both local and remote training, with easy start/stop functionality, and supports data importing/exporting, as well as multi-GPU setups. It is built upon a reworked version of Tacotron2 and integrates other technologies like DSAlign, Silero, DeepSpeech, and hifi-gan. The application is suitable for users running Windows 10 or Ubuntu 20.04+ with at least 5GB of disk space, and optionally an NVIDIA GPU with 4GB+ memory for enhanced performance.
text2image
text2image is an open-source project that implements a model for generating images from natural language descriptions. Based on research presented at ICLR 2016, this tool iteratively draws patches on a canvas while attending to relevant words in the provided description. It offers code for training models on datasets like MNIST with captions and Microsoft COCO, allowing users to generate images from their own textual inputs. The project is written in Python and requires specific dependencies like Theano, numpy, scipy, and h5py. It's ideal for researchers and developers interested in exploring attention-based image generation.
ClipBERT
ClipBERT is an official PyTorch code implementation for an efficient framework designed for end-to-end learning across image-text and video-text tasks. Recognized with a CVPR 2021 Best Student Paper Honorable Mention, ClipBERT processes raw videos/images and text inputs to generate task predictions. It leverages 2D CNNs and transformers, incorporating a sparse sampling strategy to enable efficient multimodal learning. The framework supports end-to-end pretraining and finetuning for tasks such as image-text pretraining on COCO and VG captions, text-to-video retrieval on MSRVTT, DiDeMo, and ActivityNet Captions, video-QA on TGIF-QA and MSRVTT-QA, and image-QA on VQA 2.0. Its modular design allows for easy integration of additional image-text or video-text tasks.
synthcity
synthcity is a comprehensive open-source Python library designed for generating and evaluating synthetic tabular data. It provides a flexible, plugin-based architecture that allows for easy extension and integration of new models. The library includes a wide array of reference models, categorized by type, such as GAN-based (AdsGAN, CTGAN), VAE-based (TVAE), Normalizing Flows, Bayesian Networks, and LLM-based (GReaT) for general-purpose data. It also features specialized generators for time series (TimeGAN, FourierFlows), static survival analysis (SurvivalGAN), and even images (Image ConditionalGAN). synthcity emphasizes privacy-focused generation with models like DECAF and DP-GAN, and offers several evaluation metrics for correctness and privacy. It's ideal for researchers and developers working on data privacy, fairness, and augmentation tasks, though it requires prior imputation for missing data.
speech
Speech is an open-source Python package designed to facilitate research and development in end-to-end models for automatic speech recognition (ASR). It provides implementations of various ASR architectures, including sequence-to-sequence models with attention mechanisms, Connectionist Temporal Classification (CTC), and the RNN Sequence Transducer. Built on PyTorch, this tool allows researchers and developers to experiment with and build advanced speech-to-text systems. The software is specifically tested for Python 3.6 and does not provide backward compatibility for Python 2.7, ensuring a modern development environment. It includes examples for model configurations and datasets, making it easier to get started with training and evaluating ASR models.
Teachable Machine
Teachable Machine is a web-based tool developed by Google that simplifies the creation of machine learning models. It enables users to train a computer to recognize their own images, sounds, and poses through a guided, intuitive interface. The platform is designed for accessibility, requiring no prior machine learning expertise or coding knowledge. Users can gather examples, train their models, and then export them for integration into websites, apps, and other projects. This makes it an ideal tool for rapid prototyping and educational purposes, allowing individuals to explore the capabilities of machine learning in a straightforward manner.
vector-admin
vector-admin is an open-source, self-hostable tool suite designed for comprehensive vector database management. It offers a universal user interface to simplify interactions with various vector databases such as Pinecone, Chroma, Qdrant, and Weaviate. Users can view, update, and delete individual text chunks of embeddings, copy entire documents or namespaces without re-embedding costs, and upload new documents directly. The tool also supports migrating existing vector databases to different types or instances. While no longer actively maintained by Mintplex Labs, it remains functional for most providers and is cloud deployment ready, offering features like multi-user instance support and cost-saving measures for large documents.
Deci AI (Acquired by NVIDIA)
NVIDIA, which acquired Deci AI, is a world leader in artificial intelligence computing, inventing the GPU and driving significant advancements across numerous fields. Their platform provides a vast array of software tools and solutions, including cloud services like BioNeMo for life sciences research, DGX Cloud for AI factories, and NVIDIA APIs for deploying AI models. For creators, NVIDIA Studio offers high-performance PCs and AI-enhanced apps like Broadcast. Data centers benefit from platforms like DGX and HGX, while embedded systems leverage Jetson and DRIVE AGX for autonomous machines and vehicles. Gaming is enhanced with GeForce RTX graphics cards, DLSS, and cloud gaming via GeForce NOW. NVIDIA also provides extensive software for Agentic AI, Data Science, Robotics, and various industries, making it a comprehensive ecosystem for AI development and deployment.
LangChain
LangChain provides a comprehensive engineering platform and open-source frameworks designed for developers to build, test, and deploy reliable AI agents. The platform, LangSmith, offers robust tools for observability, allowing users to trace agent execution and understand complex interactions. It also includes evaluation capabilities to score and improve agent performance using real-world usage data and human feedback. For deployment, LangSmith supports shipping and scaling agents in production with features like memory, conversational threads, and durable checkpointing. Additionally, LangChain offers open-source frameworks like deepagents, langchain, and langgraph for building various types of agents, from quick-start prototypes to reliable production systems with low-level control.
Bricka
Bricka is an innovative AI-powered platform designed to enhance media literacy by enabling users to compare how different news outlets frame the same story. It meticulously aggregates headlines and coverage from a diverse range of sources across the political spectrum – left, center, and right-leaning – presenting them side-by-side for easy analysis. This tool is invaluable for anyone seeking a more nuanced understanding of current events, including students, educators, researchers, and general news consumers. By highlighting variations in reporting and potential biases, Bricka empowers users to develop critical thinking skills, make informed judgments, and gain a comprehensive perspective on complex issues, fostering a more informed public discourse.
1k desktop beats vendor sparse library 474× on Mistral-7B
ROLV Primitive is a groundbreaking software primitive designed to dramatically accelerate AI inference, achieving speedups of up to 106 times and reducing energy consumption by 99%. Unlike standard libraries that compute every element, ROLV identifies and processes only the mathematically non-zero portions of weight matrices at load time. This three-phase operation involves analysis, compute, and assembly, ensuring bit-identical results to full computation without any accuracy trade-off. It outperforms vendor sparse libraries like cuSPARSE by exploiting the structured sparsity of AI weight matrices, leading to better cache utilization and full tensor core throughput. ROLV is platform-agnostic, compatible with NVIDIA, AMD, Intel, ARM, Apple Silicon, Google TPU, and custom ASICs, and supports major frameworks like PyTorch, JAX, and TensorFlow.
Temporal Technologies
Temporal Technologies provides an open-source durable execution platform designed to build invincible applications that never lose state, even when underlying systems fail. It allows developers to write business logic using native SDKs in popular programming languages, eliminating the need for complex reconciliation logic. Temporal Workflows automatically capture state at every step, enabling applications to pick up exactly where they left off after any interruption. The platform supports long-running workflows, handles failure-prone logic with automatic retries, and replaces brittle state machines with a robust, fault-tolerant service. Users can host the Temporal Service themselves or utilize Temporal Cloud for a managed solution, gaining full visibility into workflow executions without sifting through logs.
Hippo AI Foundation
The Hippo AI Foundation is a non-profit organization dedicated to open-sourcing medical knowledge to advance AI-based healthcare. Its core mission is to establish AI healthcare as a common good, moving away from the current trend of privatizing medical knowledge. The foundation actively supports the development and deployment of AI technologies that benefit all, ensuring that advancements in medical AI are accessible and contribute to public welfare rather than being confined to proprietary systems. This initiative seeks to democratize access to cutting-edge healthcare solutions powered by artificial intelligence.
Lobe
Lobe offers a free, easy-to-use tool for Mac and PC that enables users to train custom machine learning models by providing examples. While the desktop application is no longer under active development, the project provides various open-source repositories to support developers. These include a Python toolset for working with Lobe models, iOS and web starter projects for integrating trained models into applications, and tools for creating image-based datasets. The project also includes a kit in partnership with Adafruit for bringing machine learning ideas to life, making it a valuable resource for developers looking to implement custom ML solutions.
Fiddler AI
Fiddler AI provides an AI Control Plane designed for enterprise agents, offering comprehensive observability, security, and governance across the entire agentic lifecycle. The platform features Agentic Observability for end-to-end visibility and control, Fiddler Trust Service for secure in-environment evaluation and guardrails, and industry-leading guardrails to protect agentic applications. It also supports AI Governance, Risk Management, and Compliance, helping enterprises mitigate bias and build responsible AI cultures. Fiddler AI integrates with major platforms like Amazon SageMaker, Google Cloud Vertex AI, NVIDIA NIM, Databricks, and Datadog, ensuring high-performing AI solutions at scale.
Picsellia
Picsellia is an end-to-end MLOps platform specifically designed for computer vision applications. It provides a comprehensive solution for managing the entire lifecycle of vision AI, from data collection and organization to model deployment and monitoring. Key features include a Datalake for centralizing visual data, advanced annotation tools with AI assistance, experiment tracking for model training, and robust deployment options. The platform supports various industries like manufacturing, agriculture, and energy, enabling teams to build and scale AI applications efficiently. Picsellia is ISO 27001 certified and offers flexible deployment options including cloud, on-premise, and hybrid configurations.
dcgan-completion.tensorflow
dcgan-completion.tensorflow is an open-source project for image completion using deep learning, built on TensorFlow. It specifically implements the techniques described in Raymond Yeh and Chen Chen et al.'s paper, "Semantic Image Inpainting with Perceptual and Contextual Losses." The tool is primarily a modification of Taehoon Kim's DCGAN-tensorflow project, sharing its MIT license. It includes a pre-trained model for faces, trained on the CelebA dataset, making it ready for immediate use in specific image completion tasks. This repository is ideal for researchers and developers interested in exploring or applying deep learning for image inpainting.
3D-convolutional-speaker-recognition
3D-convolutional-speaker-recognition is an open-source project providing a TensorFlow implementation of 3D Convolutional Neural Networks for text-independent speaker verification. The project leverages a 3D convolutional architecture to simultaneously capture speech-related and temporal information from speaker utterances, leading to more robust speaker models. It outlines a three-phase Speaker Verification Protocol (SVP) including development, enrollment, and evaluation stages. A key differentiator is its approach to direct speaker model creation, which is shown to significantly outperform traditional d-vector verification systems. The code uses MFECs (Mel-Frequency Energy Coefficients) as input features, discarding the DCT operation of MFCCs to preserve locality for convolutional operations. The implementation details for the 3D convolutional operations using TensorFlow Slim are provided, making it a valuable resource for researchers and developers in the field.
llama2-webui
llama2-webui is an open-source tool designed for running Llama 2 models locally through a Gradio web UI. It offers broad compatibility, supporting all Llama 2 models (7B, 13B, 70B, GPTQ, GGML, GGUF, CodeLlama) and various backends like transformers, bitsandbytes (8-bit inference), AutoGPTQ (4-bit inference), and llama.cpp. The tool can be deployed on Linux, Windows, and Mac, utilizing either GPU or CPU resources. Developers can also leverage `llama2-wrapper` as a local Llama 2 backend for generative agents and applications, and it provides an OpenAI-compatible API for seamless integration with existing clients and libraries. Benchmarking scripts are included to evaluate performance on different devices.
kubewall
kubewall is an open-source, single-binary Kubernetes dashboard designed for multi-cluster management with integrated AI capabilities. It offers a rich, real-time interface for managing and investigating Kubernetes clusters, providing features like live views of cluster resources, pods, and services. The AI integration leverages models such as OpenAI, Claude 4, Gemini, DeepSeek, OpenRouter, Ollama, Qwen, and LMStudio for automated troubleshooting, configuration optimization, and smart recommendations. It supports effortless installation as a lightweight binary on Mac, Windows, or Linux, with no dependencies. Users can access it securely via any browser, with options for HTTPS setup, and benefit from in-depth resource views, powerful search and filtering, and privacy by design with zero cloud dependency. It also includes port forwarding, live refresh, and aggregated pod logs for efficient debugging and monitoring.