Shypd.ai

Coding & Development

Browsing page 76 of AI tools for Open Source & Models in Coding & Development. Sorted by confidence score — our independent quality rating.

Hands-On-Machine-Learning-with-CPP

60%

Hands-On-Machine-Learning-with-CPP is the code repository accompanying a Packt publication, designed to guide users through implementing machine learning and deep learning algorithms in C++. It covers fundamental to advanced concepts with practical, easy-to-follow examples. Users will learn to preprocess diverse data types, apply key machine learning algorithms with C++ libraries, and optimize models using grid search. The repository also includes methods for anomaly detection, improving collaborative filtering, and managing model structures. It provides a C++ program for image classification with the LeNet architecture, making it suitable for data analysts, data scientists, and machine learning developers looking to implement models in production.

MOFA-Video

60%

MOFA-Video is an open-source project presented at ECCV 2024, designed for controllable image animation. It leverages generative motion field adaptions within a frozen image-to-video diffusion model to animate still images. The tool supports diverse control signals, including trajectories, keypoint sequences, and hybrid combinations, allowing for precise manipulation of motion. It features a sparse-to-dense motion generation approach and flow-based motion adaptation. MOFA-Video provides training scripts for trajectory-based and keypoint-based facial image animation, along with Gradio inference code and checkpoints for hybrid controls. This makes it a powerful resource for researchers and developers interested in advanced video generation techniques.

graphrag-local-ollama

60%

GraphRAG Local Ollama is an open-source adaptation of Microsoft's GraphRAG, designed to use local models served through Ollama for both LLM inference and embedding extraction. This removes the dependency on costly OpenAI models, offering a cost-effective option for knowledge graph implementations. It supports local language models such as Llama3, Mistral, Gemma2, and Phi3, and embedding models such as nomic-embed-text. Setup involves creating a conda environment, installing Ollama, cloning the repository, and running specific `pip install` commands. Users can then configure models and run indexing and querying operations, with options to visualize the generated graphs using tools like Gephi or a provided Python script.

ICEdit

60%

ICEdit is an open-source image editing tool that uses a single LoRA (Low-Rank Adaptation) to achieve state-of-the-art instruction-based editing. It requires only 0.5% of the training data and 1% of the parameters of prior SOTA methods while maintaining strong editing quality. A key differentiator is its ID persistence, which reportedly surpasses even models like GPT-4o. The tool needs only 4GB of VRAM to run, making it suitable for a wider range of hardware. ICEdit supports multi-turn and single-turn edits and offers several integration options, including official ComfyUI workflows and a Gradio demo. It also provides training code for users to create their own editing LoRAs.

IDM-VTON

60%

IDM-VTON is an open-source project that implements a novel approach to improving diffusion models for authentic virtual try-on in the wild. Based on research presented at ECCV 2024, this tool allows users to generate realistic virtual try-on images by integrating advanced diffusion techniques. It supports datasets like VITON-HD and DressCode, offering functionalities for both training and inference. The project provides detailed instructions for data preparation, model training, and running local Gradio demos, making it accessible for researchers and developers interested in virtual try-on technology.

Open LLM Leaderboard

60%

Open LLM Leaderboard is a platform for tracking, ranking, and evaluating open-source large language models (LLMs) and chatbots. Hosted on Hugging Face Spaces, it provides a centralized hub for comparing the performance of different models across a range of standardized tests. Users can explore benchmarks such as IFEval, BBH, MATH, GPQA, MuSR, and MMLU-Pro to see how various LLMs stack up against each other. The platform is particularly valuable for AI researchers and practitioners who need to assess model capabilities, identify top-performing models, and stay updated on the latest advancements in the open-source LLM landscape. Its core purpose is to offer transparent, data-driven evaluations.

guidellm

60%

GuideLLM is an open-source platform for evaluating and tuning Large Language Model (LLM) deployments against real-world inference needs. It simulates end-to-end interactions with OpenAI-compatible and vLLM-native servers, generating workload patterns that reflect production usage. The platform produces detailed reports to help teams understand system behavior, resource needs, and operational limits. GuideLLM supports both real and synthetic multimodal datasets, including text, image, audio, and video inputs, and offers flexible execution profiles. It provides SLO-aware benchmarking that captures complete latency and token-level statistics, including time to first token (TTFT), inter-token latency (ITL), and end-to-end latency, to support consistent performance assessment, deployment tuning, and capacity planning.
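
As a rough illustration of the latency metrics named above, the sketch below derives TTFT, mean ITL, and end-to-end latency from per-token arrival timestamps. The function name and input layout are assumptions made for this example and are not part of the tool's own API.

```python
from statistics import mean


def latency_metrics(request_start, token_times):
    """Return TTFT, mean ITL, and end-to-end latency in seconds.

    request_start: time the request was sent (hypothetical input).
    token_times: ascending arrival times of each generated token.
    """
    if not token_times:
        raise ValueError("token_times must contain at least one timestamp")
    # Time to first token: delay before the first token arrives.
    ttft = token_times[0] - request_start
    # Inter-token latency: average gap between consecutive tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0
    # End-to-end latency: request sent until the last token arrives.
    e2e = token_times[-1] - request_start
    return {"ttft": ttft, "itl": itl, "e2e": e2e}


metrics = latency_metrics(10.0, [10.5, 10.75, 11.0, 11.25])
```

A benchmark would typically aggregate these per-request values into percentiles before comparing them against an SLO.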

infiAgent

60%

infiAgent, also known as MLA (Multi-Level Agent), is an open-source agent framework for long-running, complex tasks. It is designed to avoid the tool-calling confusion and system crashes that accumulated task resources and conversation history can cause. Users can build general-purpose and semi-specialized agents by editing configuration files. Key features include support for days-long complex tasks with full recovery from interruptions, compatibility with the Agent Skills open standard for dynamic skill loading, and a flexible architecture supporting both multi-level hierarchy and flat designs. The framework uses a file-directory-based memory system for persistent memory across sessions, eliminating the need for external databases. It also offers a Docker-based Web UI for multi-user registration and account management, and supports multi-provider model configurations for fine-grained cost control.

InstaFlow

60%

InstaFlow is a one-step image generator that uses the Rectified Flow technique to achieve image quality comparable to Stable Diffusion while significantly reducing computational demands. It generates images in approximately 0.1 seconds on an A100 GPU, saving about 90% of the inference time compared to the original Stable Diffusion. InstaFlow generates high-quality images with intricate details and is compatible with pre-trained LoRAs and ControlNets. Training is purely supervised and took 199 A100 GPU days for InstaFlow-0.9B. The tool provides code, pre-trained models, and a Hugging Face demo for easy access.
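
The idea behind one-step generation with a rectified flow can be sketched as follows: a sample is produced by integrating a velocity field from noise (t = 0) to data (t = 1), and when the trajectories are straight a single Euler step already lands on the endpoint. The toy velocity field and `TARGET` below are hypothetical stand-ins for a trained network, not InstaFlow's code.

```python
def euler_sample(x0, velocity, steps=1):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for n in range(steps):
        t = n * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical data point that every straight trajectory ends at.
TARGET = [2.0, -1.0]


def toy_velocity(x, t):
    """Velocity of a straight path that reaches TARGET at t=1 (requires t < 1)."""
    return [(ti - xi) / (1.0 - t) for ti, xi in zip(TARGET, x)]


one_step = euler_sample([0.0, 0.0], toy_velocity, steps=1)
four_steps = euler_sample([0.0, 0.0], toy_velocity, steps=4)
```

Because the toy path is perfectly straight, one step and four steps reach the same point; a curved path would need more steps for the same accuracy.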

improved-diffusion

60%

Improved-diffusion is an open-source codebase developed by OpenAI for working with Improved Denoising Diffusion Probabilistic Models. This repository provides the necessary tools and scripts for researchers and developers to train and sample from these powerful generative AI models. Users can prepare their own image datasets, including options for class-conditional training by naming files with labels. The codebase supports various hyperparameters for model architecture, diffusion processes, and training flags, allowing for flexible experimentation. It also facilitates distributed training across multiple GPUs and offers different sampling strategies, including DDIM. Pre-trained model checkpoints and their corresponding hyperparameters are provided for several common tasks, such as unconditional ImageNet-64 and CIFAR-10 generation, class-conditional ImageNet-64, and LSUN bedroom models.
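
Class-conditional training by file naming can be pictured with the sketch below, which assumes the label is the part of the file name before the first underscore (for example `cat_001.png`). The helper name and sample paths are hypothetical; the repository's own data loader should be consulted for the exact convention.

```python
import os


def labels_from_filenames(paths):
    """Map each image path to an integer class index taken from its file name."""
    # Assumed convention: "<label>_<anything>.<ext>".
    names = [os.path.basename(path).split("_")[0] for path in paths]
    # Sort the distinct labels so the index assignment is deterministic.
    class_to_index = {name: i for i, name in enumerate(sorted(set(names)))}
    return [class_to_index[name] for name in names], class_to_index


labels, mapping = labels_from_filenames(
    ["data/cat_001.png", "data/dog_002.jpg", "data/cat_003.png"]
)
```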

IntroNeuralNetworks

60%

IntroNeuralNetworks is an open-source Python project designed to introduce beginners to neural networks and demonstrate their application in stock price prediction. It guides users through the entire machine learning workflow, from data acquisition and preprocessing to model training and backtesting. The project includes implementations of Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) models, explaining their relevance for time-series data like stock prices. While not intended for live trading, it serves as an educational template for understanding neural network fundamentals and can be extended for more sophisticated trading strategies. The project emphasizes the importance of data quality and provides a clear, step-by-step approach to building and evaluating predictive models.

IsaacLab

60%

Isaac Lab is a GPU-accelerated, open-source framework designed to unify and simplify robotics research workflows, including reinforcement learning, imitation learning, and motion planning. Built on NVIDIA Isaac Sim, it combines fast and accurate physics and sensor simulation, making it an ideal choice for sim-to-real transfer in robotics. The framework provides developers with essential features for accurate sensor simulation, such as RTX-based cameras, LIDAR, and contact sensors. Its GPU acceleration enables faster complex simulations and computations, crucial for iterative processes like reinforcement learning. Isaac Lab supports over 16 robot models and more than 30 ready-to-train environments, compatible with popular reinforcement learning frameworks like RSL RL, SKRL, RL Games, and Stable Baselines. It can run locally or be distributed across the cloud, offering flexibility for large-scale deployments.

ix

60%

ix is an autonomous GPT-4 agent platform for building and deploying AI-powered agents and workflows. It lets users delegate tasks to AI agents that can automate a wide variety of work, run in parallel, and communicate with each other. Key features include a no-code agent editor for creating and testing agents with a visual graph interface, a multi-agent chat interface for interacting with teams of agents, and smart input with auto-completion. The platform supports models from OpenAI, Google (PaLM), and Anthropic, as well as Llama models. Its backend is dockerized and uses a Celery message queue for horizontal scaling of agent workers, making it suitable for complex and demanding AI applications.

tf_geometric

60%

tf_geometric is a Graph Neural Network (GNN) library designed for TensorFlow 1.x and 2.x, offering an efficient and user-friendly approach to deep learning on graphs. Inspired by PyTorch Geometric, it implements GNNs using a Message Passing mechanism, which is noted for being more efficient than dense matrix-based implementations and more accessible than sparse matrix-based ones. The library provides intuitive APIs for constructing graphs, applying various GNN layers like GAT and GCN, and handling batch processing of graphs. It also includes built-in datasets such as Cora, PPI, and TU Datasets, and supports both OOP and Functional API styles for flexibility in model development. Users can install it with specific TensorFlow CPU or GPU versions.
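
The message-passing idea can be sketched in plain Python: each edge carries the source node's features to the target node, so the work scales with the number of edges rather than with the N x N size of a dense adjacency matrix. This is an illustrative sum aggregation only; the function name is hypothetical and it does not use the tf_geometric API.

```python
def message_passing_sum(features, edges):
    """One aggregation round: every node sums the features of its in-neighbours.

    features: list of per-node feature vectors (equal length).
    edges: list of (source, target) node index pairs.
    """
    dim = len(features[0])
    aggregated = [[0.0] * dim for _ in features]
    for source, target in edges:
        # Send the source node's features along the edge and accumulate.
        for k in range(dim):
            aggregated[target][k] += features[source][k]
    return aggregated


features = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
edges = [(0, 1), (2, 1), (1, 0)]
aggregated = message_passing_sum(features, edges)
```

A GNN layer such as GCN or GAT would additionally normalize or weight the messages and apply a learned transformation to the result.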

leon

60%

Leon is an open-source personal AI assistant built around tools, context, memory, and agentic execution. Designed for practicality and privacy, it can operate locally, leveraging dedicated tools instead of relying on free-form guessing to complete tasks. Leon supports both deterministic workflows and agent-style execution, allowing it to understand goals, choose how to handle them, and recover from errors. It integrates with local and remote AI providers, balancing privacy, control, and capability. The core architecture organizes capabilities into Skills, Actions, Tools, and Functions, with a compact self-model and proactive pulse system for consistency. It's ideal for users who prioritize privacy and grounded, extensible AI assistance.

LLaVA

60%

LLaVA (Large Language and Vision Assistant) is an open-source project focused on visual instruction tuning to develop large language and vision models with capabilities comparable to GPT-4. It offers improved baselines and supports community contributions, making it a robust platform for multimodal AI research and development. Recent releases include LLaVA-NeXT models with support for LLaMA-3 and Qwen-1.5, LLaVA-NeXT (Video) for zero-shot modality transfer, and LMMs-Eval for efficient evaluation of Large Multimodal Models. The project also provides LLaVA-Plus for multimodal agents and LLaVA-Interactive for human-AI multimodal interaction, including image chat, segmentation, generation, and editing. LLaVA supports LoRA finetuning for reduced GPU RAM and offers various model checkpoints through its Model Zoo.

machinelearning-samples

60%

machinelearning-samples is a GitHub repository offering a collection of samples for ML.NET, an open-source, cross-platform machine learning framework for .NET developers. The repository aims to make machine learning accessible by providing practical examples for various ML tasks, including binary classification, multi-class classification, recommendation, regression, anomaly detection, clustering, ranking, and computer vision. It features both code-focused getting-started samples and end-to-end applications, such as web and desktop apps that embed ML.NET models. It also includes samples for automating ML.NET model generation through the CLI and AutoML APIs, which reduces the manual coding needed to create high-quality models.

Long-Context

60%

Long-Context is an open-source repository from Abacus.AI designed to provide code and tooling for Large Language Model (LLM) context expansion. It offers a comprehensive suite of evaluation scripts and benchmark tasks specifically tailored to assess a model’s information retrieval capabilities within expanded contexts. The repository details various experimental results, including different positional encoding schemes like linear scaling and fine-tuning approaches, and provides instructions for reproducing and building upon these findings. It also shares weights for best-performing models, such as the scale 16 model, which is expected to perform well up to 16k context lengths. The project includes novel evaluation datasets like an extended LMSys dataset and WikiQA (Free Form QA and Altered Numeric QA) to rigorously test models across varying context lengths and answer locations, addressing potential issues like models answering from pre-trained knowledge rather than provided context.
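
Linear scaling of positions can be illustrated with the sketch below, which assumes rotary position embeddings (an assumption; the description above does not name the underlying scheme). Dividing each position by the scale factor maps a longer sequence back into the position range seen during pre-training. The function name, `dim`, and `base` values are hypothetical choices for the example.

```python
def scaled_rope_angles(position, dim, scale=1.0, base=10000.0):
    """Rotation angles for one token position under linearly scaled RoPE."""
    # Linear scaling: compress the position index by the scale factor.
    scaled_position = position / scale
    # One angle per pair of embedding dimensions, as in standard RoPE.
    return [scaled_position * base ** (-2.0 * i / dim) for i in range(dim // 2)]


# Position 16384 with scale 16 receives the same angles as position 1024 unscaled.
angles_short = scaled_rope_angles(1024, dim=8)
angles_long = scaled_rope_angles(16384, dim=8, scale=16.0)
```

The trade-off is that neighbouring tokens become closer together in position space, which is one reason such models are usually fine-tuned after scaling.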

OpenFlowKit

60%

OpenFlowKit is a free, open-source, local-first AI diagramming tool designed for engineers, architects, technical founders, and product teams. It allows users to create architecture diagrams, flowcharts, and system designs with AI assistance, offering editable exports rather than static images. The tool supports various input methods, including pasting JSON, React components, Prisma schemas, or SQL dumps, which its AI engine parses to build living canvases instantly. Key features include a cinematic export engine for presentation-ready animations, diagram-as-code capabilities, and an AI assistant for drafting and refining diagrams. OpenFlowKit emphasizes privacy with local storage and the option to bring your own API key for AI functionalities. It also offers seamless integration with Figma for editable vector exports and supports multiplayer collaboration.

magenta-js

60%

Magenta.js is a collection of TypeScript libraries designed for integrating machine learning-powered music and art generation directly into web browsers. It allows developers to leverage pre-trained Magenta models for various creative applications. The libraries are published as npm packages, making them easily accessible for web development projects. Key components include `music` for note-based models like MusicVAE and MelodyRNN, `sketch` for models such as SketchRNN, and `image` for image models like Arbitrary Style Transfer. This tool is ideal for developers and content creators looking to build interactive, AI-driven musical and artistic experiences on the web.

MiniGPT4-video

60%

MiniGPT4-video offers official code for the Goldfish model, designed for understanding arbitrarily long videos, and MiniGPT4-video itself, tailored for short video understanding. This tool advances multimodal Large Language Models (LLMs) by integrating visual and textual tokens for comprehensive video analysis. Goldfish addresses challenges in long video processing through an efficient retrieval mechanism that identifies relevant video clips, making it suitable for applications like movies or TV series. MiniGPT4-video generates detailed descriptions for video clips, facilitating the retrieval process for Goldfish. The project also introduces the TVQA-long benchmark for evaluating long video comprehension and demonstrates significant performance improvements over existing state-of-the-art methods in both long and short video understanding.

ml-cvnets

60%

ml-cvnets is a computer vision toolkit developed by Apple, designed for researchers and engineers to efficiently train a wide array of computer vision models. It supports both standard and novel architectures, mobile and non-mobile, for tasks such as object classification, object detection, and semantic segmentation, as well as foundation models like CLIP. The library is built on Python 3.10+ and PyTorch, offering features like automatic data augmentation (RangeAugment, AutoAugment, RandAugment) and enhanced distillation support. It includes a model zoo with various CNNs (MobileNet, EfficientNet, ResNet) and Transformers (Vision Transformer, MobileViT, SwinTransformer), making it a versatile platform for computer vision research and development.

mlimpl

60%

mlimpl is an open-source repository collecting implementations of commonly used machine learning algorithms. It encompasses various domains including statistical learning, deep learning, and reinforcement learning. The implementations are primarily built using popular Python libraries such as NumPy, Pandas, and PyTorch, with some TensorFlow and MATLAB examples. This resource is designed to help users deepen their understanding of machine learning models and algorithms, offering well-documented code and guidance for challenging parts. Users can also modify the code to suit their specific needs, making it a flexible tool for both learning and practical application.

node

60%

Node provides supplementary code for Neural Oblivious Decision Ensembles, a method for deep learning on tabular data. It learns deep ensembles of oblivious differentiable decision trees. While it can run on CPU, a GPU significantly reduces processing time. The implementation is noted to be memory inefficient and may require substantial GPU memory. It is compatible with popular Linux x64 distributions and macOS, with Docker recommended for other systems. Users need Python (Anaconda recommended) and specific Torch versions to run the provided notebooks, which showcase classification and regression scenarios.
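
To show what "oblivious" means, the sketch below evaluates a single hard (non-differentiable) oblivious tree: every level applies one shared feature and threshold, so the leaf index is simply the binary number formed by the comparison results. NODE itself uses differentiable splits; the names and values here are hypothetical and chosen only for illustration.

```python
def oblivious_tree_predict(x, splits, leaf_values):
    """Evaluate one oblivious decision tree on a feature vector.

    splits: list of (feature_index, threshold), one pair per depth level.
    leaf_values: list of 2 ** len(splits) leaf responses.
    """
    if len(leaf_values) != 2 ** len(splits):
        raise ValueError("leaf_values must have 2 ** depth entries")
    index = 0
    for feature, threshold in splits:
        # Each level contributes one bit to the leaf index.
        index = (index << 1) | int(x[feature] > threshold)
    return leaf_values[index]


splits = [(0, 0.5), (1, 2.0)]
leaf_values = [10.0, 20.0, 30.0, 40.0]
prediction = oblivious_tree_predict([0.9, 1.0], splits, leaf_values)
```

Because all nodes at a given depth share one split, a tree of depth d needs only d comparisons and a table lookup, which makes evaluation easy to batch.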