Shypd.ai

Coding & Development

Browsing page 121 of AI tools for Open Source & Models in Coding & Development. Sorted by confidence score — our independent quality rating.

PipeCNN

59%

PipeCNN is an OpenCL-based FPGA Accelerator specifically designed for large-scale Convolutional Neural Networks (CNNs). It leverages High Level Synthesis (HLS) tools to facilitate the design and implementation of customized circuits on FPGAs, significantly speeding up the hardware development cycle compared to traditional RTL-based methodologies. The project provides a generic, yet efficient, OpenCL-based CNN accelerator that is scalable in both performance and hardware resources, making it suitable for various FPGA platforms. PipeCNN supports both Intel OpenCL SDK and Xilinx Vitis based FPGA design flows and includes a ModelZoo with pre-quantized models for networks like VGG-16 and ResNet-50. While the performance may not match the latest state-of-the-art designs, PipeCNN serves as a complete and valuable resource for learning about Deep Learning Architecture (DLA) and experimenting with new ideas in FPGA acceleration.

dilation

59%

Dilation is an open-source project that implements dilated convolution for semantic image segmentation. It focuses on multi-scale context aggregation, a technique detailed in its ICLR 2016 conference paper. The repository includes network definitions and pre-trained models, allowing users to segment images using vanilla Caffe. For those interested in training their own models, comprehensive documentation is provided. The project also highlights that dilated convolution is implemented in other deep learning packages like Torch and Lasagne, offering flexibility for developers. It serves as a foundational resource for researchers and developers working on advanced image segmentation tasks.
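
As a rough illustration of the dilated convolution idea mentioned above, here is a minimal 1D sketch in plain Python. It is not taken from the dilation repository (which uses Caffe); the function name `dilated_conv1d` and the toy inputs are assumptions made for this example. It shows how increasing the dilation widens the span of input each output sees without adding kernel weights.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (cross-correlation form).

    Hypothetical helper for illustration only; kernel taps are spaced
    `dilation` samples apart in the input.
    """
    # Number of input samples covered by one output value.
    span = (len(kernel) - 1) * dilation + 1
    if len(signal) < span:
        return []
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]

# dilation=1 covers 3 samples per output; dilation=2 covers 5 samples.
print(dilated_conv1d(signal, kernel, dilation=1))  # [-2, -2, -2, -2, -2]
print(dilated_conv1d(signal, kernel, dilation=2))  # [-4, -4, -4]
```

The same three weights cover a five-sample window at dilation 2, which is the property that makes stacked dilated layers useful for aggregating context at multiple scales.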

sisi

59%

sisi is a free, open-source command-line interface (CLI) tool designed for semantic image search. It enables users to perform image searches locally on their machines, eliminating the need for external APIs. The tool is powered by node-mlx, a machine learning framework built for Node.js, and leverages the CLIP model to compute image embeddings. sisi supports Macs with Apple Silicon and x64/arm64 Linux, though Windows support is not yet available. It allows users to build and update image indexes for specified directories, list indexed directories, remove indexes, and search for images using natural language queries or image URLs/local files. The indexing process can be time-consuming for large collections without GPU support, but subsequent updates are faster as it only processes new or modified files.
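
The general idea behind embedding-based image search can be sketched as follows. This is not sisi's code (sisi runs on Node.js with node-mlx); the three-dimensional vectors, the file paths, and the choice of cosine similarity for ranking are assumptions made for illustration. Real CLIP embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index, query_embedding, top_k=3):
    """Return the paths of the top_k indexed images closest to the query."""
    scored = [
        (cosine_similarity(vec, query_embedding), path)
        for path, vec in index.items()
    ]
    scored.sort(reverse=True)
    return [path for _, path in scored[:top_k]]


# Hypothetical pre-computed image embeddings (toy values).
toy_index = {
    "photos/cat.jpg": [0.9, 0.1, 0.0],
    "photos/dog.jpg": [0.1, 0.9, 0.0],
    "photos/car.jpg": [0.0, 0.1, 0.9],
}
# Hypothetical embedding of a text query such as "a cat".
toy_query = [0.8, 0.2, 0.0]

print(search(toy_index, toy_query, top_k=2))
```

Because the index stores one vector per file, an incremental update only needs to embed files that are new or have changed, which matches the faster re-indexing described above.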

TokenFormer

59%

TokenFormer is the official implementation of the ICLR 2025 Spotlight paper "TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters." It introduces a fully attention-based neural network that unifies token-token and token-parameter interactions, maximizing the flexibility of neural network architectures. By tokenizing both data and model parameters, TokenFormer inherently enhances model scalability, allowing for progressively efficient scaling. The architecture is designed to be natively scalable, leveraging attention mechanisms for interactions between input tokens, and between tokens and model parameters. This approach aims to offer greater flexibility than traditional Transformers, contributing to advancements in foundation models, sparse inference (MoE), parameter-efficient tuning, device-cloud collaboration, and vision-language applications.

Transformer-TTS

59%

Transformer-TTS is a PyTorch implementation of the paper "Neural Speech Synthesis with Transformer Network," designed for efficient and high-quality speech synthesis. The model trains about 3 to 4 times faster than well-known seq2seq models such as Tacotron, while maintaining comparable synthesized speech quality. It uses a post-network based on the CBHG model from Tacotron and converts spectrograms into raw audio waves using the Griffin-Lim algorithm. The project includes detailed instructions for data preparation, training the autoregressive attention network and post-network, and generating TTS samples, making it a valuable resource for researchers and developers in speech synthesis.

vlmrun-hub

59%

vlmrun-hub is a comprehensive, open-source repository offering pre-defined Pydantic schemas specifically designed for extracting structured data from unstructured visual domains like images, videos, and documents. It is built for Vision Language Models (VLMs) and optimized for real-world use cases, simplifying the integration of visual ETL into various workflows. The hub addresses the common challenge of VLMs lacking strongly-typed, validated outputs for automation by providing schemas that ensure data conforms to expected types and structures, eliminating complex parsing and validation. Key benefits include ease of use, automatic data validation, type-safety, model-agnostic compatibility, and optimization for visual ETL across industries such as healthcare, finance, and retail.

TTSR

59%

TTSR (Texture Transformer Network for Image Super-Resolution) is an official PyTorch implementation of a CVPR 2020 paper, designed to significantly enhance image resolution. Unlike traditional single image super-resolution (SISR) methods, TTSR leverages an additional high-resolution reference image to extract and utilize texture information, leading to superior results. It introduces a novel texture transformer architecture with four closely-related modules, making it one of the first to apply transformer networks to image generation tasks. The tool also features a cross-scale feature integration module for more powerful feature representation, making it ideal for researchers and developers in computer vision working on image enhancement.

YuE

59%

YuE is a series of open-source foundation models designed for music generation, specifically for transforming lyrics into full songs (lyrics2song). It can generate complete songs, lasting several minutes, that include both a vocal track and an accompaniment track. YuE is capable of modeling diverse genres, languages (English, Mandarin Chinese, Cantonese, Japanese, Korean), and vocal techniques. It supports features like LoRA finetuning, incremental song generation, music continuation, and dual-track in-context learning (ICL), where a reference song's style can be adopted. The model is licensed under Apache 2.0, encouraging artists to use and monetize generated outputs with attribution.

alluxio

59%

Alluxio Open Source is a Distributed Caching Platform designed for large-scale data, specifically for analytics workloads. It acts as a data orchestration layer, allowing computation applications to connect to various storage systems through a common interface. Originating from UC Berkeley's AMPLab, Alluxio accelerates structured data analytics and is widely adopted with engines like Presto, Spark, and Trino. While the open-source edition is suitable for testing and small-scale production, the Enterprise Edition offers a decentralized metadata service for AI/ML workloads, supporting billions of files and providing FUSE-based POSIX integration for frameworks like PyTorch and TensorFlow.

chat-with-mlx

59%

chat-with-mlx provides an all-in-one chat playground for Large Language Models (LLMs) specifically designed for Apple Silicon Macs, utilizing the MLX Framework. It prioritizes privacy by allowing users to chat with their favorite models and data securely on their local device. The tool offers easy integration with HuggingFace and MLX Compatible Open-Source Models, including popular options like Llama-3, Phi-3, Yi, Qwen, Mistral, Codestral, Mixtral, and StableLM. Installation is straightforward via pip or Conda, making it accessible for developers and enthusiasts. It features a unified memory model and dynamic graph construction, characteristic of the MLX framework, ensuring efficient performance without data transfers between CPU and GPU.

csghub-server

59%

csghub-server is the open-source backend server for CSGHub, a platform designed for managing large model assets. It facilitates the management of models, datasets, and other LLM assets through a robust REST API. Key features include the creation and management of users and organizations, automatic tagging of models and datasets, and comprehensive search functionalities. Users can also preview dataset files online, download individual files including LFS files, and track activity data like downloads and likes. The server supports extensible and customizable architectures, allowing integration with various Git servers and flexible configuration of LFS storage systems. It also enables on-demand content moderation and has a roadmap for supporting more Git servers, Git LFS, dataset online viewers, and model/dataset auto-tagging.

DataDesigner

59%

DataDesigner is an open-source library developed by NVIDIA NeMo for generating high-quality synthetic datasets. It allows users to create diverse data from scratch or by leveraging existing seed datasets, going beyond simple LLM prompting. The tool provides a flexible framework for building production-grade synthetic data, enabling control over relationships between fields with dependency-aware generation. It includes built-in Python, SQL, and custom local/remote validators for quality assurance, and can score outputs using LLM-as-a-judge. DataDesigner also offers a preview mode for quick iteration before full-scale generation and supports agent-assisted development, particularly with Claude Code, for schema design and generation.

deep-learning-frameworks

59%

deep-learning-frameworks offers installation support for a broad collection of deep learning and machine learning components, such as PyTorch, transformers, Fast.ai, and scikit-learn, specifically tailored for the ArcGIS System. This includes ArcGIS Pro, Server, and the ArcGIS API for Python. The tool facilitates AI and deep learning applications for geospatial problems like feature extraction, pixel classification, and feature categorization. It simplifies the setup process by installing 254 packages into the default arcgispro-py3 Python environment. While most tools work on any machine, common deep learning workflows benefit significantly from an NVIDIA GPU with CUDA Compute Capability 5.0+ and 8GB+ dedicated graphics memory. The project provides installers for various ArcGIS versions and detailed instructions for both Windows and Linux environments, including manual installation options and support for disconnected environments.

Difix3D

59%

Difix3D is an open-source project designed to enhance 3D reconstructions by leveraging single-step diffusion models. It offers a comprehensive framework for improving the quality of 3D data, specifically targeting artifact removal and the refinement of novel views. The tool provides both Difix for single-step diffusion artifact removal and Difix3D for progressive 3D updates, including integration with popular 3D reconstruction frameworks like Nerfstudio and gsplat. Additionally, Difix3D+ introduces real-time post-rendering capabilities to further sharpen details and improve visual fidelity. This makes it a valuable resource for researchers and developers working on advanced 3D computer vision tasks, offering practical implementations and models for immediate use.

GenerativeImage2Text

59%

GenerativeImage2Text (GIT) is a repository from Microsoft that provides code examples and pre-trained models for generating text from images. It leverages a Generative Image-to-text Transformer for various vision and language tasks. Users can perform image captioning, where the model describes the content of an image, or visual question answering, where the model answers questions about an image. The tool supports inference on single images, multiple frames (for video analysis), and TSV files containing collections of images. It offers different model sizes (base and large) and fine-tuned versions for specific datasets like COCO, VQAv2, and TextCaps, allowing for tailored performance across diverse applications.

GraphWaveletNeuralNetwork

59%

GraphWaveletNeuralNetwork is an open-source PyTorch implementation of the "Graph Wavelet Neural Network" (GWNN) as presented at ICLR 2019. This novel graph convolutional neural network addresses limitations of previous spectral graph CNN methods by utilizing graph wavelet transform, which avoids computationally expensive matrix eigendecomposition. The graph wavelets are sparse and localized, enhancing efficiency and interpretability for graph convolution tasks. The tool is designed for researchers and machine learning engineers working with graph-based semi-supervised classification, demonstrating superior performance on benchmark datasets like Cora, Citeseer, and Pubmed. It includes command-line arguments for easy configuration of training parameters and model options.

guess

59%

Guess.js is an open-source collection of libraries and tools for enabling data-driven user experiences on the web, primarily focusing on predictive prefetching and bundling. It uses data from sources like Google Analytics to predict user navigation patterns, allowing for prefetching of likely next pages or associated bundles. This approach aims to improve perceived page load performance and user satisfaction. The project offers a Webpack plugin for automated setup, and provides modules for fetching Google Analytics data, JavaScript framework parsing, and configuring predictive fetching. For non-Webpack users, it outlines a workflow for integrating predictive fetching using the Google Analytics API and a client-side script.
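
To make the idea of predicting the next page from navigation data concrete, here is a minimal sketch in Python. It is not Guess.js code (Guess.js is a JavaScript project), and the first-order transition model, the function names, and the session data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_transition_probabilities(sessions):
    """Estimate P(next page | current page) from recorded page sequences."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Count each consecutive (current, next) pair in the session.
        for current, nxt in zip(pages, pages[1:]):
            counts[current][nxt] += 1
    return {
        page: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for page, c in counts.items()
    }


def pages_to_prefetch(probabilities, current_page, threshold=0.5):
    """Pages whose estimated probability of being visited next meets the threshold."""
    candidates = probabilities.get(current_page, {})
    return sorted(p for p, prob in candidates.items() if prob >= threshold)


# Hypothetical navigation sessions, e.g. exported from an analytics tool.
sessions = [
    ["/", "/pricing", "/signup"],
    ["/", "/pricing"],
    ["/", "/blog"],
    ["/pricing", "/signup"],
]

probs = build_transition_probabilities(sessions)
print(pages_to_prefetch(probs, "/", threshold=0.5))         # ['/pricing']
print(pages_to_prefetch(probs, "/pricing", threshold=0.5))  # ['/signup']
```

A threshold keeps prefetching limited to likely destinations, which matters on slow or metered connections where speculative downloads have a cost.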

ImageCaptioning.pytorch

59%

ImageCaptioning.pytorch is a comprehensive open-source codebase designed for advanced image captioning research. It offers robust support for self-critical training, a technique crucial for optimizing caption generation. Researchers can leverage bottom-up features for more detailed image understanding and utilize multi-GPU training for efficient model development, including DistributedDataParallel with pytorch-lightning. The codebase also supports Transformer captioning models, providing a flexible framework for experimenting with state-of-the-art architectures. It includes functionalities for evaluating models on various datasets like COCO and Flickr30k, generating captions for raw images, and performing beam search for improved decoding. With detailed instructions for installation, data preparation, and training, it serves as a valuable resource for academics and developers in the field of computer vision and natural language processing.

kornia

59%

Kornia is a differentiable computer vision library built on PyTorch, designed for spatial AI applications. It offers a comprehensive suite of differentiable image processing and geometric vision algorithms, allowing users to leverage powerful batch transformations, auto-differentiation, and GPU acceleration. Key features include a wide range of image processing operators like filters, transformations, and enhancements, as well as advanced augmentation pipelines for training AI models. Kornia also provides access to pre-trained AI models for tasks such as face detection, feature matching, segmentation, and classification. The library is expanding its focus towards end-to-end vision models, with a particular emphasis on integrating state-of-the-art Vision Language Models (VLM) and Vision Language Agents (VLA). It supports multi-framework usage, including TensorFlow, JAX, and NumPy, making it a versatile tool for developers and researchers in the AI and computer vision fields.

image_captioning

59%

image_captioning is an open-source TensorFlow implementation of a neural image caption generation system, based on the "Show, Attend and Tell" paper. This tool takes an image as input and outputs a descriptive sentence. It leverages a convolutional neural network (CNN) to extract visual features from the image, which are then decoded into a sentence by an LSTM recurrent neural network (RNN). A soft attention mechanism is integrated to enhance the quality and relevance of the generated captions. The project supports end-to-end training of both CNN and RNN components, allowing for fine-tuning with datasets like COCO train2014. Users can evaluate models, generate captions for new images, and monitor training progress with TensorBoard.
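
The soft attention step described above can be sketched in a few lines of plain Python. This is an illustrative assumption, not code from the image_captioning repository: in the actual model the relevance scores would come from a learned network over the CNN features and the LSTM state, whereas here they are passed in directly, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(region_features, scores):
    """Weight each image-region feature by its softmax score and sum them.

    Returns the attention weights and the resulting context vector.
    """
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Two hypothetical region features with equal relevance scores.
weights, context = soft_attention([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(weights)  # [0.5, 0.5]
print(context)  # [0.5, 0.5]
```

Because every region contributes with a smooth weight rather than a hard selection, the whole step is differentiable and can be trained end to end with the rest of the captioning model.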

LLMRec

59%

LLMRec is a novel framework implemented in PyTorch, designed to significantly improve recommendation systems through the application of three distinct LLM-based graph augmentation strategies. These strategies include reinforcing user-item interactive edges, enhancing item node attributes, and conducting user node profiling, all from a natural language perspective. The tool leverages content within online platforms like Netflix and MovieLens to augment interaction graphs. It provides code, original data, and augmented data, making it a valuable resource for researchers and data scientists working on recommendation systems. LLMRec also offers multi-modal datasets, including textual and visual data, and supports LLM-augmented textual data and embeddings for comprehensive research.

lmnr

59%

Laminar is an open-source observability platform specifically designed for AI agents, offering comprehensive tools for tracing, evaluations, and AI monitoring. It features an OpenTelemetry-native tracing SDK that requires only a single line of code to automatically trace popular AI frameworks like Vercel AI SDK, LangChain, OpenAI, Anthropic, and Gemini. The platform also includes an unopinionated, extensible SDK and CLI for running evaluations locally or in CI/CD pipelines, with a UI for visualizing and comparing results. Users can define events with natural language descriptions for AI monitoring, track issues, logical errors, and custom agent behavior. All data is accessible via SQL, allowing for querying traces, metrics, and events, bulk dataset creation, and custom dashboards. Laminar boasts extremely high performance, built with Rust, featuring a custom real-time engine for trace viewing and ultra-fast full-text search over span data.

NeuralPDE.jl

59%

NeuralPDE.jl is an open-source solver package for Scientific Machine Learning (SciML) that uses Physics-Informed Neural Networks (PINNs) to solve various types of differential equations, including Ordinary, Stochastic, and Partial Differential Equations (ODE, SDE, PDE). It offers greatly increased generality compared to classical methods by leveraging neural stochastic differential equations. Key features include automated construction of physics-informed loss functions from a high-level symbolic interface, compatibility with machine learning libraries like Flux.jl and Lux.jl for GPU-powered layers, and integration with NeuralOperators.jl for mixing deep neural operators with physics-informed loss functions. The tool also supports advanced techniques such as quadrature training strategies, adaptive loss functions, and neural adapters to accelerate training, making it suitable for complex scientific simulations and data fitting.

natasha

59%

Natasha is an open-source Python library designed to solve basic NLP tasks for the Russian language. It offers a comprehensive suite of functionalities including tokenization, sentence segmentation, word embedding, morphology tagging, lemmatization, phrase normalization, syntax parsing, NER tagging, and fact extraction. The library emphasizes production readiness, focusing on optimized model size, RAM usage, and performance, with models running efficiently on CPU using NumPy for inference. Natasha integrates several specialized libraries: Razdel for segmentation, Navec for compact Russian embeddings, Slovnet for deep-learning morphology, syntax, and NER, and Yargy for rule-based fact extraction. While its API may evolve, it provides a convenient unified interface for various Russian NLP tasks, with models primarily optimized for news articles.