Coding & Development
Number Recognizer
Number Recognizer is an AI tool hosted on Hugging Face that specializes in recognizing digits from images of house or door plates. Users can easily upload a picture containing a house or door number, select a preferred model checkpoint, and the application will quickly process the image to read the displayed digits. The tool then returns the recognized number as plain text, along with a status indicating the recognition outcome. This application is useful for tasks requiring automated number extraction from real-world images, offering a straightforward solution for digit recognition.
Skywork-R1V
Skywork-R1V is a multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning. The series includes open-source versions with model weights and inference code, as well as closed-source offerings such as Skywork-R1V4-Lite. The models target vision understanding, code execution, and deep research tasks, and they support agentic use. Key features include code execution for complex tasks, deep research integration with web search, multi-turn reasoning with tool usage, and streaming support for real-time responses. The models are reported to reach state-of-the-art performance on various multimodal benchmarks, particularly in perception and deep research.
moondream
Moondream is an efficient, open-source vision language model developed by m87-labs. It performs complex image understanding tasks with a small footprint, which makes it suitable for deployment across a range of devices and platforms. The project offers two main variants: Moondream 2B, a 2-billion-parameter model for general-purpose tasks such as captioning, visual question answering, and object detection, and Moondream 0.5B, a compact 500-million-parameter model optimized for edge devices and other resource-constrained hardware. Moondream can be run locally or in the cloud, with detailed instructions on its Getting Started page and an example for running on Modal.
New-View-Synthesis
New-View-Synthesis is a comprehensive GitHub repository dedicated to collecting and organizing research papers focused on new view synthesis techniques. The repository serves as a valuable resource for researchers and academics, offering direct links to published papers (often via arXiv or PDF) and their corresponding code implementations. It is actively maintained, with daily updates to include the latest advancements and provide more detailed information about each paper. This makes it an essential tool for staying current with the rapidly evolving field of neural radiance fields and other view synthesis methodologies, facilitating research, development, and understanding of these complex topics.
Online-3D-BPP-DRL
Online-3D-BPP-DRL is an open-source project that provides the implementation of the paper "Online 3D Bin Packing with Constrained Deep Reinforcement Learning." This tool is designed for researchers and developers interested in optimizing 3D bin packing problems using AI. It allows users to train new models on randomly generated sequences or test existing models with various data sets. The repository includes code for user-study applications, multi-bin algorithms, and MCTS for comparison, offering a comprehensive environment for experimentation and development in this domain. Users can adjust network architectures and parameters to suit their specific needs, making it a flexible platform for advanced AI research in logistics and optimization.
Online-3D-BPP-PCT
Online-3D-BPP-PCT is an open-source tool that implements a method for efficient online 3D bin packing. It applies deep reinforcement learning (DRL) to a hierarchical packing configuration tree to enhance the practical applicability of the online 3D Bin Packing Problem (BPP). This representation helps the DRL model handle practical constraints and perform well even in continuous solution spaces. Key features include arbitrary container and item sizes, support for continuous online 3D-BPP, algorithms for approximating stability, and improved performance under complex constraints. It also provides more adequate heuristic baselines to support further work in the domain, as well as more stable training.
pytorch-pose
pytorch-pose is an open-source PyTorch toolkit designed for 2D single human pose estimation. It offers a comprehensive pipeline for training, inference, and evaluation, making it a valuable resource for researchers and developers in computer vision. The toolkit includes a robust dataloader with various data augmentation options, compatible with popular human pose databases such as MPII, LSP, and FLIC. Key features include multi-thread data loading, multi-GPU training support, a logger for tracking progress, and visualization of training and testing results. It is compatible with PyTorch 0.4.1/1.0 and provides detailed instructions for installation, data preparation, and usage, including testing with pre-trained models and evaluating PCKh@0.5 scores.
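The PCKh@0.5 metric mentioned above counts a predicted keypoint as correct when it lies within 0.5 times the head size of the ground-truth keypoint. The following is a minimal sketch of that idea in plain Python, not the toolkit's own evaluation code; the function name `pckh` and the single-person, all-joints-visible setup are assumptions made for illustration.

```python
import math

def pckh(pred, gt, head_size, threshold=0.5):
    """Fraction of keypoints within threshold * head_size of the ground truth.

    pred and gt are equal-length lists of (x, y) tuples for one person.
    head_size is the reference head length in pixels (hypothetical input;
    real benchmarks derive it from dataset annotations).
    """
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and the same length")
    limit = threshold * head_size
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= limit
    )
    return correct / len(gt)

# Three joints, head size of 8 px, so the tolerance is 4 px.
predicted = [(10, 10), (20, 20), (35, 30)]
ground_truth = [(10, 12), (20, 20), (30, 30)]
score = pckh(predicted, ground_truth, head_size=8)
```

A real evaluation would also skip joints that are not annotated as visible and average over every person in the test set.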
PyGCL
PyGCL is a PyTorch-based open-source library specifically designed for Graph Contrastive Learning (GCL). It provides a comprehensive framework for researchers and developers to implement and experiment with various GCL algorithms. The library features modularized GCL components, including graph augmentation techniques like Edge Adding, Feature Masking, and Node Dropping, as well as different contrasting architectures and modes (single-branch, dual-branch, bootstrapped, within-embedding). PyGCL also implements a variety of contrastive objectives such as InfoNCE, JSD, and Barlow Twins, alongside negative sampling strategies. It supports standardized evaluation with evaluators like Logistic Regression and SVM, and offers utilities for managing experiments, making it a valuable tool for advancing graph representation learning.
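To make the InfoNCE objective named above more concrete, here is a minimal sketch in plain Python. It does not use PyGCL's API: the function names, the cosine similarity choice, the temperature value, and the use of only cross-view negatives are assumptions for illustration. Each node embedding in one view is pulled toward its counterpart in the other view and pushed away from every other node in that view.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss where row i of view_a is positive with row i of view_b."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        # -log(exp(positive) / sum(exp(all))) == log_denominator - positive
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Two nodes, two augmented views. Matching rows give a lower loss.
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
loss_good = info_nce(aligned, aligned)
loss_bad = info_nce(aligned, swapped)
```

In a library setting the same computation would run on batched tensors so that gradients flow back into the encoder.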
PMRF
PMRF (Posterior-Mean Rectified Flow) is an open-source implementation of a novel photo-realistic image restoration algorithm, presented at ICLR 2025. It provably approximates the optimal estimator that minimizes the Mean Squared Error (MSE) while maintaining a perfect perceptual quality constraint. The tool provides capabilities for blind face image restoration and controlled experiments, offering model checkpoints and test datasets for evaluation. It supports various architectures, including HDiT and UNet, and includes installation instructions for setting up a conda environment. PMRF is ideal for researchers and developers focused on advancing image restoration techniques.
SugarDB
SugarDB is a highly configurable, distributed, in-memory data store and cache implemented in Go. It serves as an embeddable library or an independent service, providing a rich set of data structures like Lists, Sets, Sorted Sets, and Hashes. Key features include TLS/mTLS support, replication using the RAFT algorithm for fault tolerance, and an ACL layer for authentication and authorization. SugarDB also offers a persistence layer with Append-Only files and snapshots for data recovery, along with key eviction policies and multi-database support. Its compatibility with existing Redis clients via RESP makes it a versatile solution for developers seeking a robust, in-memory data management system.
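RESP compatibility means a client sends each command as an array of bulk strings, framed with lengths and CRLF terminators. The sketch below encodes a command that way using only the Python standard library. It illustrates the wire format rather than any SugarDB-specific API, and the helper name `encode_resp_command` is hypothetical.

```python
def encode_resp_command(*parts: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    chunks = [f"*{len(parts)}\r\n".encode("ascii")]
    for part in parts:
        data = part.encode("utf-8")
        # Bulk string: "$<byte length>\r\n<bytes>\r\n"
        chunks.append(f"${len(data)}\r\n".encode("ascii") + data + b"\r\n")
    return b"".join(chunks)

frame = encode_resp_command("SET", "key", "value")
# frame == b"*3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n"
```

Because the framing is the same one Redis clients already produce, existing client libraries can talk to a RESP-speaking server without modification.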
stream-lua-nginx-module
stream-lua-nginx-module is an open-source tool that integrates the Lua programming language directly into NGINX TCP/UDP servers. This module, a fundamental part of the OpenResty project, empowers developers to significantly extend and customize NGINX's capabilities using Lua scripts. It supports various NGINX stream phases, including preread, content, and log, allowing for dynamic request processing, custom load balancing, and advanced logging. Key features include the ability to define TCP servers, handle SSL/TLS connections, and interact with sockets directly within Lua code. It also ports many directives and API functions from ngx_http_lua, providing a familiar environment for those accustomed to NGINX's HTTP module.
street_gaussians
Street Gaussians is an open-source project presented at ECCV 2024, focusing on modeling dynamic urban scenes using Gaussian Splatting. This tool provides a framework for researchers and developers to reconstruct complex, moving urban environments from video data. It includes functionalities for data preparation, such as converting Waymo Open Dataset, generating LiDAR depth, and creating sky masks. Users can configure parameters based on 3D Gaussian Splatting, train models, render scenes, and visualize results. The project offers scripts for training and rendering on example and experimental Waymo scenes, making it a valuable resource for advancing research in dynamic 3D scene reconstruction.
TryDevUtils
TryDevUtils is a free, open-source collection of developer utilities designed to streamline common development tasks. It provides tools for JWT decoding and encoding, JSON formatting and validation, UUID generation, Base64 conversion, timestamp/date conversion, text diffing, and cron parsing. It also includes a color converter, hash generator, and YAML validator. A key differentiator is its commitment to privacy: all processing occurs locally on the user's device, so no data leaves the user's machine. TryDevUtils is available as a web application, a desktop application for macOS, Windows, and Linux, and a Chrome extension, covering a wide range of developer workflows.
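JWT decoding of the kind listed above can be done entirely on the local machine, because the header and payload are just Base64URL-encoded JSON. The following standard-library sketch shows the idea. It is not TryDevUtils code, the function names are hypothetical, and it deliberately does not verify the signature.

```python
import base64
import json

def b64url_encode(raw: bytes) -> str:
    """Base64URL-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

def b64url_decode(segment: str) -> bytes:
    """Base64URL-decode a JWT segment, restoring the stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)

def decode_jwt_unverified(token: str):
    """Return (header, payload) as dicts. The signature is NOT checked."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated segments")
    header = json.loads(b64url_decode(parts[0]))
    payload = json.loads(b64url_decode(parts[1]))
    return header, payload

# Build a sample token locally (placeholder signature) and decode it again.
sample = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode()),
    b64url_encode(json.dumps({"sub": "123"}).encode()),
    "signature-placeholder",
])
header, payload = decode_jwt_unverified(sample)
```

Decoding only reveals the claims; anything that trusts them still needs to verify the signature with the appropriate key.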
supersplat
SuperSplat is a free and open-source 3D Gaussian Splat Editor built on web technologies, so it runs directly in the browser without any downloads or installations. Users can inspect, edit, optimize, and publish 3D Gaussian Splats. The project supports local development with Node.js 18+ and can be localized into multiple languages. It is actively maintained by an open-source community, and a live version of the editor is available online for immediate access to its features.
SEAM
SEAM (Self-supervised Equivariant Attention Mechanism) is an open-source implementation designed for weakly supervised semantic segmentation. It addresses the challenge of generating accurate object masks from image-level supervision alone, a common limitation of class activation map (CAM) approaches. SEAM introduces a self-supervised approach that enforces consistency regularization on the CAMs predicted for differently transformed versions of an image, narrowing the gap between full and weak supervision. It also incorporates a pixel correlation module (PCM) that refines predictions using context appearance information and similar neighboring pixels. Experiments on the PASCAL VOC 2012 dataset show SEAM outperforming state-of-the-art methods that use the same level of supervision, making it a useful resource for AI researchers and computer vision engineers.
Shadowrocket-First
Shadowrocket-First is an open-source GitHub repository offering a comprehensive collection of configuration files, modules, rule sets, and custom themes for the Shadowrocket application. It enables users to fine-tune their network settings, manage traffic, and enhance the functionality of various applications through specialized modules. The repository includes modules for popular services like Talkatone, Emby, DeepSeek, Wi-Fi Calling, Spotify, YouTube, Bilibili, and more, often with editable parameters for personalized adjustments. It also provides community-sourced configurations, usage manuals, and tools for ad-blocking and traffic management, catering to both basic and advanced users looking to optimize their Shadowrocket experience.
Trading-Gym
Trading-Gym is an open-source project for developing and testing reinforcement learning algorithms in the context of financial trading. It currently features a SpreadTrading environment, which lets users trade spreads based on bid and ask price time series for multiple products. A key feature is its generic data feeding mechanism, which lets users create custom DataGenerators to supply their own price data. The environment's state includes prices, entry price, and position (long, short, or flat). Trading-Gym's API is inspired by OpenAI Gym and aims for full compatibility, so that it can be integrated as an additional OpenAI Gym environment and is accessible to researchers and developers already familiar with that framework.
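The data feeding idea can be pictured as a plain Python generator that yields bid and ask quotes, paired with a small function that assembles the state described above. This is a hypothetical sketch, not Trading-Gym's actual DataGenerator interface: the names `random_walk_quotes` and `build_state`, the random-walk price model, and the one-hot position encoding are all assumptions.

```python
import itertools
import random

def random_walk_quotes(start=100.0, spread=0.5, step=1.0, seed=0):
    """Yield (bid, ask) tuples for one product along a random walk."""
    rng = random.Random(seed)
    mid = start
    while True:
        yield (mid - spread / 2, mid + spread / 2)
        mid += rng.uniform(-step, step)

def build_state(quote, entry_price, position):
    """Combine prices, entry price, and a one-hot position into one list."""
    bid, ask = quote
    one_hot = {"flat": [1, 0, 0], "long": [0, 1, 0], "short": [0, 0, 1]}[position]
    return [bid, ask, entry_price] + one_hot

# Take the first five quotes and build the state for a flat position.
quotes = list(itertools.islice(random_walk_quotes(), 5))
state = build_state(quotes[0], entry_price=0.0, position="flat")
```

Swapping in a generator that reads recorded prices from a file would leave the rest of such a pipeline unchanged, which is the point of keeping the data source generic.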
tinyos-main
tinyos-main is the main development repository for TinyOS, an open-source, BSD-licensed operating system designed for low-power wireless devices. These devices are commonly used in sensor networks, ubiquitous computing, personal area networks, smart buildings, and smart meters. Recent activity in the tinyos-main tree has slowed, and active development has shifted to the tinyprod repository. The project is transitioning to a new repository structure, with tinyos-main eventually becoming an archive. It uses a distributed version control system (Git) to encourage community participation and has upgraded to a Version 3 make build system. Documentation covers getting started with Git and setting up development environments on Debian/Ubuntu and Mac OS X, and the TinyOS Wiki provides further information.
TypeGPU
TypeGPU is a modular and open-ended toolkit designed to simplify WebGPU development by allowing developers to write shaders directly in TypeScript. It offers advanced type inference, ensuring type safety throughout the development process. The toolkit provides a robust abstraction layer that addresses common WebGPU challenges while maintaining flexibility, allowing users to granularly eject into vanilla WebGPU when needed. This approach prevents vendor lock-in and makes TypeGPU an excellent foundation for building WebGPU applications or integrating into existing projects. It also serves as an interoperability layer for various type-safe WebGPU libraries, facilitating seamless data flow without copying back to CPU-accessible memory.
YOLOv11-RGBT
YOLOv11-RGBT is a single-stage multispectral object detection framework that extends YOLO models (from YOLOv3 to YOLOv13) and RTDETR to handle RGBT (Red, Green, Blue, Thermal) data. The project simplifies the configuration of visible and infrared datasets for multimodal object detection tasks, providing three distinct configuration methods. It supports multispectral object detection, keypoint detection, and instance segmentation. Beyond multispectral data, the framework also works with other pixel-aligned images, including depth maps and SAR images. Key features include support for TIFF images, 16-bit multispectral datasets with arbitrary channels, and image formats such as Gray, BGR, RGBT, and Multispectral with flexible channel configurations.
BobTheSmuggler
"Bob the Smuggler" is an open-source tool designed to perform HTML Smuggling Attacks, enabling users to embed 7z/zip archives within HTML files. It compresses binaries (EXE/DLL) into password-protected 7z/zip formats, XOR encrypts the archive, and then conceals it inside PNG/GIF image files, creating image polyglots. The tool supports various payload delivery chains, including embedding directly into HTML, SVG, or through image files. Key features include stealthy file concealment, versatile embedding options, advanced obfuscation, and custom template support. It also offers a command-line interface and visual validation for PNG files. Prerequisites include Python libraries such as `python-magic`, `py7zr`, and `pyminizip`.
ArduinoJson
ArduinoJson is a highly efficient and simple C++ JSON library specifically designed for Arduino and other embedded systems. It offers robust JSON deserialization and serialization capabilities, including support for UTF-16 escape sequences, comments, and input filtering. Beyond JSON, it also handles MessagePack serialization and deserialization. The library is optimized for embedded environments, consuming less RAM and performing faster than alternative solutions. It is highly versatile, supporting custom allocators, various string types (String, std::string, std::string_view), and custom readers/writers. ArduinoJson is portable, compatible with C++11, C++14, and C++17, and works across a wide range of boards and development environments, making it a reliable choice for IoT and embedded C++ projects.
Awesome-Vision-Mamba-Models
Awesome-Vision-Mamba-Models is an open-source GitHub repository dedicated to the rapidly evolving field of visual Mamba models. It functions as a comprehensive resource, offering a survey of existing models and exploring new outlooks and advancements in the domain. The repository is actively maintained and updated with the latest research papers and developments, making it an invaluable hub for researchers, academics, and practitioners working with or interested in visual Mamba. Its structure allows for easy navigation through various models and related information, fostering knowledge sharing and collaboration within the AI community.
Awesome-VLA4AD
Awesome-VLA4AD is a comprehensive and continuously updated repository dedicated to Vision–Language–Action models for Autonomous Driving (VLA4AD). It serves as the companion resource to a survey paper, offering a curated collection of research papers, datasets, and tools in the field. The repository categorizes VLA4AD advancements into stages, from explanatory perception modules to end-to-end reasoning and control architectures. It details various models, their key features, and links to their respective papers and codebases. Additionally, it lists relevant datasets and benchmarks, making it an invaluable resource for researchers, academics, and engineers working on autonomous driving systems.