Coding & Development
Awesome-DLMs
Awesome-DLMs is the official GitHub repository for the survey paper "A Survey on Diffusion Language Models." It serves as a highly starred, comprehensive, and up-to-date collection of research papers, code, and resources related to Diffusion Language Models (DLMs). The repository categorizes DLMs into continuous, discrete, and multimodal types, highlighting key milestones in their development. It includes sections for must-read papers, surveys, foundational concepts, training strategies, inference optimization, training frameworks, benchmarks, and applications. It is aimed at researchers, students, and practitioners who want to follow both the latest advancements and the foundational work in the field.
awesome-contrastive-self-supervised-learning
awesome-contrastive-self-supervised-learning is an open-source GitHub repository offering a comprehensive and curated list of research papers focused on contrastive self-supervised learning. This resource is invaluable for academics, researchers, and students looking to stay updated with the latest advancements and foundational works in this rapidly evolving AI domain. The repository categorizes papers by year, ranging from 2010 to 2024, and includes surveys, reviews, and specific research contributions, often with links to associated code. It covers diverse applications such as medical image analysis, vision-language representation, graph representations, and natural language understanding, making it a central hub for exploring the theoretical and practical aspects of contrastive learning.
Awesome-Deblurring
Awesome-Deblurring is a comprehensive, curated list of resources dedicated to image and video deblurring. Hosted on GitHub, this open-source repository serves as a central hub for researchers and developers seeking to explore or implement deblurring techniques. It meticulously categorizes resources into various sections, including single-image blind motion deblurring (both non-DL and DL approaches), non-blind deblurring, depth-aware motion deblurring, defocus deblurring, and benchmark datasets. Each entry typically includes the publication year, paper title, and links to associated code or project pages, making it an invaluable tool for navigating the vast landscape of deblurring research and practical applications.
awesome-deep-rl
awesome-deep-rl is a comprehensive, curated list of resources for Deep Reinforcement Learning. This open-source repository serves as a central hub for researchers and practitioners to discover libraries, benchmark results, environments, competitions, and educational materials like books and tutorials. It covers a wide array of topics, from foundational algorithms and historical timelines to advanced frameworks and simulation platforms, making it an invaluable reference for anyone involved in the field of Deep Reinforcement Learning. The resource is continuously updated, reflecting the dynamic nature of AI research.
Awesome-BEV-Perception-Multi-Cameras
Awesome-BEV-Perception-Multi-Cameras is a valuable resource for researchers and engineers focused on multi-camera 3D object detection and segmentation within the Bird's-Eye-View (BEV) paradigm. This curated list compiles significant academic papers, including influential works like DETR3D, BEVDet, BEVFormer, BEVDepth, and UniAD. It categorizes papers by key themes such as Longterm BEV, BEV + Stereo, End to End BEV Perception, BEV + Distillation, Robust BEV, Fast BEV, HD Map Construction, Multi-sensor fusion, Survey, Occupancy Network, and Pre-training. Each entry typically includes a link to the paper and its corresponding GitHub repository, making it easy for users to access the research and associated codebases. This tool is essential for staying updated with the latest advancements in vision-centric autonomous driving perception.
canvas-editor
canvas-editor is an open-source rich text editor designed for web applications, leveraging canvas and SVG for rendering. It offers a comprehensive suite of rich text operations, including undo/redo, font styling, alignment, and list management. Developers can easily insert various elements such as tables, images, links, code blocks, page breaks, and mathematical formulas. The editor also supports printing to picture and PDF, controls like select, text, date, radio, and checkbox, and features like context menus, shortcut keys, drag and drop functionality, headers, footers, page numbers, page margins, watermarks, pagination, and comments. It is ideal for creating custom text editing experiences within web applications.
chronos-forecasting
chronos-forecasting is an open-source project by Amazon Science that provides a family of pretrained models for time series forecasting. It includes Chronos-2, which offers state-of-the-art zero-shot performance for univariate, multivariate, and covariate-informed forecasting, and Chronos-Bolt, a patch-based variant that is significantly faster and more memory-efficient. The original Chronos models are based on language model architectures and transform time series into tokens for probabilistic forecasting. The package can be installed via pip for inference and offers deployment options on AWS with Amazon SageMaker for production use. It also includes tools like fev for benchmarking time series forecasting models.
DarkPose
DarkPose is an open-source project that introduces Distribution-Aware Coordinate Representation of Keypoint (DARK), a method for human pose estimation. DARK acts as a model-agnostic plug-in designed to boost the performance of existing state-of-the-art human pose estimation models. It achieved 76.4 on the COCO test-challenge (the 2nd place entry of the COCO Keypoints Challenge at ICCV 2019), and the paper was accepted to CVPR 2020. The project provides detailed results on the COCO val2017, COCO test-dev2017, and MPII val datasets. DarkPose is particularly relevant for researchers and developers working on computer vision tasks that require precise human pose analysis.
cvzone
cvzone is a comprehensive computer vision package designed to streamline image processing and AI functionalities. Built upon the robust OpenCV and Mediapipe libraries, it offers an accessible platform for developers and enthusiasts to implement various computer vision tasks. The package includes modules for face detection, hand tracking, pose estimation, selfie segmentation, and color detection. It also provides utilities for image manipulation like rotating, stacking, and overlaying PNGs, along with functions for finding contours and calculating FPS. With straightforward installation via pip and numerous examples, cvzone makes it easy to integrate advanced computer vision capabilities into projects.
Video-XL
Video-XL is an open-source project offering a family of efficient vision-language models (VLMs) designed for understanding extremely long videos, capable of processing hour-scale content. The project includes models like Video-XL2 and Video-XL-Pro, which have achieved state-of-the-art results on various long video understanding benchmarks. Video-XL-Pro, for instance, can process up to 10,000 frames on a single 80 GB GPU with only 3 billion parameters. The project provides models, training code, and evaluation code, making it a useful resource for researchers and developers working with extensive video data. It builds on existing codebases such as LongVA and LMMs-Eval for development and evaluation.
Face-Recognition-Attendance-System
Face-Recognition-Attendance-System is an open-source project designed to automate attendance tracking using face detection and recognition. The system aims to reduce manual errors and provide a reliable method for recording attendance. Key features include checking camera feeds, capturing faces, training the system with new faces, recognizing individuals, and automatically recording attendance. It also offers automatic email notifications and screenshot capabilities. Built with Python 3.7, it uses OpenCV, Pillow, NumPy, Pandas, and yagmail together with the standard-library shutil and csv modules, relying on Haar cascades for face detection and the LBPH algorithm for face recognition. The project is suitable for developers looking to implement or learn about face recognition attendance systems.
elk
Elk is a tiny, embeddable JavaScript engine designed for microcontroller development and embedded systems. It implements a small but usable subset of ES6, enabling developers to add JavaScript customizations to firmware written primarily in C/C++. This approach allows device functionality to be extended without rewriting the core C/C++ code. Key features include cross-platform compatibility, zero dependencies, easy embedding by copying two files, and a small footprint of about 20 KB on flash/disk with minimal RAM usage. Elk operates without `malloc`, using only a caller-supplied memory buffer, and interprets JS code directly without compiling to bytecode, which keeps it tunable and minimal.
FastV
FastV is an open-source inference acceleration method specifically designed for large vision-language models (LVLMs). It operates as a plug-and-play solution, significantly reducing computational costs by pruning redundant visual tokens in the deeper layers of these models. This approach allows for a theoretical FLOPs reduction of up to 45% without compromising performance. FastV has been accepted to ECCV 2024 as an Oral Presentation, highlighting its innovative contribution to the field. The project provides code for setup, visualization of inefficient attention over visual tokens, and comprehensive evaluation scripts for latency and performance reproduction. It supports HuggingFace LLaVA models and is compatible with KV Cache for improved efficiency, particularly in video understanding tasks.
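The pruning idea can be illustrated with a minimal sketch: given an attention score for each token, keep every non-visual token and only the highest-scoring fraction of visual tokens. The function name, the `keep_ratio` parameter, and the toy scores below are hypothetical and are not taken from the FastV codebase.

```python
def prune_visual_tokens(attn_scores, visual_positions, keep_ratio=0.5):
    """Return the positions to keep, in their original order.

    attn_scores: one score per token (e.g. average attention received).
    visual_positions: indices of the tokens that come from the image.
    keep_ratio: fraction of visual tokens to retain (illustrative parameter).
    """
    visual = set(visual_positions)
    n_keep = max(1, int(len(visual) * keep_ratio))
    # Rank visual tokens by descending score; break ties by position.
    ranked = sorted(visual, key=lambda i: (-attn_scores[i], i))
    kept_visual = set(ranked[:n_keep])
    return [i for i in range(len(attn_scores))
            if i not in visual or i in kept_visual]

# Toy sequence: positions 1-4 are visual tokens, 0 and 5 are text tokens.
scores = [0.30, 0.05, 0.20, 0.01, 0.10, 0.34]
kept = prune_visual_tokens(scores, visual_positions=[1, 2, 3, 4])
```

In this sketch the later layers would then run only on the kept positions, which is where the compute savings come from.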
easyFL
easyFL, also known as FLGo, is an experimental and open-source platform designed for federated learning research. It offers a robust and reusable environment for conducting diverse federated learning experiments, featuring comprehensive and easy-to-use modules. Researchers can simulate real-world system heterogeneity, utilize over 50 benchmarks across various data types and communication topologies, and generate federated tasks with specific data distributions using flexible partitioners. The platform also includes implementations of more than 50 algorithms from top-tier conferences and journals, supporting flexible combinations of benchmarks, partitioners, algorithms, and simulators. It provides experimental tools for loading results and using checkpoints for training recovery.
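As a rough illustration of what a partitioner does, the sketch below splits sample indices across clients in two ways: an IID-style round-robin split and a label-skewed split in which each client sees mostly one class. Both functions are hypothetical helpers written for this example and are not FLGo's partitioner API.

```python
def iid_partition(num_samples, num_clients):
    """Deal sample indices to clients round-robin, so each client gets a similar mix."""
    return [list(range(i, num_samples, num_clients)) for i in range(num_clients)]

def label_skew_partition(labels, num_clients):
    """Sort indices by label and cut into contiguous chunks, so each client
    receives samples concentrated in a few classes."""
    order = sorted(range(len(labels)), key=lambda i: (labels[i], i))
    n = len(order)
    return [order[k * n // num_clients:(k + 1) * n // num_clients]
            for k in range(num_clients)]

labels = [0, 0, 0, 1, 1, 1]
iid_clients = iid_partition(len(labels), num_clients=2)
skewed_clients = label_skew_partition(labels, num_clients=2)
```

With these toy labels, the round-robin split gives each client both classes, while the label-skewed split gives each client a single class.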
MTEM Pruner
MTEM Pruner is a specialized tool designed to optimize multilingual text embedding models by reducing their size. It achieves this by allowing users to select a specific language, after which the tool prunes the model to retain only the tokens essential for that chosen language. This process helps in creating more efficient and lightweight models, which is particularly beneficial for deployment in resource-constrained environments or for applications where a focused language model is preferred. Hosted on Hugging Face Spaces, MTEM Pruner provides a straightforward interface for users to select their desired model and language, making advanced model optimization accessible.
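A minimal sketch of vocabulary pruning, assuming the model's vocabulary is a token-to-id mapping and its embedding matrix is a list of rows: keep only the tokens observed in a target-language corpus (plus special tokens) and re-index the embedding rows. The function name, the special-token list, and the toy data are illustrative assumptions, not the tool's actual implementation.

```python
def prune_vocabulary(vocab, embeddings, corpus_tokens,
                     special_tokens=("[PAD]", "[UNK]")):
    """Keep only tokens seen in the corpus (plus special tokens).

    vocab: dict mapping token -> row index in `embeddings`.
    embeddings: list of embedding rows, one per vocabulary entry.
    Returns a re-indexed vocab and the matching, smaller embedding list.
    """
    keep = set(corpus_tokens) | set(special_tokens)
    new_vocab, new_embeddings = {}, []
    # Walk the vocabulary in original id order so relative order is preserved.
    for token, old_id in sorted(vocab.items(), key=lambda kv: kv[1]):
        if token in keep:
            new_vocab[token] = len(new_embeddings)
            new_embeddings.append(embeddings[old_id])
    return new_vocab, new_embeddings

vocab = {"[PAD]": 0, "[UNK]": 1, "hello": 2, "bonjour": 3, "world": 4}
embeddings = [[0.0], [0.1], [0.2], [0.3], [0.4]]
small_vocab, small_embeddings = prune_vocabulary(
    vocab, embeddings, corpus_tokens=["hello", "world", "hello"])
```

Tokens dropped this way fall back to the unknown token at inference time, so in practice the corpus used to select tokens should be representative of the target language.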
fpn.pytorch
fpn.pytorch offers a pure PyTorch implementation of the Feature Pyramid Network (FPN) for object detection, built on top of a Faster R-CNN implementation. The project converts all NumPy implementations to PyTorch, giving a consistent environment. A key feature is its support for training with batch sizes greater than one, achieved by revising all relevant layers, including the dataloader, RPN, and ROI pooling. It also uses a multi-GPU wrapper (nn.DataParallel) to scale across one or more GPUs. The implementation integrates three pooling methods (ROI pooling, ROI align, and ROI crop), all adapted for multi-image batch training. It has been benchmarked on datasets such as PASCAL VOC and COCO.
gaussian_splatting_notes
Gaussian Splatting Notes is a free, open-source educational resource offering a comprehensive breakdown of the mathematical formulae behind Gaussian Splatting. This guide, presented as a text version of an explanatory stream, delves into the intricacies of the rasterization process, specifically covering the forward and backward passes. It aims to provide as many details as possible, highlighting core algorithmic concepts and referencing original code snippets to aid understanding. The resource also includes important insights marked with '💡' and clarifies complex topics like 3D covariance reparametrization and 2D Gaussian projection, making it an invaluable aid for those studying this advanced 3D rendering technique.
Object Detection Web
Object Detection Web is a free, web-based AI tool hosted on Hugging Face Spaces, developed by Xenova. It provides a straightforward way to perform object detection on images. Users can easily upload their own images or select from example images to see the application identify and label various objects present. This tool is particularly useful for individuals interested in learning about object detection technology, exploring its capabilities, or for simple task automation where identifying objects in images is required. Its accessible web interface makes it suitable for educational purposes and fun exploration without requiring any technical setup.
Presidio with custom PII models trained on PII data generated by Privy
Presidio with custom PII models is an open-source AI tool designed for the anonymization of personally identifiable information (PII). This tool leverages custom PII models that have been specifically trained on data generated by Privy, enhancing its ability to detect and redact sensitive information. Hosted on Hugging Face, it provides a platform for developers and data scientists to implement robust data privacy and security measures. While the current live website indicates a build error, the tool's core purpose is to facilitate the handling of sensitive data in a secure and compliant manner, making it valuable for various data processing and analysis tasks.
IsaacGymEnvs
IsaacGymEnvs is a collection of reinforcement learning environments specifically designed for the NVIDIA Isaac Gym platform. These environments are optimized for high-performance GPU-based physics simulation, as detailed in the NeurIPS 2021 Datasets and Benchmarks paper. The repository offers an easy-to-use API for creating vectorized environments, supporting various tasks like Ant locomotion, Cartpole, and AllegroHand manipulation. It includes features such as headless training, checkpoint loading, multi-GPU training, population-based training, and integration with Weights & Biases for experiment tracking. The framework also incorporates domain randomization to enhance sim-to-real transfer of trained policies, making it a powerful tool for advanced robot learning research and development.
Image-Adaptive-YOLO
Image-Adaptive-YOLO is an open-source implementation of an object detection model engineered to perform robustly in adverse weather conditions. Based on the research paper "Image-Adaptive YOLO for Object Detection in Adverse Weather Conditions" (AAAI 2022), it incorporates image-adaptive filtering techniques to improve detection accuracy in fog, low light, and other challenging visual environments. The project provides code for installation, dataset preparation (including PASCAL VOC, RTTS, ExDark, and custom foggy/dark datasets), and both training and evaluation scripts. It is built with Python and TensorFlow, making it accessible for researchers and developers working on computer vision tasks in difficult conditions.
mini_racer
MiniRacer provides a minimal, modern embedded V8 JavaScript engine for Ruby, serving as an alternative to the no-longer-maintained therubyracer. It offers a simple two-way bridge, allowing Ruby applications to execute JavaScript snippets in a shared context. Key features include the ability to attach global Ruby functions to JavaScript contexts, return binary data as Uint8Array, and support for GIL-free JavaScript execution, enabling parallel script processing. It also includes timeout and memory softlimit support, rich debugging with file names in stack traces, and fork safety for web servers. Contexts can be thread-safe and created with pre-loaded snapshots for efficiency, which can also be persisted to disk. Users can control memory usage and set V8 runtime flags for experimental features or performance tuning.
morphsnakes
morphsnakes is an open-source Python library providing an implementation of Morphological Snakes for image segmentation and tracking. This tool is designed for both 2D images and 3D volumes, offering a robust alternative to traditional active contour methods like Geodesic Active Contours or Active Contours without Edges. Unlike these traditional approaches that rely on solving PDEs over floating-point arrays, morphsnakes utilizes morphological operators such as dilation and erosion on binary arrays, leading to faster execution and improved numerical stability. The library includes two main methods: Morphological Geodesic Active Contours (MorphGAC) for images with visible contours requiring preprocessing, and Morphological Active Contours without Edges (MorphACWE) which is more robust to noise and suitable when pixel values of inside and outside regions differ significantly. Installation is straightforward via pip or by directly copying the `morphsnakes.py` file.
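To show what "morphological operators on binary arrays" means in practice, here is a minimal pure-Python sketch of binary dilation and erosion with a 4-connected structuring element. It illustrates only the two primitive operators, not the MorphGAC or MorphACWE update rules, and none of these helper names come from the morphsnakes library.

```python
# 4-connected structuring element (a 3x3 cross), including the pixel itself.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def _is_on(mask, r, c):
    """Read a pixel, treating everything outside the image as off."""
    return 0 <= r < len(mask) and 0 <= c < len(mask[0]) and mask[r][c] == 1

def dilate(mask):
    """A pixel turns on if it or any 4-neighbour is on (the region grows)."""
    return [[int(any(_is_on(mask, r + dr, c + dc) for dr, dc in CROSS))
             for c in range(len(mask[0]))] for r in range(len(mask))]

def erode(mask):
    """A pixel stays on only if it and all 4-neighbours are on (the region shrinks)."""
    return [[int(all(_is_on(mask, r + dr, c + dc) for dr, dc in CROSS))
             for c in range(len(mask[0]))] for r in range(len(mask))]

# A 5x5 binary level set with a single seed pixel in the centre.
mask = [[0] * 5 for _ in range(5)]
mask[2][2] = 1
grown = dilate(mask)    # the seed becomes a 5-pixel cross
shrunk = erode(grown)   # eroding the cross recovers the single seed
```

Because each step only flips integer pixels in a binary array, there is no step size to tune and no floating-point PDE solver that can become unstable.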
nerfstudio
nerfstudio is an open-source, collaboration-friendly studio for creating, training, and testing Neural Radiance Fields (NeRFs). It provides a simple API that streamlines the end-to-end process of NeRF development, from data capture to rendering. The library implements NeRFs modularly, making each component more interpretable and easier to build upon. Developed by Berkeley students and community contributors, nerfstudio aims to foster a community where users can easily contribute and explore NeRF technology. It includes a web-based visualizer for interacting with training in real time, support for multiple logging interfaces such as TensorBoard and Weights & Biases, and full pipeline support for processing data from various devices, including phones with LiDAR. The project emphasizes learning resources, tutorials, and documentation to help users get started and advance their understanding of NeRFs.