Coding & Development
AI tools for Coding & Development, sorted by confidence score (our independent quality rating).
describe-anything
Describe Anything (DAM, the Describe Anything Model) is an open-source project from NVlabs, UC Berkeley, and UCSF that implements detailed localized captioning for images and videos. Users mark a region of an image or video with points, boxes, scribbles, or masks, and the model outputs a detailed textual description of that region. For videos, annotating a single frame is sufficient. The project also introduces DLC-Bench, a benchmark for evaluating models on the detailed localized captioning task. The repository offers several installation methods, interactive demos, and command-line examples for both image and video processing, including integration with SAM for automated mask generation. An OpenAI-compatible API is also available.
DPIR
DPIR (Deep Plug-and-Play Image Restoration) is an open-source project implemented in PyTorch, focusing on advanced image restoration techniques. It leverages a deep denoiser prior within a model-based framework to address various inverse problems in image processing. The tool excels in tasks such as deblurring, super-resolution, denoising, and demosaicing, offering performance that often surpasses state-of-the-art model-based methods and competes with learning-based approaches. DPIR is particularly notable for its DRUNet denoiser, which demonstrates robust performance even on extremely high, unseen noise levels, making it a powerful solution for challenging image restoration scenarios.
dqlite
dqlite is an open-source C library that provides an embeddable, replicated, and fault-tolerant SQL database engine. It extends SQLite with a network protocol so that multiple instances of an application can form a highly available cluster without relying on an external database. Design highlights include an asynchronous single-threaded implementation built on libuv, a custom wire protocol optimized for SQLite data types, and data replication based on the Raft algorithm. dqlite runs on Linux kernels that support native async I/O and is released under a modified LGPLv3 license that permits static linking. It suits applications that need a self-contained, highly available SQL store.
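Because replication is based on Raft, a change is committed only once a majority of the cluster has accepted it. The sketch below shows that majority arithmetic in plain Python. It illustrates the general Raft rule and is not part of the dqlite API; the function names are hypothetical.

```python
def raft_quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority of the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can be lost while a majority is still reachable."""
    return cluster_size - raft_quorum(cluster_size)
```

Under this rule a three-node cluster needs two nodes to agree and tolerates one failure. Growing the cluster to four nodes raises the quorum to three without tolerating any additional failures, which is why odd cluster sizes are the usual choice.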
nnDetection
nnDetection is a self-configuring framework for 3D (volumetric) medical object detection that addresses the cumbersome method configuration typical of medical image analysis. Following the success of nnU-Net for image segmentation, nnDetection systematizes and automates the configuration process, allowing it to adapt to arbitrary medical detection problems without manual intervention. It achieves results on par with or better than state-of-the-art methods. The framework includes guides for the 12 datasets used in its development and evaluation, such as ADAM and LUNA16, and supports new datasets through a standardized input format. It requires Python 3.8+ and PyTorch, and can be deployed with Docker.
mvpose
mvpose is an open-source project providing code for fast and robust multi-person 3D pose estimation from multiple views. Developed by zju3dv, it is based on research published at CVPR 2019 and in T-PAMI 2021. The repository covers setting up a Python environment, compiling the required backend libraries, and preparing models and datasets. It supports datasets such as Shelf and CampusSeq1, with detailed instructions for generating camera parameters. Users can run demos and evaluate performance on these datasets, and can speed up evaluation by saving predicted 2D poses and heatmaps. The project builds on components from Light-Head R-CNN, Cascaded Pyramid Network, and CamStyle.
nerf-pytorch
nerf-pytorch is a faithful PyTorch implementation of Neural Radiance Fields (NeRF), a method renowned for achieving state-of-the-art results in synthesizing novel views of complex scenes. This open-source project successfully reproduces the original NeRF results while offering a performance improvement, running 1.3 times faster than the authors' initial TensorFlow implementation. It provides a robust framework for researchers and developers to experiment with NeRF, including tools for downloading example datasets, training models, and rendering new views. The repository also includes pre-trained models for various scenes, facilitating reproducibility and quick experimentation. It is designed for those familiar with Python and PyTorch, offering a direct path to leveraging NeRF technology.
nerfacc
nerfacc is a PyTorch-based acceleration toolbox specifically designed for Neural Radiance Fields (NeRFs), optimizing both training and inference processes. It emphasizes efficient volumetric sampling using computationally cheap estimators to discover surfaces, making it universal and plug-and-play for most NeRF models. Users can integrate nerfacc with minimal code modifications by defining `sigma_fn` for density computation and `rgb_sigma_fn` for color and density, enabling significant speedups. The toolbox supports various NeRF papers and offers a pure Python interface with flexible APIs. Installation is straightforward via PyPI or source, with pre-built wheels available for major PyTorch and CUDA combinations.
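To show why cheap density estimates help, the sketch below implements the standard NeRF volume rendering quadrature in plain Python. It does not call nerfacc; the function names `render_weights` and `composite` are hypothetical. A sample with zero density receives zero weight, so a sampler that can identify empty space in advance loses nothing by skipping it.

```python
import math
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]


def render_weights(sigmas: Sequence[float], deltas: Sequence[float]) -> List[float]:
    """Per-sample weights along one ray.

    w_i = T_i * (1 - exp(-sigma_i * delta_i)), where T_i is the
    transmittance accumulated over all samples before sample i.
    """
    if len(sigmas) != len(deltas):
        raise ValueError("sigmas and deltas must have the same length")
    weights = []
    transmittance = 1.0
    for sigma, delta in zip(sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha  # light remaining after this segment
    return weights


def composite(weights: Sequence[float], colors: Sequence[Color]) -> Color:
    """Blend per-sample colors into one pixel color using the weights."""
    r = sum(w * c[0] for w, c in zip(weights, colors))
    g = sum(w * c[1] for w, c in zip(weights, colors))
    b = sum(w * c[2] for w, c in zip(weights, colors))
    return (r, g, b)
```

In a nerfacc-style setup, the densities passed to `render_weights` would come from a user-defined `sigma_fn`, and the colors passed to `composite` from `rgb_sigma_fn`.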
energy
Energy is a robust GUI framework developed in Go, leveraging LCL and CEF (Chromium Embedded Framework) to facilitate the creation of cross-platform desktop applications. It supports Windows, macOS, and Linux, allowing developers to build native applications using familiar web technologies like HTML, CSS, and JavaScript. The framework offers a rich CEF API and LCL system native widgets, ensuring a simple development environment with fast compilation speeds. Developers can integrate mainstream front-end frameworks such as Vue, React, or Angular. Energy also features high-performance event-driven communication between Go and Web components via IPC, and flexible resource loading from local files or embedded resources.
NTIRE2017
NTIRE2017 is an open-source project offering a Torch implementation of "Enhanced Deep Residual Networks for Single Image Super-Resolution." Developed by Team SNU_CVLab, it was recognized with the Best Paper Award at the CVPR 2017 Workshop (2nd NTIRE). The repository includes detailed model architectures (EDSR, MDSR), NTIRE2017 Super-resolution Challenge results, and demo and training code. Users can access trained models, information on datasets like DIV2K and Flickr2K, and super-resolution examples. The code is based on Facebook's Torch implementation of ResNet and also provides a PyTorch version for some models. It's designed for researchers and developers working on image restoration and enhancement, particularly in the field of single image super-resolution.
efficientdet
efficientdet is a PyTorch implementation of the EfficientDet object detection model, developed by Signatrix GmbH. This open-source tool provides scalable and efficient object detection capabilities, making it suitable for various computer vision tasks. It includes pre-trained weights, allowing users to get started quickly without extensive training. The repository offers scripts for training models, evaluating mean average precision (mAP) on datasets like COCO, and testing models on both datasets and video inputs. It supports Python 3.6 and PyTorch 1.2, along with other common libraries like OpenCV and TensorBoard. The implementation borrows concepts from RetinaNet, providing a robust framework for object detection research and application.
Epiclips
Epiclips is a free, open-source AI video clipping tool that turns long videos into short-form clips. It runs entirely in the browser and processes videos locally, which protects user privacy and removes the need for cloud-based services. The tool uses WebGPU to process videos efficiently, and it requires no subscription or payment.
FCOS
FCOS (Fully Convolutional One-Stage Object Detection) is an open-source project that provides an implementation of the FCOS algorithm for object detection. This tool is designed to completely avoid the complex computations and hyper-parameters associated with anchor boxes, offering a simpler and more efficient approach. It achieves better performance than Faster R-CNN, with significantly faster training and inference times. FCOS supports various backbones including ResNet, ResNeXt, and MobileNet, and offers models with state-of-the-art performance, reaching up to 49.0% AP on COCO test-dev. The project includes detailed instructions for installation, testing, and training, making it suitable for researchers and developers working on computer vision applications.
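As a rough illustration of what replaces anchor boxes, the sketch below computes the per-location regression target described in the FCOS paper: the distances from a location to the four sides of a ground-truth box, plus the center-ness score. It is plain Python, not the repository's code. The function names are hypothetical, treating points on the box boundary as negatives is an assumption, and the per-level scale assignment used with feature pyramids is omitted.

```python
import math
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)
Distances = Tuple[float, float, float, float]  # (left, top, right, bottom)


def fcos_targets(x: float, y: float, box: Box) -> Optional[Distances]:
    """Distances from location (x, y) to the box sides, or None if outside."""
    x0, y0, x1, y1 = box
    left, top, right, bottom = x - x0, y - y0, x1 - x, y1 - y
    if min(left, top, right, bottom) <= 0:
        return None  # location is not strictly inside the box
    return (left, top, right, bottom)


def centerness(left: float, top: float, right: float, bottom: float) -> float:
    """Center-ness score: 1.0 at the box center, approaching 0 near the edges."""
    horizontal = min(left, right) / max(left, right)
    vertical = min(top, bottom) / max(top, bottom)
    return math.sqrt(horizontal * vertical)
```

A location at the center of a box has equal distances to opposite sides and a center-ness of 1.0, while off-center locations score lower and can be down-weighted at inference time.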
FAST-LIVO2
FAST-LIVO2 is an efficient and accurate open-source LiDAR-inertial-visual fusion localization and mapping system. It is designed for real-time 3D reconstruction and onboard robotic localization, particularly in severely degraded environments. The system integrates data from LiDAR, inertial measurement units, and visual sensors to provide robust odometry. Key features include its direct fusion approach, support for resource-constrained platforms, and an associated dataset for evaluation. The project also provides resources for building a hard-synchronized handheld device, including CAD files and source code, making it a comprehensive solution for developers working on autonomous navigation and robotics.
FaceRecognitionDotNet
FaceRecognitionDotNet is a .NET port of the popular `face_recognition` Python library, offering a straightforward API for facial recognition tasks. It is cross-platform and supports Windows, macOS, and Linux. Core functionality includes face detection, comparison, encoding, and landmark identification. Beyond basic recognition, it also supports age, gender, emotion, and head pose prediction, as well as eye blink detection, all of which developers can integrate into their .NET applications. To avoid licensing issues, the project does not distribute pre-trained models for these additional predictors, so users must train their own.
f2-nerf
f2-nerf is an open-source project for fast neural radiance field (NeRF) training, optimized for scenes captured with free camera trajectories. Built primarily on LibTorch, it supports 3D scene reconstruction and novel view synthesis. Users can train F2-NeRF on custom data, with camera poses generated from images processed by COLMAP or hloc. It also includes scripts for rendering test images and for creating render paths by interpolating the input camera poses. The project relies on tiny-cuda-nn for fast MLP training, happly for PLY I/O, and Eigen for linear algebra.
FSGS
FSGS, short for "Real-Time Few-Shot View Synthesis using Gaussian Splatting," is a method presented at ECCV 2024. It generates new views of a scene from a small number of input images, using Gaussian Splatting for real-time performance. The repository provides environment setup instructions based on Conda and CUDA 11.7. Users prepare data by reconstructing sparse-view inputs with SfM and dense stereo matching in COLMAP, and datasets such as LLFF and MipNeRF-360 are supported. FSGS includes instructions for training models with varying view counts, rendering images, and evaluating model performance, making it useful for researchers and developers in computer vision and graphics.
pointnerf
pointnerf is an open-source implementation of Point-NeRF, a method for modeling radiance fields using neural 3D point clouds with associated neural features. This tool enables efficient rendering by aggregating neural point features near scene surfaces through a ray marching-based pipeline. A key differentiator is its ability to be initialized via direct inference of a pre-trained deep network to produce a neural point cloud, which can then be finetuned for visual quality surpassing NeRF with significantly faster training times. pointnerf also integrates with other 3D reconstruction methods and manages errors and outliers through a novel pruning and growing mechanism, making it suitable for various research applications in computer vision and graphics.
rp-hal
rp-hal offers a comprehensive Rust Embedded-HAL solution for the Raspberry Pi RP2040 and RP235x series microcontrollers. This repository provides high-level drivers for the internal peripherals of these MCUs, such as SPI, I²C, and UART controllers, facilitating the development of embedded applications in Rust. It includes specific HALs for both RP2040 and RP235x, along with common shared code. Developers can find numerous examples for functionalities like GPIO control, I²C communication, SPI, UART, PWM, PIO, and RTC. The project also supports generating picotool-compatible metadata for Rust binaries and provides guidance on programming with various targets and loading methods like USB with picotool or SWD with probe-rs. It's an active open-source project with a clear roadmap for future development.
simple-HRNet
simple-HRNet is an unofficial yet fully compatible PyTorch implementation of the paper "Deep High-Resolution Representation Learning for Human Pose Estimation." It is compatible with the official pre-trained weights and delivers results consistent with the original implementation. It supports both Windows and Linux and includes multi-GPU inference, options for retrieving YOLO bounding boxes and HRNet heatmaps, and multi-person support with YOLOv3, YOLOv3-tiny, or YOLOv5. The repository also provides a live demo, scripts for training and testing on datasets like COCO, and support for TensorRT.
java-html-sanitizer
The OWASP Java HTML Sanitizer is an open-source library designed to protect web applications from Cross-Site Scripting (XSS) attacks by sanitizing untrusted HTML content. Written in Java, it offers a fast and easily configurable solution for developers to safely embed third-party HTML. The tool provides prepackaged policies for common sanitization needs, as well as the flexibility to craft custom policies for specific requirements, such as transforming elements or adding attributes. It also includes a preprocessor feature for structural changes before policy application and telemetry to track discarded elements or attributes, aiding in security monitoring. The project emphasizes security best practices, an extensive test suite, and adversarial security review.
lemon
Lemon is an open-source, embeddable, and lightweight programming language. It ships with the complete source code for its compiler and virtual machine, so developers can study and modify its core. The source code for the core Lemon library is also provided, which simplifies custom development and porting to other environments. Lemon can be built on Windows with TDM-GCC, supports dynamic or static linking, and includes built-in libraries for POSIX OS and BSD socket functionality.
lightweight-human-pose-estimation-3d-demo.pytorch
This repository offers a real-time 3D multi-person pose estimation demo built with PyTorch. It is based on the Lightweight OpenPose and Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB papers and detects the 2D and 3D coordinates of up to 18 types of keypoints: ears, eyes, nose, neck, shoulders, elbows, wrists, hips, knees, and ankles. The model was trained on the MS COCO and CMU Panoptic datasets and achieves 100 mm MPJPE (mean per-joint position error) on the CMU Panoptic subset. For faster inference, it supports Intel OpenVINO on CPUs and NVIDIA TensorRT on Jetson devices.
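For readers unfamiliar with the metric, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth joint positions. The sketch below computes it for one pose in plain Python. The function name is hypothetical, and any alignment step that a particular evaluation protocol applies beforehand (such as root-joint alignment) is omitted.

```python
import math
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]


def mpjpe(predicted: Sequence[Point3D], ground_truth: Sequence[Point3D]) -> float:
    """Mean Euclidean distance between matching joints, in the input units."""
    if len(predicted) != len(ground_truth) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)
```

With joint coordinates given in millimetres, the result is directly comparable to the 100 mm figure quoted above.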
lms
lms is a command-line tool designed to interact with LM Studio, a local AI model environment. It ships with LM Studio versions 0.2.22 and newer, offering a robust interface for developers to manage their AI models and server operations. Key functionalities include checking LM Studio status, starting and stopping the local API server, listing downloaded and loaded models, and managing model loading and unloading. The tool also supports creating new projects with the LM Studio SDK and streaming logs, making it an essential utility for scripting and automating tasks within the LM Studio ecosystem. It provides options for machine-readable JSON output for programmatic use.
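A scripting use of the JSON output might look like the following sketch. It assumes `lms` is on the `PATH` and that the `ls` and `ps` subcommands accept `--json`. The helper names are hypothetical, and the shape of the returned JSON is not assumed: the parsed value is returned as-is.

```python
import json
import subprocess
from typing import Any, Callable, List

Runner = Callable[[List[str]], str]


def run_lms(args: List[str]) -> str:
    """Run the `lms` CLI with the given arguments and return its stdout."""
    result = subprocess.run(
        ["lms", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def list_models(loaded_only: bool = False, runner: Runner = run_lms) -> Any:
    """Parse `lms ps --json` (loaded models) or `lms ls --json` (downloaded models)."""
    subcommand = "ps" if loaded_only else "ls"
    return json.loads(runner([subcommand, "--json"]))
```

The `runner` parameter exists so the helper can be exercised without LM Studio installed, for example by passing a function that returns a canned JSON string.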
lv_port_pc_visual_studio
lv_port_pc_visual_studio offers pre-configured Visual Studio projects specifically designed for the LVGL embedded graphics library. This tool is ideal for traditional Visual Studio users working on Windows, providing MSBuild-based project support. It allows developers to try LVGL on a Windows PC with minimal setup, relying only on Win32 API, C Runtime, and C++ STL. The project supports x86, x64, and ARM64 Windows, including features like LVGL pointer, keypad, encoder device integration, Windows touch input, and per-monitor DPI awareness. It also offers both Simulator Mode for UI layout testing and Application Mode for desktop application development, making it a versatile tool for embedded GUI development on Windows.