Shypd.ai
Coding & Development

Browsing page 200 of AI tools for Open Source & Models in Coding & Development. Sorted by confidence score — our independent quality rating.

FaceRecognitionDotNet

54%

FaceRecognitionDotNet is a .NET port of the popular `face_recognition` Python library, offering a straightforward API for facial recognition tasks. This cross-platform solution supports Windows, macOS, and Linux environments, making it versatile for various development needs. Key functionalities include face detection, comparison, encoding, and landmark identification. Beyond basic recognition, it also provides capabilities for age, gender, emotion, and head pose prediction, as well as eye blink detection. Developers can integrate these features into their .NET applications, leveraging the power of facial analysis for diverse use cases. Users must train their own models for the advanced prediction features: the project avoids licensing issues by not providing pre-trained models.

f2-nerf

54%

f2-nerf is an open-source project designed for fast neural radiance field (NeRF) training, specifically optimized for scenarios involving free camera trajectories. Built primarily on LibTorch, this tool provides a robust framework for efficient 3D scene reconstruction and novel view synthesis. Users can train F2-NeRF on custom data, including images processed with COLMAP or hloc, and generate camera poses. It also includes scripts for rendering test images and creating render paths by interpolating input camera poses. The project leverages several powerful libraries such as tiny-cuda-nn for fast MLP training, happly for PLY I/O, and eigen for linear algebra, making it a comprehensive solution for advanced NeRF applications.
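Creating a render path by interpolating input camera poses can be sketched as follows. This is a minimal illustration, not f2-nerf's own script: it assumes each pose is a position plus a unit quaternion in `(w, x, y, z)` order, and the helper names `slerp` and `interpolate_pose` are hypothetical. Positions are blended linearly and rotations with spherical linear interpolation.

```python
import math


def slerp(q0, q1, t):
    """Spherical linear interpolation between two unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip one to follow the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    if dot > 0.9995:
        # Nearly identical rotations: fall back to linear blending.
        out = tuple(a + t * (b - a) for a, b in zip(q0, q1))
    else:
        theta = math.acos(dot)
        s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
        s1 = math.sin(t * theta) / math.sin(theta)
        out = tuple(s0 * a + s1 * b for a, b in zip(q0, q1))
    norm = math.sqrt(sum(c * c for c in out))
    return tuple(c / norm for c in out)


def interpolate_pose(pose_a, pose_b, t):
    """Blend two (position, quaternion) poses; t=0 gives pose_a, t=1 gives pose_b."""
    pos_a, rot_a = pose_a
    pos_b, rot_b = pose_b
    position = tuple(a + t * (b - a) for a, b in zip(pos_a, pos_b))
    return position, slerp(rot_a, rot_b, t)


# Example: halfway between an identity pose and a pose rotated 90 degrees about z.
start = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
end = ((2.0, 0.0, 0.0), (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)))
mid_position, mid_rotation = interpolate_pose(start, end, 0.5)
```

Sampling `t` at evenly spaced values between consecutive input poses would give a smooth path of intermediate cameras under these assumptions.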

FSGS

54%

FSGS, short for "Real-Time Few-Shot View Synthesis using Gaussian Splatting," is an AI tool presented at ECCV 2024. It specializes in generating new views of a scene from a minimal number of input images, leveraging Gaussian Splatting technology for real-time performance. The repository documents environment setup, including Conda package management and CUDA 11.7 support. Users can prepare data by reconstructing sparse view inputs using SfM and dense stereo matching with COLMAP, supporting datasets like LLFF and MipNeRF-360. FSGS offers clear instructions for training models with varying view counts, rendering images, and evaluating model performance, making it a valuable resource for researchers and developers in computer vision and graphics.

pointnerf

54%

pointnerf is an open-source implementation of Point-NeRF, a method for modeling radiance fields using neural 3D point clouds with associated neural features. This tool enables efficient rendering by aggregating neural point features near scene surfaces through a ray marching-based pipeline. A key differentiator is its ability to be initialized via direct inference of a pre-trained deep network to produce a neural point cloud, which can then be finetuned for visual quality surpassing NeRF with significantly faster training times. pointnerf also integrates with other 3D reconstruction methods and manages errors and outliers through a novel pruning and growing mechanism, making it suitable for various research applications in computer vision and graphics.

simple-HRNet

54%

simple-HRNet is an unofficial yet fully compatible implementation of the Deep High-Resolution Representation Learning for Human Pose Estimation paper, built with PyTorch. This tool simplifies the process of human pose estimation, offering compatibility with official pre-trained weights and delivering results consistent with the original implementation. It supports both Windows and Linux environments and includes features like multi-GPU inference, options for retrieving YOLO bounding boxes and HRNet heatmaps, and multi-person support with YOLOv3, YOLOv3-tiny, or YOLOv5. The repository also provides a live demo, scripts for training and testing on datasets like COCO, and support for TensorRT, making it a versatile solution for developers and researchers in computer vision.

YoloDotNet

54%

YoloDotNet is a modular, lightweight C# library built on .NET 8, ONNX Runtime, and SkiaSharp, designed for real-time computer vision and YOLO-based inference. It offers high-performance inference for modern YOLO model families (YOLOv5u through YOLOv26, YOLO-World, YOLO-E, and RT-DETR) without relying on heavy computer vision frameworks like OpenCV or Python runtimes. Developers gain explicit control over execution, memory, and preprocessing, making it ideal for production-ready desktop apps, backend services, and real-time vision pipelines requiring deterministic behavior. It supports various vision tasks including classification, object detection, OBB detection, segmentation, and pose estimation, with flexible execution providers for CPU, CUDA/TensorRT, OpenVINO, CoreML, and DirectML.

YOLOv3

54%

YOLOv3 is an open-source Keras implementation of the YOLOv3 object detection algorithm, designed for identifying objects within images and videos. This tool requires specific dependencies including OpenCV 3.4, Python 3.6, TensorFlow-gpu 1.5.0, and Keras 2.1.3. Users can quickly get started by downloading official YOLOv3 weights and converting them to a Keras H5 file using the provided `yad2k.py` script. The tool demonstrates improved classification capabilities over its predecessor, YOLOv2. While it currently supports object detection, future development plans include training the model for broader applications. It is a valuable resource for developers and data scientists working on computer vision tasks.

animatable_nerf

54%

Animatable_nerf is an open-source research tool that provides the implementation for "Animatable Implicit Neural Representations for Creating Realistic Avatars from Videos," a paper accepted to TPAMI 2024 and ICCV 2021. This tool allows researchers to generate realistic avatars from video footage by leveraging animatable neural fields. It supports various configurations, including vanilla Animatable NeRF, versions with neural blend weight fields replaced by displacement fields, and versions where the canonical NeRF model is replaced with a neural surface field (SDF output). The repository includes evaluation frameworks for reconstruction quality comparison and provides access to datasets like Mobile-Stage and SyntheticHuman++ for further research and development in neural rendering and 3D human body modeling.

SoftTeacher

54%

SoftTeacher is an open-source project providing the official implementation of the ICCV 2021 paper "End-to-End Semi-Supervised Object Detection with Soft Teacher." This tool enables the training of object detection models using a combination of labeled and unlabeled data, significantly improving model accuracy, especially in scenarios with limited labeled data. It offers configurations for various labeled data percentages (1%, 5%, 10%, and full labeled data) and includes pre-trained model weights. The repository provides detailed instructions for installation, data preparation, training, evaluation, and inference, making it a valuable resource for researchers and developers in computer vision.

SRGAN

54%

SRGAN is a PyTorch implementation of the Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network paper from CVPR 2017. This open-source tool allows users to perform super-resolution on both images and videos, significantly enhancing their quality and detail. It provides options for various upscale factors (2x, 4x, 8x) and includes benchmarks for performance on different datasets. Users can train their own models, test on benchmark datasets, or apply super-resolution to single images and videos using pre-trained models. The project is hosted on GitHub and requires Anaconda, PyTorch, and OpenCV for setup.

SRGAN-tensorflow

54%

SRGAN-tensorflow offers a TensorFlow implementation of the SRGAN algorithm, designed for single image super-resolution. This project is based on the paper "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network." It allows users to upscale images, achieving results comparable to those presented in the original research paper, even with limited resources. The tool supports both testing with pre-trained models and training new models on custom datasets like RAISE. It provides scripts for running inference, testing, and training SRResNet and SRGAN models with different perceptual losses (MSE and VGG54). The code is highly inspired by pix2pix-tensorflow and includes detailed instructions for setting up dependencies and executing various modes.

Tensorflow_Object_Tracking_Video

54%

Tensorflow_Object_Tracking_Video is an open-source project developed for object tracking in videos, encompassing localization, detection, and classification. Originally created for the ImageNet VID competition, it is built on TensorFlow. The project integrates popular object detection systems like YOLO (You Only Look Once) and TensorBox, along with Inception for classification. It features a modular architecture that includes a general object detector, a tracker, and a smoother. The repository provides scripts for both YOLO and VID TENSORBOX usage, allowing users to process videos, set parameters, and obtain real-time object tracking results. It also includes dataset scripts for preparing and processing data for training, particularly for the VID classes, and offers pre-trained weights for Inception and TensorBox.

TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10

54%

This GitHub repository offers a comprehensive tutorial for training a TensorFlow object detection classifier to detect multiple objects on Windows 10, 8, or 7. It covers the entire process, from installing necessary software like Anaconda, CUDA, and cuDNN, to setting up the TensorFlow Object Detection API directory structure. The tutorial details how to gather and label pictures, generate training data, create a label map, configure and train the model, and finally, export and test the inference graph. It also includes Python scripts for testing the classifier on images, videos, or webcam feeds, and provides files for training a "Pinochle Deck" playing card detector as an example.

visual_anagrams

54%

visual_anagrams is an open-source tool specifically designed for generating multi-view optical illusions. It leverages advanced diffusion models to create these unique visual effects. The tool offers readily available code, making it accessible for hands-on experimentation. It also includes Colab notebooks, catering to both free and Pro tier users, to facilitate the creation of visual anagrams and exploration of factorized diffusion techniques. This makes it a valuable resource for those interested in the intersection of AI and visual art.

voicetree

54%

Voicetree is an open-source spatial Integrated Development Environment (IDE) specifically built for orchestrating multiple AI agents. It features an interactive graph-view interface, enabling users to work directly within a visual representation of their AI agent ecosystem. Within this environment, nodes can serve various purposes, including representing markdown notes or acting as terminal-based AI agents such as Claude Code and Gemini. A key capability of Voicetree is that agents can spawn sub-agents and access nearby nodes to gather context, facilitating complex AI workflows and interactions.

yolov5_obb

54%

yolov5_obb is an open-source project that extends the popular YOLOv5 framework for oriented object detection. It integrates Circular Smooth Label (CSL) to accurately detect objects with arbitrary rotations, making it highly suitable for specialized computer vision tasks. The repository provides pre-trained models and detailed results on DOTA datasets, including mAP scores for various versions and speed benchmarks on different hardware. Users can reproduce examples for validation and testing, and the project includes comprehensive documentation for installation and getting started. It's a valuable resource for researchers and developers working on rotation detection in aerial imagery and similar domains.
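The idea behind Circular Smooth Label can be illustrated with a short sketch: the angle is treated as a classification target over discrete bins, and the one-hot label is smoothed with a window that wraps around at the ends so that neighboring angles (for example bin 179 and bin 0) are treated as close. The version below is a sketch under assumptions, not the repository's code: it assumes 180 one-degree bins and a Gaussian window, and the parameter names `radius` and `sigma` are hypothetical.

```python
import math


def circular_smooth_label(angle_bin, num_bins=180, radius=6, sigma=2.0):
    """Return a smoothed, periodic label vector centered on angle_bin.

    angle_bin: integer bin index in [0, num_bins).
    radius:    window half-width in bins; bins farther away get 0 (assumed value).
    sigma:     standard deviation of the Gaussian window (assumed value).
    """
    label = []
    for k in range(num_bins):
        diff = abs(k - angle_bin)
        # Circular distance: going past the last bin wraps back to bin 0.
        dist = min(diff, num_bins - diff)
        if dist <= radius:
            label.append(math.exp(-(dist ** 2) / (2.0 * sigma ** 2)))
        else:
            label.append(0.0)
    return label


# Example: a label centered on bin 0 also gives weight to bins 179, 178, ...
csl = circular_smooth_label(0)
```

Because the window is periodic, a prediction of 179 degrees for a target of 0 degrees is penalized only slightly, which is the boundary behavior that plain angle regression tends to handle poorly.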

lsp-ai

54%

LSP-AI is an open-source language server designed to bring AI capabilities directly into code editors. It provides functionalities such as in-editor chatting with Large Language Models (LLMs), allowing developers to interact with AI without leaving their coding environment. Additionally, LSP-AI offers intelligent code completions to streamline the coding process and enhance productivity. The tool is built to empower software engineers by integrating advanced AI assistance seamlessly into their workflow, and it is compatible with any code editor that supports the Language Server Protocol (LSP).

MassGen

54%

MassGen is an open-source, terminal-based multi-agent scaling system. It is designed to autonomously orchestrate advanced AI models and agents, enabling them to work together effectively. The system facilitates collaboration and reasoning among these AI entities to tackle complex problems and generate high-quality outcomes. By coordinating AI workflows, MassGen aims to enhance problem-solving capabilities through a scalable and integrated approach.

MVSGaussian

54%

MVSGaussian is an open-source project designed for efficient 3D reconstruction using Gaussian Splatting from multi-view stereo (MVS) data. This tool can reconstruct unseen scenes from sparse views in a single forward pass, providing high-quality initialization for rapid training and real-time rendering. It leverages MVS to encode geometry-aware Gaussian representations and decodes them into Gaussian parameters. MVSGaussian also features a hybrid Gaussian rendering approach for novel view synthesis and a multi-view geometric consistent aggregation strategy to effectively initialize per-scene optimization. Compared to NeRF-based methods, MVSGaussian achieves superior view synthesis quality with reduced training computational costs and real-time rendering speeds, making it valuable for computer vision research and 3D modeling applications.

mmf

54%

mmf is a modular framework developed by Facebook AI Research (FAIR) for conducting vision and language multimodal research. It offers reference implementations of state-of-the-art vision and language models, making it a valuable resource for researchers. The framework is built on PyTorch, supports distributed training, and is designed to be un-opinionated, scalable, and fast. mmf can be used to bootstrap new vision and language multimodal research projects and serves as a starter codebase for challenges involving vision and language datasets, such as The Hateful Memes, TextVQA, TextCaps, and VQA challenges. It was formerly known as Pythia.

BioMedIA

54%

BioMedIA is an AI tool hosted on Hugging Face Spaces, designed to facilitate the exploration of AI applications within the biomedical field. While the live website indicates a build error, its intended purpose is to serve as a platform for understanding how AI can be applied in biomedical research and educational contexts. The tool is available for free, making it accessible for a wide range of users interested in the intersection of AI and biomedicine. It is suitable for researchers, students, and healthcare professionals who wish to delve into the capabilities and potential of AI in this specialized domain.

motpy

54%

motpy is a Python library designed for multi-object tracking using the tracking-by-detection paradigm. It offers a straightforward yet robust baseline for developers to implement object tracking without needing to build the entire algorithmic stack from scratch. Key features include IOU and optional feature similarity matching, Kalman filters for modeling object trackers, and configurable system orders for object position and size. The library is optimized for performance, achieving real-time tracking even on resource-constrained devices like the Raspberry Pi. It supports various use cases, from synthetic 2D tracking to detecting and tracking objects in videos and webcam face tracking, making it a versatile tool for computer vision applications.
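IOU matching in the tracking-by-detection paradigm can be sketched in a few lines. The example below does not use motpy's API and is not its matching algorithm; it is a simplified illustration that assumes boxes in `(xmin, ymin, xmax, ymax)` form and a greedy assignment, with the function names `iou` and `greedy_match` and the `threshold` value chosen for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_boxes, detection_boxes, threshold=0.3):
    """Pair each track with its best still-unmatched detection by IOU.

    Returns a list of (track_index, detection_index) pairs. Pairs whose
    IOU falls below the threshold are left unmatched.
    """
    pairs = []
    used = set()
    for t_idx, t_box in enumerate(track_boxes):
        best_score, best_d = threshold, None
        for d_idx, d_box in enumerate(detection_boxes):
            if d_idx in used:
                continue
            score = iou(t_box, d_box)
            if score >= best_score:
                best_score, best_d = score, d_idx
        if best_d is not None:
            used.add(best_d)
            pairs.append((t_idx, best_d))
    return pairs


# Example: two existing tracks and two new detections in swapped order.
tracks = [(0, 0, 2, 2), (10, 10, 12, 12)]
detections = [(10, 10, 12, 12), (0, 0, 2, 2)]
matches = greedy_match(tracks, detections)
```

In a full tracker, the track boxes would typically come from a motion model's prediction (such as the Kalman filters mentioned above) rather than from the previous frame directly.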

safe-control-gym

54%

safe-control-gym offers physics-based CartPole and Quadrotor Gym environments built using PyBullet, featuring symbolic a priori dynamics powered by CasADi. This framework is designed for learning-based control, as well as model-free and model-based reinforcement learning (RL). It includes symbolic safety constraints and implements input, parameter, and dynamics disturbances to rigorously test the robustness and generalizability of various control approaches. The tool provides a unified benchmark suite for safe learning-based control and RL in robotics, supporting a range of implemented controllers like PID, LQR, iLQR, MPC, SAC, and PPO, alongside safety filters such as MPSC and CBF. It also offers performance comparisons against other popular Gym environments.
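As a rough illustration of the simplest controller in that list, the sketch below implements a textbook PID loop and runs it against a toy one-dimensional plant. It does not use safe-control-gym's environments or controller classes; the `PID` class, the gains, and the plant model `x += u * dt` are all assumptions made for this example.

```python
class PID:
    """Textbook PID controller (illustrative, not safe-control-gym's class)."""

    def __init__(self, kp, ki, kd):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, setpoint, measurement, dt):
        """Return the control input for one time step of length dt."""
        error = setpoint - measurement
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(controller, setpoint=1.0, steps=100, dt=0.1):
    """Drive a toy plant whose state changes at the commanded rate: x += u * dt."""
    x = 0.0
    for _ in range(steps):
        u = controller.step(setpoint, x, dt)
        x += u * dt
    return x


# Example: a proportional-only controller steering the toy plant toward 1.0.
final_state = simulate(PID(kp=2.0, ki=0.0, kd=0.0))
```

A benchmark such as the one described above would additionally inject disturbances and check constraint violations, which this toy loop leaves out.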

SC-GS

54%

SC-GS provides code for Sparse-Controlled Gaussian Splatting, designed for editable dynamic scenes. This open-source tool allows users to effortlessly edit and customize their digital assets through interactive features. It represents motion using sparse control points, which drive 3D Gaussians for high-fidelity rendering. The approach supports both dynamic view synthesis and motion editing, making it versatile for various applications. Recent updates include support for editing static Gaussians from .ply files, improved handling of real-world static objects, and video rendering with interpolation of editing results. It offers two ARAP deformation strategies for motion editing: iterative deformation and deformation from Laplacian initialization, giving users flexibility in achieving desired effects.