Coding & Development
compromise
compromise is an open-source JavaScript library designed to simplify natural language processing tasks. It provides core functionalities for analyzing text, breaking it down into tokens, and identifying parts of speech. The library's primary goal is to make NLP more accessible and straightforward for developers to integrate into their applications, focusing on modest NLP requirements rather than complex, large-scale models.
learnable-triangulation-pytorch
learnable-triangulation-pytorch is an official PyTorch implementation of the paper "Learnable Triangulation of Human Pose" (ICCV 2019, oral). This open-source project focuses on 3D human pose estimation from multiple cameras and offers two learnable triangulation methods, Algebraic and Volumetric. The paper reports that both outperform the previous state of the art, with the Volumetric model achieving a 2.4 times reduction in error. The repository provides code for training and evaluation, supports both single- and multi-GPU setups, and includes pretrained models and configurations for the Human3.6M dataset. It is aimed at researchers and engineers working on human pose estimation.
darknet_ros
darknet_ros is a ROS (Robot Operating System) package designed for real-time object detection in camera images, leveraging the You Only Look Once (YOLO) system. It supports YOLO V3 on both GPU and CPU, offering significant speed advantages with CUDA-enabled GPUs. The package comes with pre-trained models capable of detecting objects from VOC and COCO datasets, and also allows users to train and deploy networks with their own custom detection objects. It provides ROS-related parameters for configuring publishers, subscribers, and actions, making it highly adaptable for robotics applications. The tool is open-source and actively maintained by leggedrobotics, providing a robust solution for integrating advanced object detection into robotic systems.
mega.pytorch
mega.pytorch offers an official PyTorch implementation of the "Memory Enhanced Global-Local Aggregation for Video Object Detection" (MEGA) approach, which was accepted at CVPR 2020. The repository is built on maskrcnn_benchmark and includes training scripts to replicate results on ImageNet VID. Beyond MEGA, it also implements other video object detection algorithms such as FGFA and RDN, and welcomes contributions of new methods. The project aims to support further research in video object detection, providing pretrained models and detailed instructions for installation, data preparation, inference, and training.
DeepDanbooru
DeepDanbooru is an AI-based multi-label image classification system specifically designed for anime-style girl images. Built with TensorFlow, it provides a robust solution for estimating tags on visual content. The system is open-source and available on GitHub, allowing developers and researchers to access and modify its codebase. Users can prepare their own datasets or utilize tools like DanbooruDownloader to acquire data. It supports creating training projects, downloading tags from Danbooru, filtering datasets, and training custom models. The tool is ideal for those looking to categorize and analyze large collections of anime imagery with AI-driven tagging.
Deep_Object_Pose
Deep Object Pose Estimation (DOPE) is NVIDIA's official repository for advanced object pose estimation. This tool is designed to detect and estimate the 6-DoF pose of known objects using data from an RGB camera. The repository provides comprehensive code for various stages of the pipeline, including training models, performing inference, conducting numerical evaluation of results, and generating synthetic data. It supports integration with ROS1 Noetic for USB camera inference and offers hardware-accelerated ROS2 inference through the external NVIDIA Isaac ROS DOPE project. The tool has been tested on Ubuntu with Python 3.8+ and various NVIDIA GPUs, making it suitable for developers and researchers working on robotics and computer vision projects requiring precise object pose estimation.
DeepEMD
DeepEMD offers a PyTorch implementation for few-shot image classification, based on the research paper "DeepEMD: Few-Shot Image Classification with Differentiable Earth Mover's Distance and Structured Classifiers." This tool is designed to address the challenge of learning from limited labeled data by employing the Earth Mover's Distance (EMD) as a metric for structural matching between image regions. It includes a cross-reference mechanism to mitigate issues from cluttered backgrounds and intra-class variations, and supports k-shot classification through a structured fully connected layer. DeepEMD has demonstrated significant performance improvements on benchmarks like miniImageNet, tieredImageNet, FC100, and CUB, without requiring extra training or testing data. The repository provides code for model pre-training, meta-training, and evaluation, along with options for different EMD solvers and model configurations.
DeepRL-Tutorials
DeepRL-Tutorials is an open-source repository offering implementations of various Deep Reinforcement Learning (DRL) algorithms, primarily written in PyTorch. The project emphasizes readability and understanding, making it a useful resource for those looking to learn and practice DRL concepts. It includes implementations of algorithms such as DQN, Double DQN, Dueling DQN, Rainbow, A2C, PPO, and more, each accompanied by the relevant research papers. The tutorials are presented as IPython Notebooks, providing a structured way to explore and experiment with these techniques. It requires Python 3.6, NumPy, Gym, PyTorch 0.4.0, Matplotlib, and OpenCV.
mongoose
Mongoose is an open-source network library for C/C++ that provides event-driven, non-blocking APIs for various protocols including TCP, UDP, HTTP, WebSocket, and MQTT. Designed for embedded systems and IoT applications, it facilitates connecting devices and bringing them online. Mongoose is cross-platform, running on Linux/UNIX, macOS, Windows, Android, and various microcontrollers such as ST, NXP, and ESP32. It has a tiny static and run-time footprint, is easy to integrate by simply copying two files, and includes a built-in TCP/IP stack with drivers for bare metal or RTOS systems. Mongoose also supports running on existing TCP/IP stacks like lwIP and Zephyr, and includes a built-in TLS 1.3 ECC stack, with options for external TLS libraries.
Typestamp
Typestamp is an innovative open-source protocol designed to verify the authenticity and human effort behind digital content, particularly written text. It aims to combat the proliferation of AI-generated content and low-effort spam by providing 'proof of effort' through keystroke audits and other verifiable metrics. This tool is invaluable for content creators, online communities, platform moderators, and anyone concerned with maintaining the integrity of human-generated discourse. By offering a transparent method to demonstrate genuine human input, Typestamp helps foster trust and ensures that valuable, original content stands out in an increasingly automated digital landscape. It empowers users to distinguish between authentic human expression and machine-generated text, promoting a healthier online environment.
docs
Bytez is a comprehensive platform designed to simplify the discovery, understanding, and deployment of AI models and research papers. It offers access to over 175,000 serverless AI models via a unified API protocol, eliminating the need for complex infrastructure or orchestration. Additionally, Bytez provides access to over 440,000 interactive AI papers, complemented by an ArXiv Agent that delivers grounded answers citing real sources. The platform includes a Model Hub for searching, demoing, and deploying state-of-the-art models across 33 ML tasks, and official Docker images for local or cloud deployment. Bytez aims to be a one-stop solution for developers and researchers working with AI.
neural-combinatorial-rl-pytorch
neural-combinatorial-rl-pytorch offers a PyTorch implementation of Neural Combinatorial Optimization with Reinforcement Learning, based on the research paper. This open-source tool provides a basic RL pretraining model that utilizes greedy decoding. A notable feature is its use of an exponential moving average critic instead of a traditional critic network, which has been shown to significantly improve results, particularly for the Traveling Salesperson Problem (TSP). The implementation supports a stochastic decoding policy during training and beam search for testing. It currently includes support for a sorting task and the planar symmetric Euclidean TSP, with clear guidelines for extending it to other combinatorial optimization problems by providing a dataset class and a reward function. The repository also details dependencies and provides performance results for both TSP and sorting tasks, demonstrating its generalization capabilities.
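The exponential moving average critic described above can be illustrated with a small sketch. This is not the repository's code: the class name `EmaBaseline`, the `beta` decay value, and the sample rewards are assumptions made for the example. The idea is that a running average of batch rewards stands in for a learned critic network, and each sample's advantage is its reward minus that average.

```python
class EmaBaseline:
    """Running-average baseline used in place of a learned critic network.

    Hypothetical helper for illustration; not part of the repository.
    """

    def __init__(self, beta=0.8):
        self.beta = beta    # decay factor (assumed value, tune per task)
        self.value = None   # undefined until the first batch is seen

    def update(self, batch_rewards):
        """Fold the mean reward of a batch into the running average."""
        batch_mean = sum(batch_rewards) / len(batch_rewards)
        if self.value is None:
            # The first batch initialises the baseline directly.
            self.value = batch_mean
        else:
            self.value = self.beta * self.value + (1.0 - self.beta) * batch_mean
        return self.value

    def advantages(self, batch_rewards):
        """Reward minus baseline: the weight applied to each sample's
        log-probability in a REINFORCE-style policy gradient."""
        return [r - self.value for r in batch_rewards]


baseline = EmaBaseline(beta=0.5)
baseline.update([10.0, 12.0])            # baseline becomes 11.0
baseline.update([8.0, 10.0])             # 0.5 * 11.0 + 0.5 * 9.0 = 10.0
adv = baseline.advantages([8.0, 10.0])   # [-2.0, 0.0]
```

Because the baseline is a single scalar, it adds no trainable parameters; the trade-off is that it cannot adapt to individual problem instances the way a critic network can.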
embedded-scripting-languages
embedded-scripting-languages is a comprehensive, open-source resource offering a curated list of embedded scripting languages. This tool is designed to assist developers in selecting the most appropriate language for their specific application needs. The list includes a wide array of options, from reasonably mature to actively developed languages, and even extends to Datalog implementations. Each entry provides details such as the language's project name/link, implementation language, garbage collection method, and license, along with specific notes. The resource emphasizes languages with strong copyleft licenses as a warning, ensuring developers are aware of potential licensing implications. It's an invaluable reference for anyone looking to integrate scripting capabilities into their projects.
native_db
native_db is a fast, drop-in embedded database written in Rust, designed for multi-platform applications including server, desktop, and mobile. It simplifies data management by allowing effortless synchronization of Rust types and supports multiple indexes (primary, secondary, unique, non-unique, optional). The database boasts transparent serialization/deserialization using `native_model`, enabling compatibility with various serialization libraries like `bincode` or `postcard`. Key features include query type safety, automatic model migration, thread-safe and fully ACID-compliant transactions powered by `redb`, and real-time subscription capabilities with filters for insert, update, and delete operations. It is compatible with all Rust types and supports hot snapshots, making it a versatile solution for developers seeking an efficient embedded database.
Eagle
Eagle 2.5 is a family of frontier vision-language models (VLMs) developed by NVlabs, specifically engineered for long-context multimodal learning. Unlike many existing VLMs that focus on short-context tasks, Eagle 2.5 excels at challenges like long video comprehension and high-resolution image understanding, providing a generalist framework for both. It supports up to 512 video frames and is trained jointly on image and video data, including the novel Eagle-Video-110K dataset. Key innovations include Information-First Sampling for optimal image and text retention, Progressive Mixed Post-Training for enhanced context length processing, and Diversity-Driven Data Recipe. The model also features significant efficiency and framework optimizations, such as GPU memory optimization and inference acceleration, making it suitable for advanced research and development in multimodal AI.
I built a game where domain experts try to break frontier AI
R U Smarter? is a unique platform where human domain experts can challenge and expose the limitations of frontier AI models. Users submit expert-level questions that require nuanced judgment, not just textbook knowledge, to answer. Three frontier AI models then attempt to answer simultaneously. If the AI models fail to provide a correct response, experts can flag the failure and provide a detailed critique, which contributes to a permanent failure record. Verified failures, confirmed by five or more credentialed experts, result in a bonus payout for the submitting expert. The platform currently supports challenges in Medicine, Law, Finance, Trades, and Coding, providing a real-world testing ground for AI vulnerabilities.
Face_Pytorch
Face_Pytorch offers an open-source implementation of various face recognition algorithms within the PyTorch framework. This project includes well-known algorithms such as ArcFace, CosFace, and SphereFace, providing a comprehensive toolkit for researchers and developers. It supports data preparation for CNN training using datasets like CASIA-WebFace and Cleaned MS-Celeb-1M, aligned by MTCNN. The project also facilitates performance testing on benchmarks like LFW, AgeDB-30, CFP-FP, and MegaFace, with detailed verification results provided for different model types and protocols. It's designed for those looking to implement and evaluate face recognition models, offering flexibility for custom dataset paths and parameters.
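The three losses named above differ mainly in how they apply a margin to the angle between a face embedding and its class weight vector. The sketch below shows the commonly cited basic forms only; it is not taken from the repository, the function name `target_logit` and the margin values are assumptions, and the full methods add details omitted here (for example, SphereFace uses a piecewise extension of cos(m·θ) beyond θ = π/m).

```python
import math


def target_logit(cos_theta, method, margin, scale=1.0):
    """Return the margin-adjusted logit for the ground-truth class.

    cos_theta: cosine similarity between the embedding and the class weight.
    method:    "arcface", "cosface", or "sphereface".
    margin:    additive angular, additive cosine, or multiplicative angular margin.
    Hypothetical helper for illustration; simplified forms only.
    """
    # Clamp before acos to guard against floating-point drift outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    if method == "arcface":
        return scale * math.cos(theta + margin)   # additive angular margin
    if method == "cosface":
        return scale * (cos_theta - margin)       # additive cosine margin
    if method == "sphereface":
        return scale * math.cos(margin * theta)   # multiplicative angular margin
    raise ValueError(f"unknown method: {method}")


# With cos_theta = 0.5 (an angle of 60 degrees), every margin lowers the
# target logit, which forces the network to separate classes more strongly.
arc = target_logit(0.5, "arcface", 0.5)
cosf = target_logit(0.5, "cosface", 0.35)
sph = target_logit(0.5, "sphereface", 2)
```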
facenet-pytorch
facenet-pytorch provides pretrained PyTorch models for both face detection using MTCNN and facial recognition with InceptionResnet (V1). These models are pretrained on datasets such as VGGFace2 and CASIA-WebFace. The repository includes an efficient MTCNN implementation, noted for its speed, and allows for easy integration into Python projects. Developers can use these models for tasks such as complete detection and recognition pipelines, face tracking in video streams, and finetuning with new data. The repository also offers performance comparisons with other face detection packages, highlighting its efficiency, especially with the FastMTCNN algorithm for video streams.
ExtremeNet
ExtremeNet is an open-source object detection system that employs a bottom-up approach to identify objects within images. It achieves this by detecting four extreme points (top-most, left-most, bottom-most, right-most) and one center point of objects using a standard keypoint estimation network. These five keypoints are then grouped into a bounding box if they are geometrically aligned. This method transforms object detection into a purely appearance-based keypoint estimation problem, bypassing region classification or implicit feature learning. The project is built upon the CornerNet code and integrates code from Deep Extreme Cut (DEXTR) for instance segmentation, allowing it to generate coarse octagonal masks and further refine them for improved Mask AP. It provides code for training, evaluation, and demo purposes, supporting benchmark evaluation on datasets like MS COCO.
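The grouping step described above can be sketched in a few lines. This is an illustrative reading of the description rather than the project's code: the function name, the `center_score` callable (standing in for a lookup into the predicted center heatmap), and the threshold value are all assumptions. Four candidate extreme points are accepted as one object only if their geometric center also scores highly as a center point.

```python
def group_extreme_points(top, left, bottom, right, center_score, threshold=0.1):
    """Return a box (x_min, y_min, x_max, y_max) if the points align, else None.

    Each point is an (x, y) tuple in image coordinates (y grows downward).
    center_score is a hypothetical callable returning the center-heatmap
    score at a given (x, y) location.
    """
    # The four points must be ordered consistently to enclose a box.
    if not (left[0] <= right[0] and top[1] <= bottom[1]):
        return None
    # Geometric center implied by the four extreme points.
    cx = (left[0] + right[0]) / 2.0
    cy = (top[1] + bottom[1]) / 2.0
    # Reject the combination if no center was detected near that location.
    if center_score(cx, cy) < threshold:
        return None
    return (left[0], top[1], right[0], bottom[1])


def toy_center_score(x, y):
    """Toy heatmap with a single detected center near (50, 40)."""
    return 0.9 if abs(x - 50) <= 2 and abs(y - 40) <= 2 else 0.0


box = group_extreme_points(
    top=(55, 10), left=(20, 35), bottom=(45, 70), right=(80, 42),
    center_score=toy_center_score,
)  # (20, 10, 80, 70)

rejected = group_extreme_points(
    top=(55, 10), left=(20, 35), bottom=(45, 70), right=(120, 42),
    center_score=toy_center_score,
)  # None: the implied center (70, 40) has no center response
```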
Gen6D
Gen6D is an open-source project focused on generalizable model-free 6-DoF object pose estimation from RGB images. Published at ECCV 2022, it allows users to estimate the 6-DoF poses of previously unseen objects. It comes with pretrained models and evaluation code, so it can be used without training. The project supports pose estimation on custom objects and provides training code for users who wish to fine-tune or train their own models. The pipeline consists of detection, viewpoint selection, and pose refinement, with intermediate and final qualitative results saved for analysis. The repository also details the process for creating GenMOP objects for evaluation and acknowledges contributions from several other open-source projects and datasets.
Qwen-VL
Qwen-VL, developed by Alibaba Cloud, is an open-source large vision language model (LVLM) that accepts image, text, and bounding box inputs, and outputs text and bounding boxes. Its authors report that it significantly surpasses existing open-source LVLMs on multiple English evaluation benchmarks. Key features include support for English, Chinese, and multilingual conversation, end-to-end recognition of bilingual text in images, and multi-image interleaved conversations. It is also described as the first generalist model to support grounding in Chinese, allowing bounding box detection from open-domain language expressions. The model works at a 448x448 input resolution, which helps with fine-grained recognition, detailed text recognition, and document QA.
prettygraph
prettygraph is a Python-based web application developed by @yoheinakajima, designed to demonstrate a new UI pattern for text-to-knowledge graph generation. While it's an experimental project and not intended as a robust framework, it provides a simple yet interactive way to visualize knowledge graphs. The application uses Flask for the backend, LiteLLM for generating predictions that transform text inputs into JSON formatted graph data, and Cytoscape.js for visualization. A key feature is its dynamic UI, where the graph regenerates and updates in real-time with each period insertion in the text input, offering color-coded nodes and edges for better visual distinction. It requires an OpenAI API key for operation.
PyTorch CV Backbones
PyTorch CV Backbones is a lookup tool for AI researchers and developers working with image models. It retrieves information about PyTorch computer vision backbones, including their type, input size, and download URLs. Users can select between ImageNet V1 or V2 versions to fetch the relevant model weights. The application presents this data in a tabular format and can generate JSON output for use in other workflows or projects. It is an open-source tool hosted on Hugging Face Spaces.
renode
Renode, created by Antmicro, is an open-source simulation and virtual development framework designed for multi-node embedded networks, including both wired and wireless systems. It supports the development, testing, and debugging of unmodified software for IoT devices without requiring physical hardware. The tool simulates not only CPUs (ARMv7, ARMv8 Cortex-A/R/M, x86, RISC-V, SPARC, POWER, Xtensa, MSP430X) but also entire SoCs and the connections between them. Renode integrates with Robot Framework for test case creation and execution. It runs on Linux, macOS, and Windows, with portable packages, installers, and Docker images available. Commercial support is provided by Antmicro.