Coding & Development
Browsing page 180 of AI tools for Open Source & Models in Coding & Development. Sorted by confidence score — our independent quality rating.
openarm
OpenArm is a fully open-source 7DOF humanoid arm specifically engineered for physical AI research and deployment, particularly in contact-rich environments. Its design emphasizes high backdrivability and compliance, making it suitable for safe human-robot interaction while still providing practical payload capabilities for real-world applications. The arm features human-scale proportions and is available as a complete bimanual system for $6,500 USD, offering a flexible platform for teleoperation, imitation learning, simulation, and real-world data collection. OpenArm is under continuous development, actively seeking contributors, research partners, and company collaborators to advance practical humanoid systems.
robomimic
robomimic is a comprehensive, modular framework designed for robot learning from demonstration. It offers a wide array of demonstration datasets specifically collected for robot manipulation domains, alongside robust offline learning algorithms to effectively learn from these datasets. The primary goal of robomimic is to enhance the accessibility and reproducibility of robot learning research, enabling researchers and practitioners to benchmark tasks and algorithms consistently. This framework facilitates the development of the next generation of robot learning algorithms, supporting features like Diffusion Policy, multi-dataset training, language-conditioned policies, and integration with robosuite and DeepMind MuJoCo bindings. It also supports various observation modalities, pre-trained image representations, and logging with wandb.
SimpleVLA-RL
SimpleVLA-RL is an open-source reinforcement learning (RL) framework designed to efficiently scale the training of Vision-Language-Action (VLA) models. It provides an end-to-end RL pipeline built on veRL, incorporating VLA-specific optimizations such as multi-environment parallel rendering for accelerated trajectory sampling. The framework leverages state-of-the-art infrastructure for efficient distributed training, hybrid communication patterns, and optimized memory management. SimpleVLA-RL supports various VLA models like OpenVLA and OpenVLA-OFT, and benchmarks including LIBERO and RoboTwin 1.0/2.0. It emphasizes minimal reward engineering with binary outcome rewards and includes exploration strategies like dynamic sampling and adaptive clipping. The modular architecture allows for easy integration of new VLA models, benchmarks, and RL algorithms, making it a powerful tool for researchers and developers in the field.
SensorsCalibration
SensorsCalibration, also known as OpenCalib, is a comprehensive open-source toolbox designed for multi-sensor calibration in autonomous driving applications. Accurate sensor calibration is a foundational requirement for any autonomous system, enabling precise sensor fusion and subsequent processing steps like obstacle detection, localization, mapping, and control. This toolbox addresses the critical need for reliable calibration of various sensors, including IMU, LiDAR, Camera, and Radar. It offers both road scene-based calibration tools for parameters like camera intrinsics, lidar2imu, and surround-camera, as well as factory calibration tools supporting different board types such as chessboard, circle board, and Apriltag board. Additionally, it includes SensorX2car for online calibration of sensor-to-car coordinate systems.
SelfExSR
SelfExSR is a research code implementation for single image super-resolution, based on the paper "Single Image Super-Resolution from Transformed Self-Exemplars" (CVPR 2015). This algorithm stands out by achieving state-of-the-art performance in image super-resolution without requiring any external training dataset, complex feature extraction, or complicated learning algorithms. It operates by learning from transformed self-exemplars within the image itself. The repository provides the MATLAB source code, testing images for various datasets (Set5, Set14, Urban 100, BSD 100, Sun-Hays 80), and precomputed results for comparison with other state-of-the-art methods. While designed as educational code and not optimized for speed, users can adjust iteration numbers for a trade-off between speed and visual quality.
SuGaR
SuGaR (Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering) is a PyTorch implementation designed to extract precise and extremely fast meshes from 3D Gaussian Splatting reconstructions. It introduces a regularization term that aligns 3D Gaussians with the scene's surface, allowing for efficient point sampling and mesh extraction using Poisson reconstruction. This method preserves details and is significantly faster than traditional Neural SDFs. SuGaR also offers an optional refinement strategy that binds Gaussians to the mesh surface, enabling joint optimization for easy editing, sculpting, rigging, and animation in traditional software like Blender, Unity, or Unreal Engine. This allows users to retrieve an editable mesh for realistic rendering within minutes, offering superior rendering quality compared to state-of-the-art methods.
spring-boot-rest-example
spring-boot-rest-example is a sample Java/Maven/Spring Boot application designed to serve as a starter for building microservices. It implements REST APIs using Spring Boot, an in-memory H2 database, and an embedded Tomcat server. The project demonstrates full integration with the Spring Framework, including inversion of control and dependency injection. It comes with built-in health checks, metrics, and other operational endpoints via the Actuator module. The application also showcases Swagger2 for API documentation, Spring Data JPA/Hibernate for data persistence, and MockMVC for testing. It's easily configurable to work with other relational databases like MySQL or PostgreSQL.
super-resolution
This open-source project provides a Tensorflow 2.x based implementation of state-of-the-art models for single image super-resolution, including Enhanced Deep Residual Networks (EDSR), Wide Activation for Efficient and Accurate Image Super-Resolution (WDSR), and Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network (SRGAN). It offers a high-level training API, enabling users to train models as described in the respective papers and fine-tune EDSR and WDSR models within an SRGAN context. The tool includes a DIV2K data provider for automatic dataset downloads and offers pre-trained weights for quick setup. It's ideal for developers and researchers working on image processing and computer vision tasks.
sphereface
SphereFace offers a comprehensive open-source implementation of the SphereFace algorithm, a deep hypersphere embedding method for face recognition. This tool provides a full pipeline covering face detection, alignment, and recognition, making it valuable for researchers and developers in computer vision. It includes detailed instructions for installation and usage, demonstrating how to train models on datasets like CASIA-WebFace and evaluate performance on LFW. The repository also features various network architectures, including SphereFace-20, and highlights its state-of-the-art verification performance in challenges like MegaFace. Additionally, it provides insights into the underlying mathematical concepts and practical considerations for training, such as gradient normalization and convergence difficulties, along with links to third-party re-implementations and related angular margin learning resources.
SSL4MIS
SSL4MIS (Semi Supervised Learning for Medical Image Segmentation) is a comprehensive resource for researchers and developers focusing on medical image analysis. It offers a curated collection of literature reviews and practical code implementations for semi-supervised learning techniques. The repository includes re-implementations of various semi-supervised methods such as Mean Teacher, Entropy Minimization, and FixMatch, adapted for medical image segmentation. Additionally, it supports a range of 2D and 3D backbone networks like UNet, nnUNet, and Swin-UNet. This project aims to establish a benchmark for semi-supervised medical image segmentation, fostering easier evaluation and fair comparison within the medical image computing community. It also covers active learning and source-free domain adaptation for medical image analysis.
synthetic-computer-vision
synthetic-computer-vision is a GitHub repository dedicated to tracking and organizing resources related to the use of synthetic images in computer vision research. It serves as a valuable hub for researchers, offering a curated list of synthetic datasets such as SunCG, Minos, and Synthia, alongside various tools like AirSim, CARLA, and UnrealCV. The repository also includes a collection of relevant academic publications, categorized by year, with links to papers, code, and project pages. Users are encouraged to contribute by adding missing works or updating existing information through pull requests, making it a collaborative and up-to-date resource for the computer vision community.
tensorflow-yolo
tensorflow-yolo offers a TensorFlow-based implementation of the YOLO (You Only Look Once) real-time object detection system. This open-source project allows developers and researchers to train and test their own object detection models using TensorFlow 1.0. The repository includes instructions for downloading pre-trained models, setting up training data using Pascal-VOC2007, and converting custom data to the required text_record format. It provides the necessary tools and scripts for preprocessing data, configuring training parameters, and running demonstrations, making it a valuable resource for those working with real-time object detection.
tmrl
tmrl is a comprehensive open-source Python framework for training Deep Reinforcement Learning (RL) AIs in real-time applications, such as robotics, video games, and high-frequency control. It features a distributed architecture, enabling secure remote training and fine-grained customizability. The framework comes with a readily implemented example pipeline for the TrackMania 2020 racing video game, allowing users to train policies with state-of-the-art algorithms like Soft Actor-Critic (SAC) and Randomized Ensembled Double Q-Learning (REDQ). tmrl also provides a Gymnasium environment for TrackMania, making it easy to integrate into existing training frameworks. It supports both vision-based (CNN for raw images) and simpler rangefinder (MLP for LIDAR) observations, and offers analog control via a virtual gamepad.
WebWorldWind
WebWorldWind is an Open Source JavaScript SDK developed by NASA, with contributions from the European Space Agency, designed for creating geo-browser web applications. It allows developers to embed a 3D globe directly into HTML5 web pages, providing a geographic context with terrain and various shapes for displaying and interacting with geo-located information in both 3D and 2D. The SDK automatically retrieves high-resolution terrain and imagery from remote servers as needed, while also supporting custom terrain, imagery, 3D shapes, and position markings. Key features include improvements to COLLADA 3D model support, the ability to obtain click locations in 3D models, and enhanced Well-Known Text format support. It is licensed under the Apache License, Version 2.0.
yolov13
YOLOv13 is an open-source implementation for real-time object detection, leveraging hypergraph-enhanced adaptive visual perception. It introduces HyperACE for exploring high-order correlations between pixels in multi-scale feature maps and FullPAD for fine-grained information flow and representational synergy across the entire detection pipeline. The tool also incorporates model lightweighting via DS-based Blocks, replacing large-kernel convolutions with depthwise separable convolutions for faster inference without sacrificing accuracy. YOLOv13 is available in Nano, Small, Large, and X-Large variants, offering cutting-edge performance and efficiency for various object detection tasks. It supports deployment on platforms like Huawei Ascend and Rockchip, and includes a FastAPI REST API.
VTIL-Core
VTIL-Core, standing for Virtual-machine Translation Intermediate Language, is a set of tools built around an optimizing compiler. Its primary purpose is binary de-obfuscation and de-virtualization, making it a valuable asset for reverse engineering and security research. Unlike other optimizing compilers such as LLVM, VTIL features an extremely versatile Intermediate Language (IL) that simplifies lifting from various architectures, including stack machines. It maintains the native ISA's concepts like the stack, physical registers, and non-SSA architecture of a general-purpose CPU, allowing native instructions to be embedded within the IL stream and physical registers to be addressed freely. VTIL also facilitates code emission back into native formats at any virtual address without file format constraints. This repository contains the core components of the VTIL Project, with further documentation and an organization website planned for its initial release.
YOLO-Multi-Backbones-Attention
YOLO-Multi-Backbones-Attention is an open-source project designed to improve the efficiency and performance of YOLOv3 for object detection tasks. It integrates several lightweight backbones, including ShuffleNetV2, GhostNet, and VoVNet, to reduce model size and computational cost. The tool also incorporates various attention mechanisms like SE Block, CBAM Block, and ECA Block to enhance detection accuracy. Furthermore, it provides functionalities for model compression through pruning, quantization (including Dorefa for arbitrary bit quantization), and distillation, making it suitable for deployment on resource-constrained devices. The repository includes training and detection scripts, along with pre-trained models and support for multiple datasets such such as Visdrone and Bdd100K.
Zero Shot Text Classification
Zero Shot Text Classification is an AI tool hosted on Hugging Face Spaces by datasciencedojo, designed for classifying text into predefined categories without requiring specific training data for those categories. Users can easily input a piece of text and provide a list of candidate labels or categories. The tool then processes the input and returns a score for each category, indicating how well the text fits into that particular classification. This makes it a highly flexible and efficient solution for quick text categorization tasks, eliminating the need for extensive dataset preparation and model training.
Weavel
Weavel, Inc. is developing Typa, an innovative storytelling platform tailored for the needs of contemporary companies. While specific features are not detailed, the platform is positioned to help businesses create and disseminate their stories, suggesting capabilities related to content creation, narrative structuring, and potentially audience engagement. The company, a YC S24 alumnus, is focused on empowering modern enterprises to communicate their brand and vision through compelling narratives. This tool is likely to cater to businesses looking to enhance their marketing, public relations, or internal communications through advanced storytelling techniques.
android-fat-aar
android-fat-aar is a Gradle script designed for Android developers to create "fat" AAR files. This tool enables the merging and embedding of project dependencies directly into the generated AAR, streamlining the distribution of complex libraries. It addresses the challenge of maintaining a modular project structure while publishing a single, comprehensive library. A key benefit is the ability to apply ProGuard to the combined code, which is more effective than processing individual subprojects. While it supports single build types (release out of the box) and offers methods to prevent transitive dependency issues, users should note that manifest placeholders and AIDL file merging are not supported. The project is open-source and available on GitHub.
baselines
OpenAI Baselines offers a collection of high-quality, open-source implementations of various reinforcement learning algorithms. This project is designed to facilitate research by providing reliable baselines for comparison and further development. It supports Python 3.5+ and integrates with TensorFlow versions 1.4 to 1.14 (with a separate branch for TensorFlow 2.0). Researchers can use Baselines to train models for tasks like controlling MuJoCo humanoids or playing Atari games, with options for saving, loading, and visualizing trained models. The project emphasizes reproducibility and provides tools for logging and visualizing learning curves.
chatgptProxyAPI
chatgptProxyAPI is an open-source solution designed to facilitate access to OpenAI's API, particularly in environments with network restrictions. By leveraging Cloudflare Workers, it allows users to set up a free proxy for api.openai.com, ensuring seamless connectivity and supporting streaming output. The tool offers detailed instructions for deployment, including options for Cloudflare Pages for API proxying and OpenAI API balance queries, as well as Docker deployment for those with offshore VPS. It provides code examples for integrating the proxy with JavaScript, Python, and Node.js, making it accessible for developers to implement in their applications. This project is ideal for developers who need a reliable and free method to interact with OpenAI services without encountering network access issues.
Evilginx3-Phishlets
Evilginx3-Phishlets is an open-source repository offering a comprehensive collection of dynamic phishing templates specifically designed for use with Evilginx3. This resource is invaluable for penetration testers and red teams, providing them with finely crafted and updated phishlets suitable for real-world applications. The repository includes templates for various platforms like Amazon Web Services, Microsoft, Okta, Outlook, Spotify, and Twitter. It also features a detailed README that explains phishlet parameters such as `name`, `author`, `min_ver`, `proxy_hosts`, `sub_filters`, `auth_tokens`, `creds`, `auth_urls`, and `login`, enabling users to understand and customize their phishing campaigns effectively. The project emphasizes ethical use for cybersecurity professionals in controlled environments.
LFM2.5-VL-1.6B WebGPU
LFM2.5-VL-1.6B WebGPU offers in-browser vision-language inference using the LFM2.5-VL-1.6B model. This tool captures video directly from your webcam, processes each frame with an on-device vision-language model, and then provides a concise, one-sentence description of the visual content. Users can select their desired resolution for video capture, allowing for flexibility in performance and detail. Hosted on Hugging Face Spaces by LiquidAI, it provides a practical demonstration of real-time, local AI processing for visual understanding. This makes it an accessible resource for developers and researchers interested in exploring vision-language models without extensive setup.