Coding & Development
Picogen
Picogen, operating under the name Presidenslot, offers a platform for users to access demo slot games from providers like Pragmatic Play and PG Soft. It provides free access to these games with a credit of 100,000 IDR that can be refreshed without limits. This allows players to practice and test various slot patterns and strategies without using real money. The platform aims to replicate the real gaming experience, making it suitable for both beginners to understand game mechanics and experienced players to refine their tactics before playing with actual funds.
GLM-ASR
GLM-ASR-Nano is an open-source speech recognition model with 1.5 billion parameters, designed to handle real-world audio conditions. It surpasses OpenAI Whisper V3 in multiple benchmarks while maintaining a compact size. Key capabilities include strong support for Cantonese and other dialects, which helps close gaps in dialectal speech recognition. The model is also specifically trained for "Whisper/Quiet Speech" scenarios, accurately transcribing extremely low-volume audio that traditional models often miss. GLM-ASR-Nano achieves the lowest average error rate (4.10) among comparable open-source models, with significant advantages on Chinese benchmarks such as Wenet Meeting and Aishell-1. It supports 17 languages, with specific optimizations for certain regions.
hamilton
Apache Hamilton is a lightweight Python library designed for creating directed acyclic graphs (DAGs) of data transformations. It enables data scientists and engineers to define testable, modular, and self-documenting dataflows that encode lineage, tracing, and metadata. The library is highly portable, running anywhere Python does, including scripts, notebooks, Airflow pipelines, and FastAPI servers. Hamilton emphasizes separation of concerns, allowing data scientists to focus on problem-solving while engineers manage production pipelines. It supports data and schema validation, built-in coding styles, and a plugin-based architecture for custom integrations. The Apache Hamilton UI provides automatic visualization, cataloging, and monitoring of execution, including data cataloging, dataset profiling, and execution tracking.
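The dataflow-as-functions idea can be illustrated with a small, standard-library-only sketch. It assumes the convention that a function's name is the node it produces and its parameter names are the nodes it depends on. The `execute` helper and the two example nodes are hypothetical and are not part of Hamilton's API; the library itself adds validation, lineage, and visualization on top of this basic idea.

```python
import inspect


def spend_per_signup(spend, signups):
    """Node: marketing spend divided by signups, per period."""
    return [s / n for s, n in zip(spend, signups)]


def avg_spend_per_signup(spend_per_signup):
    """Node: depends on the spend_per_signup node via its parameter name."""
    return sum(spend_per_signup) / len(spend_per_signup)


def execute(functions, inputs, target, cache=None):
    """Hypothetical helper: resolve `target` by walking parameter names recursively."""
    cache = dict(inputs) if cache is None else cache
    if target in cache:
        return cache[target]
    if target not in functions:
        raise KeyError(f"no input or function provides node {target!r}")
    fn = functions[target]
    # Each parameter name is itself a node: compute it first, then call fn.
    kwargs = {
        name: execute(functions, inputs, name, cache)
        for name in inspect.signature(fn).parameters
    }
    cache[target] = fn(**kwargs)
    return cache[target]


NODES = {fn.__name__: fn for fn in (spend_per_signup, avg_spend_per_signup)}
result = execute(
    NODES,
    {"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 3]},
    "avg_spend_per_signup",
)
print(result)
```

Because every node is an ordinary function, each one can be unit-tested in isolation, which is the property the description refers to as testable and modular dataflows.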
head-pose-estimation
Head-pose-estimation is an open-source project designed for real-time human head pose estimation. It leverages ONNX Runtime and OpenCV to perform its core functions. The process involves three main steps: first, a face detector identifies a human face within an image or video frame; second, a pre-trained deep learning model detects 68 facial landmarks; and finally, a PnP algorithm calculates the head pose based on these landmarks. This tool is ideal for developers and researchers working on applications requiring precise head movement and orientation analysis. It provides clear instructions for getting started, including prerequisites, installation steps, and how to run the application with video files or webcams. The project also offers guidance on retraining the model for custom needs.
graph-learn
Graph-Learn, formerly AliGraph, is a robust and distributed framework designed for the development and application of large-scale graph neural networks (GNNs). Developed by Alibaba, it has been successfully deployed in various industrial scenarios such as search recommendation, network security, and knowledge graphs. The framework offers a comprehensive solution encompassing both GNN training and online inference services. Its training component supports sampling on batch graphs and incremental GNN model training, compatible with TensorFlow and PyTorch. The online inference service, Dynamic-Graph-Service, ensures real-time sampling on dynamic graphs with streaming updates, boasting P99 latency within 20ms for large-scale graphs. It provides Python, C++, and Java interfaces for flexible integration.
HLearn
HLearn is a high-performance machine learning library developed in Haskell, designed to offer both speed comparable to low-level languages like C/C++ and flexibility akin to high-level languages such as Python. It distinguishes itself by leveraging functional programming principles and the SubHask library for fast numerical computations. The library's design is deeply rooted in abstract algebra, utilizing concepts like homomorphisms, monoids, and Abelian groups to enable features such as parallel batch training, online training, fast cross-validation, and weighted data points. HLearn also incorporates a unique History monad for debugging optimization procedures without runtime overhead. While it's a research project aiming for an optimal interface, its current focus is on foundational algebraic structures rather than a broad range of popular machine learning techniques.
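The algebraic idea can be sketched in a few lines, shown here in Python rather than Haskell. The sketch assumes a model that is fully described by its sufficient statistics (a one-dimensional Gaussian). `GaussianStats`, `merge`, `remove`, and `train` are hypothetical names chosen for illustration, not HLearn's interface.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class GaussianStats:
    """Sufficient statistics of a 1-D Gaussian: count, sum, sum of squares."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def merge(self, other):
        # Monoid operation: combining two trained models is component-wise addition.
        return GaussianStats(self.n + other.n,
                             self.total + other.total,
                             self.total_sq + other.total_sq)

    def remove(self, other):
        # Group inverse: subtract a previously merged model (used for held-out folds).
        return GaussianStats(self.n - other.n,
                             self.total - other.total,
                             self.total_sq - other.total_sq)

    @property
    def mean(self):
        return self.total / self.n

    @property
    def variance(self):
        return self.total_sq / self.n - self.mean ** 2


def train(points):
    """Training is a homomorphism: train(a + b) == train(a).merge(train(b))."""
    singles = (GaussianStats(1, x, x * x) for x in points)
    return reduce(GaussianStats.merge, singles, GaussianStats())


data = [1.0, 2.0, 3.0, 4.0]
whole = train(data)
merged = train(data[:2]).merge(train(data[2:]))   # two batches trained separately
without_first = whole.remove(train(data[:1]))     # drop a fold without retraining
```

Since `merge` is associative, batches can be trained in parallel and combined in any grouping, and the same operation handles online updates one point at a time. The `remove` step shows why a group structure makes cross-validation cheap: the model for the remaining folds is obtained by subtraction rather than retraining.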
ivy
Ivy is an open-source tool designed to facilitate the conversion of machine learning code between various popular frameworks. It enables developers to seamlessly transpile ML models, tools, and libraries, supporting conversions to and from PyTorch, TensorFlow, JAX, and NumPy. Key functionalities include `ivy.transpile()` for converting framework-specific code to a target framework, and `ivy.trace_graph()` for tracing efficient computational graphs. Ivy supports both eager and lazy transpilation, adapting to whether a class/function or a module is provided. This flexibility makes it a valuable resource for developers working in multi-framework environments, simplifying code portability and integration.
kubernetes-for-ml-engineers
kubernetes-for-ml-engineers is a step-by-step guide that helps Machine Learning engineers understand and apply basic Kubernetes concepts. The repository explains how to install Docker, Kind, and kubectl, and then walks users through creating a local Kubernetes cluster. It covers writing the business logic for a simple FastAPI application, containerizing it with Docker, and then building, running, and pushing the Docker image to the local Kubernetes cluster. Finally, the guide explains how to deploy the application as a Kubernetes service and test it, which makes it a practical starting point for deploying ML applications in a containerized environment.
LightNet
LightNet is an open-source project offering a collection of light-weight neural networks specifically designed for semantic image segmentation. It focuses on achieving high segmentation accuracy while maintaining computational efficiency, making it suitable for embedded devices often found in autonomous driving systems. The repository includes implementations of several architectures such as MobileNetV2Plus, RF-MobileNetV2Plus, MobileNetV2Vortex, MobileNetV2Share, Mixed-scale DenseNet, SE-WResNetV2, and ShuffleNetPlus. These models incorporate techniques like Spatial-Channel Squeeze & Excitation (SCSE), Receptive Field Block (RFB), and Vortex Pooling. LightNet provides code in PyTorch and supports training and evaluation on Cityscapes and Mapillary Vistas Datasets, along with data augmentation using GANs.
Megatron LM
Megatron-LM is an NVIDIA-developed, GPU-optimized library for training large transformer models at scale. The repository has two main components: Megatron-LM, which offers pre-configured training scripts for research teams and quick experimentation, and Megatron Core, a composable library of GPU-optimized building blocks for custom training frameworks. Megatron Core includes transformer building blocks, advanced parallelism strategies (TP, PP, DP, EP, CP), mixed precision support (FP16, BF16, FP8, FP4), and various model architectures, and is aimed at framework developers and ML engineers building custom training pipelines. A companion component, Megatron Bridge, provides bidirectional Hugging Face ↔ Megatron checkpoint conversion along with production-ready recipes. The library supports training models from 2B to 462B parameters across thousands of GPUs while achieving high Model FLOP Utilization (MFU).
MMSA
MMSA is a comprehensive, open-source framework designed for Multimodal Sentiment Analysis (MSA). It allows users to train, test, and compare various MSA models within a single, unified environment. The framework supports 15 different MSA models, including recent advancements, and integrates with three key MSA datasets: MOSI, MOSEI, and CH-SIMS. MMSA is highly accessible, providing both Python APIs for programmatic integration and command-line tools for quick experimentation and deployment. Users can also experiment with fully customized multimodal features using the MMSA-FET toolkit. The project is packaged for easy installation via PyPI, making it straightforward to get started with sentiment analysis tasks.
NVTabular
NVTabular is a powerful feature engineering and preprocessing library specifically designed for tabular data, enabling the manipulation of terabyte-scale datasets. It accelerates computation on the GPU using the RAPIDS Dask-cuDF library, making it ideal for training deep learning-based recommender systems. As a core component of NVIDIA Merlin, it seamlessly integrates with other Merlin tools like Merlin Models, HugeCTR, and Merlin Systems to provide end-to-end acceleration for recommender systems on the GPU. NVTabular addresses challenges such as processing huge datasets, managing complex data pipelines, and overcoming input bottlenecks, allowing data scientists and ML engineers to focus on data transformation rather than scaling issues. It significantly reduces the time required for feature engineering and preprocessing, with reported completion times of 13 minutes on a single V100 GPU and 3 minutes on a DGX-1 cluster for the Criteo 1TB Click Logs Dataset.
365-Days-Computer-Vision-Learning-Linkedin-Post
365-Days-Computer-Vision-Learning-Linkedin-Post is an open-source GitHub repository curated by Ashish Patel, offering a comprehensive, day-by-day learning journey through various computer vision concepts and models. Each entry in the repository corresponds to a LinkedIn post, providing a concise overview and a link to further resources on topics ranging from EfficientDet and YOLO Series to Vision Transformers, GANs, and advanced segmentation techniques. This resource is ideal for individuals looking to deepen their understanding of computer vision through a structured, accessible format, leveraging the power of community learning and readily available information.
3d-bat
3D-BAT (3D Bounding Box Annotation Tool) is an open-source, web-based platform designed for annotating 3D bounding boxes on point cloud and image data. It offers a comprehensive suite of features for efficient and accurate data labeling, including AI-assisted labeling, batch-mode editing, and interpolation for sequences. The tool supports full-surround annotations, 3D to 2D label transfer, automatic tracking, and various viewing options like side views and perspective/orthographic editing. With capabilities for custom dataset, class, and attribute support, along with HD map integration and OpenLABEL compatibility, 3D-BAT is ideal for researchers and developers working with multi-sensor data in fields like autonomous driving and robotics. It also includes features like auto-save, redo/undo, and keyboard-only annotation for a streamlined workflow.
One-DM
One-DM, or One-Shot Diffusion Mimicker, is an open-source AI tool designed for stylized handwritten text generation. It stands out by requiring only a single reference sample as style input to imitate a user's writing style and generate new handwritten text with arbitrary content. This addresses a common challenge in previous methods that struggled with accurate style extraction from limited samples. One-DM enhances style extraction by incorporating high-frequency components from the reference sample, effectively capturing writing patterns while suppressing background noise. Extensive experiments across English, Chinese, and Japanese handwriting datasets demonstrate its superior performance, even outperforming methods that use significantly more reference samples. The project provides code, datasets, and pre-trained models for easy setup and use.
pytorch-grad-cam
pytorch-grad-cam is an advanced AI explainability package for computer vision, built on PyTorch. It offers a comprehensive collection of Pixel Attribution methods, including GradCAM, HiResCAM, ScoreCAM, and many others, to help diagnose model predictions and understand their decision-making process. The tool supports a wide range of architectures, from common CNNs to Vision Transformers, and can be applied to advanced use cases such as classification, object detection, semantic segmentation, and embedding-similarity. It includes smoothing methods like `aug_smooth` and `eigen_smooth` to produce clearer CAMs, and boasts high performance with full support for batches of images. Additionally, pytorch-grad-cam provides metrics for evaluating the trustworthiness and performance of explanations, making it valuable for both model development and research into new explainability methods.
python-utcp
python-utcp is the official Python implementation of the Universal Tool Calling Protocol (UTCP), an open standard designed to allow AI agents to call any API directly, eliminating the need for additional middleware. It emphasizes scalability, extensibility, and interoperability, supporting a wide range of communication protocols through a modular, plugin-based architecture. Developers can easily integrate new protocols like HTTP, SSE, CLI, and more, or add custom tool storage and search strategies. The protocol is built on simple, well-defined Pydantic models, making it straightforward for developers to implement and use. This repository provides the core UTCP package, along with various protocol-specific plugins, and offers clear migration guides and usage examples for quick adoption.
pymarl
PyMARL is a Python-based, open-source framework developed by WhiRL for deep multi-agent reinforcement learning. It provides implementations of several prominent algorithms, including QMIX for monotonic value function factorisation, COMA for counterfactual multi-agent policy gradients, VDN for value-decomposition networks, IQL for independent Q-learning, and QTRAN for learning to factorize with transformation. The framework is built using PyTorch and uses SMAC (StarCraft Multi-Agent Challenge) as its environment; the results in the SMAC paper were produced with StarCraft II version SC2.4.6.2.69232. PyMARL supports saving and loading trained models, as well as watching StarCraft II replays, which makes it useful for researchers and developers in the multi-agent RL domain.
PoseFormer
PoseFormer is an open-source project that provides an official implementation of the paper "3D Human Pose Estimation with Spatial and Temporal Transformers," accepted at ICCV 2021. This tool is designed for researchers and developers working in the field of computer vision and human pose estimation. It offers code built on VideoPose3D, allowing users to evaluate pre-trained models with both CPN detected and ground truth 2D poses as input. Additionally, PoseFormer supports training new models from scratch, with configurable frame inputs to achieve varying levels of accuracy. The repository also links to related works like Context-Aware PoseFormer (NeurIPS 2023) and PoseFormerV2 (CVPR 2023), indicating ongoing research and development in this area.
RQ-VAE-Recommender
RQ-VAE-Recommender offers a PyTorch implementation of a generative retrieval model, specifically designed for recommender systems. The model operates in two stages: first, it maps items in a corpus to a tuple of semantic IDs by training an RQ-VAE. Second, it tokenizes sequences of these semantic IDs using a frozen RQ-VAE and then trains a transformer-based model to predict the next IDs in the sequence. This approach is based on the research presented in "Recommender Systems with Generative Retrieval." It supports various datasets, including Amazon Reviews (Beauty, Sports, Toys), MovieLens 1M, and MovieLens 32M, and provides both RQ-VAE and decoder-only retrieval model training scripts. Pre-trained checkpoints are available on Hugging Face for Amazon Beauty.
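The first stage, mapping an item to a tuple of semantic IDs, rests on residual quantization. The sketch below shows only that step in plain Python and makes several assumptions: the codebooks are hand-picked rather than learned, the encoder that would produce the item embedding is omitted, and `nearest`, `semantic_id`, and `CODEBOOKS` are hypothetical names that do not come from the repository.

```python
def nearest(codebook, vector):
    """Index of the codebook entry closest to `vector` (squared Euclidean distance)."""
    def sq_dist(code):
        return sum((c - v) ** 2 for c, v in zip(code, vector))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def semantic_id(codebooks, embedding):
    """Quantize the embedding level by level; each level encodes the leftover residual."""
    residual = list(embedding)
    ids = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        ids.append(idx)
        # Subtract the chosen code so the next level only sees what is still unexplained.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tuple(ids)


# Toy, hand-picked codebooks: a coarse level followed by a finer one.
# In the real model these would be learned while training the RQ-VAE.
CODEBOOKS = [
    [[0.0, 0.0], [4.0, 4.0]],
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
]

item_a = semantic_id(CODEBOOKS, [4.9, 4.1])
item_b = semantic_id(CODEBOOKS, [0.1, 0.8])
print(item_a, item_b)
```

In the second stage, tuples like these would play the role of tokens: a user's interaction history becomes a sequence of semantic IDs, and the transformer is trained to predict the IDs of the next item.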
Reproducible-Deep-Compressive-Sensing
Reproducible-Deep-Compressive-Sensing is a comprehensive collection of source code dedicated to deep learning-based compressive sensing (DCS). This repository categorizes and provides access to numerous research works, offering links to their respective source code, PDF papers, and DOIs. The collection is organized based on key characteristics such as sampling matrix type (frame-based/block-based), sampling scale (single scale, multi-scale), and the deep learning platform used. It also includes code for image and video reconstruction, as well as other related applications. This resource is invaluable for researchers and developers looking to explore, reproduce, or build upon existing deep learning models in compressive sensing.
snorkel
Snorkel is an open-source system designed for the rapid generation of training data using weak supervision. Originating from Stanford in 2015, the project aimed to bring mathematical and systems structure to the often manual process of training data creation. It empowers users to programmatically label, build, and manage training data, addressing the critical role of data quality in machine learning project success. While the original Snorkel project is no longer actively developed, its core ideas and techniques have evolved into Snorkel Flow, an end-to-end AI application development platform. Snorkel is particularly useful for developers and data scientists looking to efficiently create large, labeled datasets for various machine learning tasks.
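Programmatic labeling can be illustrated with a minimal sketch: several noisy heuristics each vote on an example or abstain, and the votes are combined into one training label. Everything here is an assumption made for illustration, including the labeling functions, the label constants, and the plain majority vote used as the combiner. It does not use Snorkel's API, and a simple majority vote is a deliberately simplified stand-in for a learned label model.

```python
from collections import Counter

ABSTAIN, HAM, SPAM = -1, 0, 1


def lf_contains_link(text):
    """Heuristic: messages with links are often spam."""
    return SPAM if "http" in text else ABSTAIN


def lf_mentions_prize(text):
    """Heuristic: prize announcements are often spam."""
    return SPAM if "prize" in text.lower() else ABSTAIN


def lf_short_greeting(text):
    """Heuristic: very short messages are usually harmless."""
    return HAM if len(text.split()) <= 3 else ABSTAIN


LABELING_FUNCTIONS = [lf_contains_link, lf_mentions_prize, lf_short_greeting]


def weak_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


examples = [
    "Claim your prize at http://example.com today",
    "hi there",
    "see you at the meeting tomorrow",
]
labels = [weak_label(text) for text in examples]
print(labels)
```

Examples on which every heuristic abstains stay unlabeled, so coverage grows by writing more labeling functions rather than by hand-labeling more rows.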
Chatterbox Labs
Red Hat is a leading provider of enterprise open source solutions, offering a comprehensive suite of technologies for Linux, cloud, containerization, and Kubernetes. The platform supports hybrid cloud innovation with Red Hat Enterprise Linux and enables scalable application development with Red Hat OpenShift. For AI, Red Hat offers specialized products like Red Hat AI Enterprise, Red Hat AI Inference Server, and Red Hat OpenShift AI, designed to help businesses build, deploy, and monitor AI models and applications efficiently. The company emphasizes a community-powered approach, collaborating with open source communities to develop secure, stable, and innovative technologies, and provides extensive support, training, and consulting services.
Data Wizards
Data Wizards is an AI consulting firm specializing in helping corporates and ambitious SMEs unlock their business potential through expert AI solutions. They provide comprehensive services including AI strategy development, AI solution and development, and AI education. Data Wizards builds high-performing AI solutions to overcome challenges, streamline operations, and identify new growth opportunities. Their expertise spans various industries such as Automotive, Retail, Pharmaceutical, Manufacturing, Insurance, Financial, Logistics, Energy, Healthcare, Telecommunications, Media, SMEs, Security, Commodity, and Food, offering tailored applications like predictive maintenance, sales forecasts, customer churn analysis, and fraud detection.