Coding & Development
Browsing page 86 of AI tools for Open Source & Models in Coding & Development. Sorted by confidence score — our independent quality rating.
KoAlpaca
KoAlpaca is an open-source language model designed to understand Korean instructions, built with the Stanford Alpaca model's training methodology. It offers several models, including those based on Polyglot-ko (5.8B and 12.8B) for stronger Korean performance and LLaMA (7B, 13B, 30B, 65B) for broader language capabilities. The project provides code examples for running the models via the Hugging Face pipeline and Gradio, along with detailed instructions for dataset creation (v1.0 and v1.1), which involved translating Stanford Alpaca data and generating new data from Naver Q&A using ChatGPT. KoAlpaca aims to improve on the original Alpaca's tendency toward short answers and its weak context understanding, particularly for Korean language tasks.
agent-zero
Agent Zero is an open-source AI framework designed for creating dynamic, organically growing, and learning AI agents. It is fully transparent, readable, comprehensible, customizable, and interactive, allowing users to define agent behavior through system prompts and message templates. Agents can use the operating system as a tool, writing their own code and using the terminal to create and utilize custom tools as needed. Key features include persistent memory, multi-agent cooperation with a superior-subordinate structure, and browser automation. The framework supports the open SKILL.md standard for portable agent capabilities, making it compatible with various AI models. It is fully Dockerized and offers a clean, interactive Web UI with real-time output streaming, making it suitable for both technical users and those focused on prompting and communication skills.
EVI Safety Technology
EVI Safety Technology offers AI-powered CCTV analytics and comprehensive investigation tools designed to enhance maritime safety and operational efficiency. The system continuously monitors vessel activity and crew movement, using AI to recognize unsafe behaviors, near-miss situations, and operational risks in real-time. It integrates with existing CCTV infrastructure, including IP cameras and NVRs, and operates fully offline at sea, syncing compressed data when connectivity is available. EVI supports a full investigation path by combining video evidence, crew input, and incident data, facilitating structured reporting for HSE performance reviews and fleet-wide safety improvements. It is suitable for various vessel types, including tankers, bulk carriers, and container ships, and helps meet maritime compliance and reporting standards like ISM, ISPS, and IMO requirements.
TokenOwl AI
TokenOwl AI offers an intelligent solution for managing cryptocurrency portfolios by leveraging artificial intelligence. Users can connect their wallets and exchanges, allowing the AI to automate and streamline various financial tasks. The platform focuses on making tax reporting easier through smarter transaction labeling and intuitive reconciliation processes. Additionally, TokenOwl AI includes an AI Assistant that can immediately answer user questions in natural language, providing a powerful tool for analysis and trading. This aims to simplify the complexities of crypto portfolio management and provide deeper insights for users.
swift-coreml-diffusers
swift-coreml-diffusers is a native SwiftUI application that showcases the integration of Apple's Core ML Stable Diffusion implementation. This open-source project simplifies the Stable Diffusion implementation from the diffusers library, making it a useful resource for developers looking for sample code or faster iteration on their own projects. On first launch, the application downloads a Core ML version of Stability AI's Stable Diffusion v2 base, and it uses a fast DPM-Solver++ scheduler, ported to Swift, for quicker inference. The baseline requirements are macOS Ventura 13.1, iOS/iPadOS 16.2, and Xcode 14.2; models quantized with coremltools version 7 or later additionally require macOS 14 or iOS/iPadOS 17. Performance figures are provided for various Apple devices.
Neurospace
Neurospace specializes in developing custom AI and machine learning solutions to help businesses improve efficiency and reduce costs. They offer services ranging from data engineering and strategy to MLOps and AI engineering. Their approach involves building secure data platforms, implementing predictive maintenance, and optimizing processes through machine learning. Neurospace emphasizes a collaborative approach, working closely with domain experts to ensure solutions are relevant and effective. They also focus on data security, ethical AI development, and compliance with regulations like the EU AI Act and NIS-2 directives. Additionally, Neurospace provides educational programs like AI Camp to help companies build internal knowledge around data and AI.
Tuned Lens
Tuned Lens is an AI tool developed by AlignmentResearch, available as a Hugging Face Space, designed to help users understand the internal workings of transformer models. It provides a visual interface to explore how these models process text, offering insights into their predictions and statistical computations. Users can input text and select various visualization options to observe the model's internal states and decision-making processes. This tool is particularly useful for AI researchers and machine learning engineers who need to debug models, analyze their behavior, and improve their performance by gaining a deeper understanding of their internal mechanisms.
Roadio
Roadio is an advanced camera system designed for two-wheeled vehicles like motorcycles, mopeds, and e-bikes, enhancing rider safety through AI-powered perception models. Its industry-leading AI interprets visual context to deliver 360° awareness and adaptive collision alerts in real-time. The system features an end-to-end AI model for Advanced Rider Assistance Systems (ARAS) on edge devices, capable of predicting hazards 5 seconds into the future and outperforming radar-based systems. Roadio supports a comprehensive suite of ARAS features, including Rear Collision Warning (RCW), Blind Spot Detection (BSD), Traffic Sign Recognition (TSR), and Forward Collision Warning (FCW). It supports 4K resolution for optimal AI performance and accurate capture of details, and is designed for seamless OEM integration with low processing power requirements.
Haven
Haven, through its Midrender platform, provides a visual editor for motion graphics powered by the open-source Revideo animation engine. Revideo is a TypeScript framework for programmatic video creation, enabling headless rendering, audio support, and a library-first API. Midrender brings these capabilities into a visual interface, enhanced with AI that can understand and edit compositions. Users can generate videos from code, deploy rendering as serverless functions, or build custom video editors. Midrender also supports MCP, allowing connection to agents like Claude Code or Cursor for motion content creation from a terminal. It's ideal for video ads at scale, automated social media content, and custom video tooling.
csghub
CSGHub is a brand-new open-source platform developed by the OpenCSG team for comprehensive management of Large Language Models (LLMs). It provides an efficient way to handle the entire lifecycle of LLMs and their associated assets, including datasets, spaces, and code. Users can upload, download, store, verify, and distribute LLM assets like DeepSeek and Llama via a web interface, Git command line, a natural language Chatbot, or the CSGHub SDK. The platform also features microservice submodules and standardized OpenAPIs for seamless integration. CSGHub aims to provide a user-friendly management platform specifically for LLMs, with the capability for on-premise deployment for secure, offline operation, essentially serving as a private, on-premise version of Hugging Face.
Hulu-Med
Hulu-Med is a transparent, open-source generalist model designed for holistic medical vision-language understanding. It unifies understanding across diverse modalities including medical text, 2D/3D images, and surgical videos. Built with a focus on transparency and accessibility, Hulu-Med achieves state-of-the-art performance on 30 medical benchmarks, trained entirely on public data. Key features include holistic multimodal understanding, a fully open-source pipeline, and efficient training. It supports 12 major anatomical systems and 14 medical imaging modalities, covering diverse downstream tasks like medical report generation and anomaly detection. The model is available in various parameter scales (4B to 235B) and is compatible with HuggingFace Transformers and vLLM for easier integration and faster inference.
Rawbot
Rawbot is a platform for comparing and evaluating AI models. It helps users identify the models best suited to their specific research, development, or business needs by presenting the strengths and weaknesses of different models side by side, so that model selection rests on an informed comparison.
lightwood
Lightwood is an AutoML framework designed to streamline the machine learning (ML) lifecycle by allowing users to generate and customize ML pipelines through a declarative syntax called JSON-AI. It abstracts the ML pipeline into three core steps: pre-processing and data cleaning, feature engineering, and model building and training. Lightwood automatically identifies data types, performs cleaning, and splits data. It supports various data types including numbers, dates, categories, text, and multimedia, and offers a time-series mode. Users can override default behaviors, customize encoders, and integrate their own models, making it highly flexible for unique and custom ML tasks. The framework generates Python code from JSON-AI objects, enabling end-to-end training and prediction with pandas DataFrames.
llama3.np
llama3.np offers a pure NumPy implementation of the Llama 3 model, making it an excellent resource for researchers and developers interested in understanding the underlying architecture of large language models. The project was validated using the stories15M model trained by Andrej Karpathy, ensuring an accurate and reliable implementation. It provides a straightforward way to run the Llama 3 model using Python and NumPy, demonstrating the core mechanics without complex dependencies. This tool is particularly valuable for academic research and educational contexts, allowing for detailed exploration and experimentation with the Llama 3 model's components.
LION
LION (Latent Point Diffusion Models for 3D Shape Generation) is an open-source project presented at NeurIPS 2022, offering a robust framework for generating 3D shapes. This tool leverages advanced diffusion models to create 3D point clouds, enabling researchers and developers to explore and innovate in the field of 3D content creation. It includes functionalities for training VAE and diffusion prior models, with options for conditioning inputs like CLIP image embeddings for tasks such as single-view reconstruction or text-to-shape generation. The project provides detailed installation instructions, demo scripts, and evaluation tools, making it a valuable resource for those working with 3D shape synthesis and analysis.
LightReasoner
LightReasoner is an open-source research tool that takes a different approach to how large language models (LLMs) acquire reasoning capabilities. It uses small language models (SLMs) to identify critical reasoning moments, allowing the LLM to focus its learning on them. The project reports better performance than traditional Supervised Fine-Tuning (SFT) while reducing total training time by 90%, sampled problems by 80%, and tuned tokens by 99%. The framework consists of a three-stage process: critical step selection via Expert-Amateur KLD detection, contrastive supervision, and self-distillation. The results suggest that strategic token selection, rather than exhaustive training, is key to unlocking latent LLM reasoning potential.
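The critical step selection stage can be illustrated with a minimal sketch. It assumes that each reasoning step has a next-token distribution from both the expert (LLM) and the amateur (SLM), and that a step counts as critical when KL(expert || amateur) exceeds a threshold. The function names, the KL direction, and the threshold value are illustrative assumptions, not the project's actual code.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def select_critical_steps(expert_dists, amateur_dists, threshold):
    """Return the indices of steps where expert and amateur disagree strongly."""
    return [
        i
        for i, (p, q) in enumerate(zip(expert_dists, amateur_dists))
        if kl_divergence(p, q) > threshold
    ]

# Toy next-token distributions over a 3-token vocabulary at 3 steps (made-up values).
expert = [[0.9, 0.05, 0.05], [0.4, 0.3, 0.3], [0.05, 0.9, 0.05]]
amateur = [[0.85, 0.1, 0.05], [0.4, 0.3, 0.3], [0.6, 0.2, 0.2]]

# Hypothetical threshold; only the last step shows a large disagreement.
critical = select_critical_steps(expert, amateur, threshold=0.4)
print(critical)
```

Only the selected steps would then receive supervision, which is how a large share of tokens can be skipped during tuning.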
LightCompress
LightCompress is an open-source toolkit designed for compressing large AI models such as Large Language Models (LLMs), Vision-Language Models (VLMs), and video generative models. It offers a comprehensive suite of state-of-the-art compression algorithms, including various quantization methods (integer, floating-point, mixed-precision) and sparsity techniques (structured, unstructured). The tool supports a wide array of popular models like LLaMA, Mistral, and DeepSeek-V2, and is compatible with multiple inference backends such as vLLM, SGLang, and AutoAWQ. LightCompress aims to significantly reduce model size and improve inference efficiency while maintaining high accuracy, making it well suited to deploying large models on resource-constrained hardware.
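As a rough illustration of what integer quantization does, the sketch below applies symmetric per-tensor int8 quantization to a short list of weights and then restores them. It is a generic textbook scheme written in plain Python, not LightCompress code; the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map the stored integers back to approximate float values."""
    return [v * scale for v in q]

weights = [0.02, -1.27, 0.5, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value differs from the original by at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

Storing one integer per weight plus a single scale is what shrinks the model; the rounding error bounded by half a step is the accuracy cost that more elaborate methods try to reduce.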
DCRNN
DCRNN is an open-source TensorFlow implementation of the Diffusion Convolutional Recurrent Neural Network, a model designed for data-driven traffic forecasting and described in the ICLR 2018 paper by Li et al. It allows users to prepare traffic data, construct graphs based on sensor networks, and train or run pre-trained models for prediction. The repository includes scripts for data preparation, graph generation, and model training on datasets like METR-LA and PEMS-BAY. Beyond traffic, DCRNN variants have been applied to neuroimaging, air quality forecasting, and internet traffic forecasting, showing its versatility in spatiotemporal forecasting tasks.
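The idea of diffusing a signal over a sensor graph can be sketched in a few lines. The example below assumes a small weighted adjacency matrix, row-normalizes it into random-walk transition probabilities, and sums a few diffusion steps with fixed weights. It covers one diffusion direction only and uses hand-picked weights in place of learned parameters, so it is a simplified illustration rather than the repository's implementation.

```python
def transition_matrix(adj):
    """Row-normalize a weighted adjacency matrix into random-walk probabilities."""
    out = []
    for row in adj:
        total = sum(row)
        out.append([w / total if total else 0.0 for w in row])
    return out

def matvec(m, x):
    """Multiply matrix m by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def diffuse(adj, signal, thetas):
    """Compute sum_k thetas[k] * P^k @ signal for k = 0 .. len(thetas) - 1."""
    p = transition_matrix(adj)
    result = [0.0] * len(signal)
    current = list(signal)  # P^0 @ signal
    for theta in thetas:
        result = [r + theta * c for r, c in zip(result, current)]
        current = matvec(p, current)
    return result

# Three sensors on a line: 0 - 1 - 2 (hypothetical road segment).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
speeds = [60.0, 30.0, 60.0]
smoothed = diffuse(adj, speeds, thetas=[0.5, 0.5])
print(smoothed)
```

Each output value mixes a sensor's own reading with readings from its graph neighbours, which is how the spatial structure of the road network enters the forecast.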
d2l-pytorch
d2l-pytorch is an open-source project that meticulously reproduces the content of the acclaimed "Dive Into Deep Learning" book, translating its original MXNet code examples into PyTorch. This adaptation offers students and researchers a valuable resource for understanding and implementing deep learning concepts using the widely adopted PyTorch framework. The repository covers a comprehensive range of topics, from foundational preliminaries like data manipulation and linear algebra to advanced subjects such as convolutional neural networks, recurrent neural networks, attention mechanisms, and various optimization algorithms. It serves as a practical, hands-on guide for learning deep learning through code.
llm.pdf
llm.pdf is a proof-of-concept project showcasing the ability to run an entire Large Language Model (LLM) within a PDF file. This innovative approach leverages Emscripten to compile llama.cpp into asm.js, enabling the LLM to execute directly within the PDF environment through an old PDF JS injection method. The entire LLM file is embedded into the PDF using base64 encoding, allowing for self-contained LLM inference. While currently a proof-of-concept, it highlights the potential for highly portable and self-sufficient AI applications. Users can generate custom PDFs with compatible GGUF quantized models, with 135M parameter models taking approximately 5 seconds per token for input/output.
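The base64 embedding step can be sketched as follows. The example encodes a byte string that stands in for a model file, wraps it in a JavaScript string assignment, and decodes it again. The variable name MODEL_B64 and both helper functions are hypothetical; the project's own generator script may work differently.

```python
import base64

def embed_as_js(model_bytes, var_name="MODEL_B64"):
    """Encode binary data as base64 and wrap it in a JavaScript string assignment."""
    b64 = base64.b64encode(model_bytes).decode("ascii")
    return f'var {var_name} = "{b64}";'

def extract_from_js(js_source):
    """Recover the original bytes from the generated JavaScript line."""
    start = js_source.index('"') + 1
    end = js_source.rindex('"')
    return base64.b64decode(js_source[start:end])

fake_model = bytes(range(256))  # stand-in for a GGUF file
js = embed_as_js(fake_model)
round_trip = extract_from_js(js)
print(len(fake_model), len(js))
```

Base64 represents every 3 bytes as 4 ASCII characters, so the embedded model takes about a third more space than the original file.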
long-context-attention
long-context-attention, also known as Unified Sequence Parallelism (USP) or Hybrid Sequence Parallelism, offers a novel approach to training and inference for long context Large Language Models (LLMs). This open-source project synergizes the strengths of DeepSpeed-Ulysses-Attention and Ring-Attention, addressing their individual limitations. Ulysses-Attention is sensitive to the number of attention heads and less suitable for GQA/MQA scenarios, while Ring-Attention can be less efficient in computation and communication. LongContextAttention provides a more general, versatile, and performant solution. It supports various FlashAttention versions (v2, v3) and can even run without FlashAttention for NPUs. The tool includes functionalities for setting process groups, extracting local tensors, and offers different ring implementation types like 'zigzag' and 'basic'. It has been verified in Megatron-LM and applied in several other projects, providing a robust solution for researchers and developers working with long context generative AI.
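The difference between the 'basic' and 'zigzag' ring layouts can be illustrated with token positions alone. The sketch below assumes that 'basic' gives each rank one contiguous slice, while 'zigzag' splits the sequence into twice as many chunks as ranks and pairs an early chunk with a late one. It also assumes the sequence length divides evenly. The function names are illustrative and are not the library's API.

```python
def basic_partition(seq_len, world_size):
    """Give each rank one contiguous slice of the sequence."""
    chunk = seq_len // world_size
    return [list(range(r * chunk, (r + 1) * chunk)) for r in range(world_size)]

def zigzag_partition(seq_len, world_size):
    """Split into 2 * world_size chunks; rank r takes chunk r and its mirror chunk."""
    chunk = seq_len // (2 * world_size)
    chunks = [list(range(i * chunk, (i + 1) * chunk)) for i in range(2 * world_size)]
    return [chunks[r] + chunks[2 * world_size - 1 - r] for r in range(world_size)]

def causal_work(positions):
    """Count the (query, key) pairs a causal mask allows for these query positions."""
    return sum(p + 1 for p in positions)

basic = basic_partition(8, 2)
zigzag = zigzag_partition(8, 2)
basic_load = [causal_work(p) for p in basic]
zigzag_load = [causal_work(p) for p in zigzag]
print(basic_load, zigzag_load)
```

Under a causal mask, later tokens attend to more keys, so contiguous slices leave the last rank with the most work; pairing early and late chunks evens out the load across ranks.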
DDNM
DDNM, or Denoising Diffusion Null-Space Model, is a cutting-edge AI tool for zero-shot image restoration, presented at ICLR 2023. It excels at solving a wide range of image restoration tasks, including super-resolution, denoising, colorization, inpainting, deblurring, and compressed sensing, all without requiring specific optimization or training. The tool supports arbitrary image sizes and offers both an SVD-based version for precise noisy tasks and a simplified version for flexible user-defined degradations. DDNM also provides functionalities for real-world applications like old photo restoration and enhancing degraded images, allowing users to define degradation operators and noise levels for customized results.
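The null-space idea behind the name can be sketched with a toy degradation operator. The example assumes a 1-D signal, a degradation A that averages neighbouring pairs (2x downsampling), and its pseudo-inverse A^+ that repeats each value twice. It then combines the measurement y with a generated estimate as x_hat = A^+ y + (I - A^+ A) x, so that x_hat stays consistent with y. The generated estimate here is a made-up list standing in for a diffusion model's output, and the helper names are hypothetical.

```python
def degrade(x):
    """A: average each pair of neighbours (2x downsampling)."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def pseudo_inverse(y):
    """A^+: repeat each value twice."""
    return [v for v in y for _ in range(2)]

def range_null_combine(y, x_generated):
    """x_hat = A^+ y + (I - A^+ A) x_generated."""
    range_part = pseudo_inverse(y)
    projected = pseudo_inverse(degrade(x_generated))
    null_part = [g - p for g, p in zip(x_generated, projected)]
    return [r + n for r, n in zip(range_part, null_part)]

y = [2.0, 6.0]                      # observed low-resolution signal
x_generated = [1.0, 5.0, 4.0, 4.0]  # stand-in for a generative model's estimate
x_hat = range_null_combine(y, x_generated)
print(x_hat, degrade(x_hat))
```

The range-space part is fixed by the observation, while the null-space part carries detail from the generative model that the degradation cannot see, which is why no task-specific training is needed in this formulation.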
LookaheadDecoding
LookaheadDecoding is an open-source project designed to significantly accelerate Large Language Model (LLM) inference by breaking the traditional sequential dependency of token generation. This innovative approach utilizes a parallel decoding algorithm, eliminating the need for a draft model or a separate data store. Motivated by Jacobi decoding, LookaheadDecoding collects and caches n-grams from Jacobi iteration trajectories, enabling simultaneous processing of future tokens. The process is divided into a lookahead branch, which generates new n-grams within a defined window, and a verification branch, which validates promising candidates. This method has demonstrated substantial latency reductions, achieving speedups ranging from 1.5x to 2.3x on various datasets and models. The tool supports sampling and FlashAttention, and is implemented with an attention mask to maximize GPU parallel computing power, making it a valuable resource for optimizing LLM performance.
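The n-gram pool and the verification branch can be sketched with a toy model. The example assumes greedy decoding, a pool keyed by each n-gram's first token, and a hypothetical stand-in for the model's next-token function. It verifies candidates one token at a time for readability, whereas the project is described as verifying in parallel using an attention mask.

```python
from collections import defaultdict

def build_pool(trajectory, n):
    """Collect n-grams from a token trajectory, keyed by their first token."""
    pool = defaultdict(set)
    for i in range(len(trajectory) - n + 1):
        gram = tuple(trajectory[i:i + n])
        pool[gram[0]].add(gram[1:])
    return pool

def verify(context, candidates, next_token):
    """Return the longest candidate prefix that the model itself would generate."""
    best = []
    for cand in candidates:
        accepted = []
        for tok in cand:
            if next_token(context + accepted) != tok:
                break
            accepted.append(tok)
        if len(accepted) > len(best):
            best = accepted
    return best

def toy_next_token(context):
    """Hypothetical stand-in for an LLM's greedy step: counts upward modulo 10."""
    return (context[-1] + 1) % 10

# Made-up trajectory standing in for tokens seen during Jacobi iterations.
pool = build_pool([1, 2, 3, 4, 2, 3, 9], n=3)
context = [0, 1, 2]
accepted = verify(context, sorted(pool[context[-1]]), toy_next_token)
print(accepted)
```

Because only tokens the model would have produced anyway are accepted, the output matches ordinary greedy decoding, and the speedup comes from accepting several tokens per step.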
deepvoice3_pytorch
deepvoice3_pytorch provides a PyTorch implementation of convolutional neural networks for text-to-speech synthesis, based on the Deep Voice 3 architecture. It supports both multi-speaker and single-speaker models, offering pre-trained models and preprocessors for datasets like LJSpeech (English), JSUT (Japanese), and VCTK (English). The tool allows users to preprocess data, train models, and synthesize audio from text. It also includes features like guided attention, binary divergence for stable training, and support for custom datasets in JSON format. Users can monitor training progress with TensorBoard and check out specific Git commits for compatibility with pre-trained models.