Coding & Development
Python
This GitHub repository, Tanu-N-Prabhu/Python, is an open-source resource for learning Python and Machine Learning, aimed at everyone from novices to seasoned developers. It includes materials on basic Python concepts, built-in functions, popular libraries like NumPy and Pandas, and various APIs such as Google Translate and Wikipedia. It also covers Machine Learning foundations, supervised and unsupervised learning, neural networks, and MLOps. Additionally, it provides Data Science materials, including EDA techniques and real-world data analysis questions with Python answers. The resource emphasizes practical application through hands-on exercises and real-world examples.
python-machine-learning-book-2nd-edition
The python-machine-learning-book-2nd-edition repository serves as the official code and information resource for the second edition of the "Python Machine Learning" book. It provides comprehensive code examples, including Jupyter notebooks and Python scripts, for various machine learning algorithms and applications. Users can explore topics such as classification, dimensionality reduction, model evaluation, ensemble learning, sentiment analysis, regression, clustering, and deep learning with TensorFlow. The resource is ideal for students and professionals looking to implement machine learning concepts using Python, offering a practical, hands-on approach to learning.
python-ml-course
python-ml-course is an open-source educational resource designed to introduce individuals to Machine Learning using Python. The comprehensive course covers a wide range of topics, from basic Python installation and data preprocessing to advanced concepts like Deep Learning and Reinforcement Learning. It includes practical exercises, real-world datasets, and all source code on GitHub, making it suitable for hands-on learning. The course is taught by Juan Gabriel Gomila, a professional in Data Science, and aims to make complex mathematical theories and algorithms accessible. It caters to students, programmers, and data analysts looking to specialize or enhance their skills in the lucrative field of Data Science.
RemoteCLIP
RemoteCLIP is the official repository for the paper "RemoteCLIP: A Vision Language Foundation Model for Remote Sensing." This tool addresses limitations in existing remote sensing models by learning robust visual features with rich semantics and aligned text embeddings, crucial for retrieval and zero-shot applications. It leverages data scaling and conversion of heterogeneous annotations, incorporating UAV imagery to create a significantly larger pre-training dataset. RemoteCLIP supports diverse downstream tasks including zero-shot image classification, linear probing, k-NN classification, few-shot classification, image-text retrieval, and object counting, consistently outperforming baseline foundation models across various scales and datasets.
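The zero-shot classification mentioned above relies on image and text embeddings living in the same space. The following is a minimal sketch of that idea, not RemoteCLIP's own code: the three-dimensional vectors, the prompt strings, and the `temperature` value are made-up illustrations, and a real model would produce the embeddings with its image and text encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def zero_shot_classify(image_emb, text_embs, temperature=0.07):
    """Pick the text prompt whose embedding is closest to the image embedding.

    text_embs maps a prompt string to its (hypothetical) embedding.
    Returns the best prompt and a softmax distribution over all prompts.
    """
    labels = list(text_embs)
    logits = [cosine(image_emb, text_embs[label]) / temperature for label in labels]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs

# Toy embeddings standing in for encoder outputs (assumed values).
text_embs = {
    "an aerial photo of an airport": [0.9, 0.1, 0.0],
    "an aerial photo of a forest": [0.1, 0.9, 0.1],
    "an aerial photo of a harbor": [0.0, 0.2, 0.9],
}
image_emb = [0.8, 0.2, 0.1]
label, probs = zero_shot_classify(image_emb, text_embs)
```

No class-specific training is involved: adding a new class only requires adding a new prompt and its embedding.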
Physics-Informed-Neural-Networks
Physics-Informed-Neural-Networks (PINNs) is a research repository dedicated to investigating and implementing PINNs for solving Partial Differential Equations (PDEs). It integrates the physics of the PDE and boundary conditions directly into the neural network's loss function, utilizing the Mean-Squared Error of the PDE and boundary residual measured on 'collocation points'. The repository currently offers implementations for Burgers' and Helmholtz PDEs in both TensorFlow 2 and PyTorch. It also explores various aspects of PINNs, including the effectiveness of the L-BFGS optimizer for stiff PDEs, bottom-up learning mechanisms, and the impact of transfer learning on solution error, providing valuable insights for researchers and practitioners in scientific computing.
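To make the composite loss described above concrete, here is a small sketch under stated assumptions. It uses a one-dimensional Helmholtz-type equation u'' + k^2 u = 0 chosen only for illustration, evaluates plain Python functions instead of a neural network, and replaces automatic differentiation with a central finite difference. The function names are hypothetical and are not taken from the repository.

```python
import math

def second_derivative(u, x, h=1e-3):
    """Central finite-difference estimate of u''(x); a stand-in for autodiff."""
    return (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)

def pinn_style_loss(u, k, collocation_points, boundary):
    """MSE of the PDE residual at collocation points plus MSE of the boundary mismatch.

    boundary is a list of (x, target_value) pairs.
    """
    pde_residuals = [second_derivative(u, x) + k * k * u(x) for x in collocation_points]
    mse_pde = sum(r * r for r in pde_residuals) / len(pde_residuals)
    bc_residuals = [u(x) - target for x, target in boundary]
    mse_bc = sum(r * r for r in bc_residuals) / len(bc_residuals)
    return mse_pde + mse_bc

k = 2.0
points = [i / 20 for i in range(1, 20)]           # interior collocation points
boundary = [(0.0, 0.0), (1.0, math.sin(k))]        # Dirichlet values at both ends

exact = lambda x: math.sin(k * x)                  # satisfies the PDE and the boundary
wrong = lambda x: x * math.sin(k)                  # satisfies the boundary only

loss_exact = pinn_style_loss(exact, k, points, boundary)
loss_wrong = pinn_style_loss(wrong, k, points, boundary)
```

The candidate that matches the boundary but violates the equation receives a large loss, which is the signal a PINN's optimizer would follow when adjusting network weights.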
practical-machine-learning-with-python
Practical Machine Learning with Python offers a structured three-tiered approach to learning machine learning and deep learning. This resource, based on a book, spans over 500 pages and helps readers master the skills needed to recognize and solve complex problems with a data-driven mindset. It uses real-world case studies and the Python machine learning ecosystem, including frameworks like scikit-learn, pandas, statsmodels, spaCy, NLTK, Gensim, TensorFlow, and Keras. The content covers machine learning concepts, the Python ecosystem, standard pipelines, and case studies across domains like retail, finance, and computer vision, making it well suited to practitioners.
PointMamba
PointMamba is an open-source state space model (SSM) specifically designed for point cloud analysis, leveraging the success of Mamba from natural language processing. Unlike traditional Transformers, PointMamba employs a linear complexity algorithm, enabling global modeling while substantially reducing computational costs and GPU memory usage. This tool utilizes space-filling curves for efficient point tokenization and features a simple, non-hierarchical Mamba encoder as its backbone. Comprehensive evaluations demonstrate its superior performance across various datasets, making it a valuable resource for researchers and developers in 3D vision. PointMamba underscores the potential of SSMs in 3D vision-related tasks and provides a robust baseline for future research.
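The idea of ordering an unordered point cloud along a space-filling curve can be sketched in a few lines. The description above does not say which curve PointMamba uses, so this example swaps in the Z-order (Morton) curve because it is the simplest to write down; the function names and the 10-bit grid resolution are assumptions for illustration.

```python
def interleave_bits(x, y, z, bits=10):
    """Build a Morton code by interleaving the bits of three grid coordinates."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code

def serialize_points(points, bits=10):
    """Sort 3D points with coordinates in [0, 1] along a Z-order curve."""
    max_cell = (1 << bits) - 1

    def morton_key(point):
        # Snap each coordinate to an integer grid cell, clamped to the valid range.
        cells = [min(max_cell, max(0, int(c * max_cell))) for c in point]
        return interleave_bits(*cells, bits=bits)

    return sorted(points, key=morton_key)

points = [(0.9, 0.9, 0.9), (0.1, 0.1, 0.1), (0.9, 0.1, 0.1), (0.1, 0.9, 0.1)]
ordered = serialize_points(points)
```

After sorting, points that are close on the curve tend to be close in space, which gives a sequence model a meaningful one-dimensional token order.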
rag-tutorial-v2
rag-tutorial-v2 is an open-source tutorial designed to guide users through the process of building Retrieval Augmented Generation (RAG) systems. This improved version (v2) focuses on practical implementation, incorporating local LLMs for enhanced privacy and control, and demonstrating effective database update strategies. The tutorial also emphasizes robust testing methodologies to ensure the reliability and performance of the RAG system. It's a valuable resource for developers and researchers looking to understand and implement advanced RAG techniques, offering a hands-on approach to integrating LLMs with external knowledge bases.
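One plausible database update strategy is to give every chunk a deterministic ID and insert only the IDs the store has not seen. The sketch below illustrates that pattern with a plain dictionary standing in for a vector database; the `source:page:index` ID scheme and both function names are assumptions for illustration, not details confirmed by the tutorial.

```python
def chunk_id(source, page, index):
    """Deterministic ID so the same chunk always maps to the same key."""
    return f"{source}:{page}:{index}"

def add_new_chunks(store, chunks):
    """Insert only chunks whose ID is not yet in the store; return the added IDs."""
    added = []
    for chunk in chunks:
        cid = chunk_id(chunk["source"], chunk["page"], chunk["index"])
        if cid not in store:
            store[cid] = chunk["text"]
            added.append(cid)
    return added

store = {}  # stand-in for a real vector store
first = add_new_chunks(store, [
    {"source": "rules.pdf", "page": 1, "index": 0, "text": "Setup."},
    {"source": "rules.pdf", "page": 1, "index": 1, "text": "Turns."},
])
second = add_new_chunks(store, [
    {"source": "rules.pdf", "page": 1, "index": 1, "text": "Turns."},    # already stored
    {"source": "rules.pdf", "page": 2, "index": 0, "text": "Scoring."},  # new
])
```

Re-running the ingestion step then becomes idempotent: existing chunks are skipped, so only new pages incur embedding cost.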
promptbench
PromptBench is a PyTorch-based Python package designed as a unified evaluation framework for large language models (LLMs). It offers user-friendly APIs for researchers and developers to conduct comprehensive evaluations of LLMs, including quick performance assessments, prompt engineering method testing (like Chain-of-Thought, Emotion Prompt, and Expert Prompting), and adversarial prompt robustness analysis. The framework integrates dynamic evaluation techniques such as DyVal to mitigate test data contamination and efficient multi-prompt evaluation with PromptEval. It supports a wide range of language and multi-modal datasets and models, both open-source and proprietary, making it a versatile tool for understanding and benchmarking LLM capabilities.
StableDiffusionReconstruction
StableDiffusionReconstruction is a research-oriented tool designed for reconstructing visual experiences directly from human brain activity. Utilizing Stable Diffusion models, it allows for the generation of high-resolution images based on neural data. The project, stemming from research by Takagi and Nishimoto presented at CVPR 2023, also incorporates advanced decoding techniques. These include methods for decoding text prompts from brain activity, integrating GANs for improved image quality, and incorporating decoded depth information, significantly enhancing reconstruction accuracy. This repository provides the necessary code and instructions for reproducing these methods, making it a valuable resource for researchers in neuroscience and AI.
subgen
Subgen is an open-source tool designed to automatically generate subtitles (.srt or .lrc) for audio and video files using the OpenAI Whisper model. It supports both transcription of non-English languages and translation into English. The tool seamlessly integrates with various media servers, including Plex, Emby, Jellyfin, Tautulli, and Bazarr, allowing for webhook-triggered subtitle generation when new media is added or played. Utilizing stable-ts and faster-whisper, Subgen supports both CPU and Nvidia GPU (CUDA) processing, offering flexibility for different hardware setups. It addresses the common issue of missing or out-of-sync subtitles, providing a local solution for highly accurate subtitle creation.
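For readers unfamiliar with the .srt format, the following sketch shows how timed transcript segments map to subtitle blocks. It is not Subgen's code: the `(start, end, text)` tuples are a hypothetical stand-in for whatever a speech model returns, and the function names are invented for this example.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (the SRT timestamp layout)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def to_srt(segments):
    """Turn (start, end, text) segments into SRT text with numbered blocks."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Joining with a newline leaves one blank line between consecutive blocks.
    return "\n".join(blocks)

segments = [(0.0, 2.5, "Hello there."), (2.5, 5.04, " How are you?")]
srt_text = to_srt(segments)
```

Each block consists of a sequence number, a time range, and the subtitle text, separated from the next block by a blank line.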
ruby-fann
ruby-fann is a Ruby Gem designed to interface with the FANN (Fast Artificial Neural Network) library, allowing Ruby and Rails developers to integrate neural network capabilities into their applications. This open-source library supports the implementation of both fully-connected and sparsely-connected artificial neural networks. It is lauded for its ease of use, versatility, and speed, with most of the heavy lifting performed natively. The gem provides functionalities for training neural networks with custom data, saving and loading trained networks, and implementing custom training procedures via callback methods, making it a robust solution for AI application development in Ruby environments.
SparkNet
SparkNet is an open-source framework designed for building and training distributed neural networks using Apache Spark. It allows users to leverage the power of Spark for scalable AI model development, particularly beneficial for handling large datasets. The framework provides functionalities for quick cluster setup on EC2, training models on the CIFAR and ImageNet datasets, and installing SparkNet on existing Spark clusters. It supports GPU acceleration with CUDA and offers pre-built JavaCPP binaries for various platforms, making it a robust solution for data scientists and machine learning engineers working with distributed computing environments.
Show-1
Show-1 is an advanced open-source text-to-video generation model developed by Show Lab at the National University of Singapore. It uniquely combines pixel and latent diffusion models to create videos from textual descriptions. The tool provides access to various model weights, including a base model, an interpolation model, and super-resolution models, which can be downloaded from HuggingFace. Users can generate videos by running a Python script, with outputs saved in GIF format. Show-1 also offers a Gradio demo for local use and has been accepted to IJCV, highlighting its academic recognition. It is designed for researchers and developers interested in cutting-edge video synthesis.
sdupdates
sdupdates is a mega collection of resources and news specifically curated for Stable Diffusion enthusiasts, with a strong focus on AUTOMATIC1111's webui. This GitHub repository serves as a central hub for staying updated on the latest developments, models, and techniques within the Stable Diffusion ecosystem. It includes links to various resources such as new models like Stable Diffusion v2-1-unCLIP and Kandinsky 2.1, ControlNet updates, and text-to-video advancements. The repository also provides practical instructions for updating the webui on both Windows and Linux, and offers contact information for contributions or questions. It's an invaluable resource for anyone looking to deepen their understanding and practical application of Stable Diffusion.
Static-to-Dynamic-LLMEval
Static-to-Dynamic-LLMEval is the official GitHub repository for a paper detailing recent advances in large language model benchmarks, specifically focusing on data contamination. The project conducts an in-depth analysis of existing static-to-dynamic benchmarking methods designed to reduce data contamination risks. It examines methods that enhance static benchmarks, identifies their limitations, and highlights the critical gap in standardized criteria for evaluating dynamic benchmarks. The repository proposes optimal design principles for dynamic benchmarking and analyzes the limitations of current dynamic benchmarks, offering a comprehensive overview of advancements in data contamination research and guiding future efforts.
system-prompts-and-models-of-ai-tools
system-prompts-and-models-of-ai-tools is a comprehensive open-source GitHub repository that curates system prompts, internal tools, and AI models from a wide array of AI applications. This resource is invaluable for developers, researchers, and AI enthusiasts looking to understand the underlying mechanics and prompt engineering strategies of popular tools like Augment Code, Claude Code, Cursor, Devin AI, NotionAI, Perplexity, and many others. It provides a centralized location to explore how different AI systems are structured and prompted, fostering learning and innovation in the AI development community. The repository also highlights the importance of securing AI systems against prompt injection and extraction risks.
tabm
TabM is the official open-source repository for the paper "TabM: Advancing Tabular Deep Learning With Parameter-Efficient Ensembling" (ICLR 2025). It offers a PyTorch-based Python package for implementing the TabM model, along with layers and tools for constructing custom architectures that efficiently ensemble MLP-like models. The tool is designed to improve performance on challenging tabular benchmarks like TabReD and has been successfully applied in Kaggle competitions. TabM is noted for its efficiency, being faster than prior tabular deep learning methods and capable of handling large datasets of 100M+ objects. It allows for parallel training and weight sharing among MLPs, leading to better runtime, memory efficiency, and task performance.
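Weight sharing among ensemble members can be illustrated with a layer in which all members use one weight matrix and differ only in small per-member scaling vectors (the BatchEnsemble idea). Whether this matches the package's actual layers is an assumption; the sketch uses plain Python lists rather than PyTorch, and every name and number in it is illustrative.

```python
def linear(weights, x):
    """Matrix-vector product for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def ensemble_layer(shared_weights, adapters, x):
    """Run k ensemble members through one shared weight matrix.

    Each adapter is a pair (r, s): r rescales the input, s rescales the output,
    so member i computes s_i * (W @ (r_i * x)).
    """
    outputs = []
    for r, s in adapters:
        scaled_input = [ri * xi for ri, xi in zip(r, x)]
        hidden = linear(shared_weights, scaled_input)
        outputs.append([si * hi for si, hi in zip(s, hidden)])
    return outputs

shared_weights = [[1.0, 2.0], [0.0, -1.0]]   # one matrix shared by all members
adapters = [
    ([1.0, 1.0], [1.0, 1.0]),                # member 0: identity scaling
    ([2.0, 0.5], [1.0, -1.0]),               # member 1: its own scaling vectors
]
x = [3.0, 4.0]

member_outputs = ensemble_layer(shared_weights, adapters, x)
# The ensemble prediction is the mean over members.
mean_output = [sum(column) / len(column) for column in zip(*member_outputs)]
```

For realistic layer widths the per-member vectors are far smaller than the shared matrix, which is where the parameter savings of this style of ensembling come from.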
stanford_dl_ex
stanford_dl_ex is a repository offering programming exercises for the Stanford Unsupervised Feature Learning and Deep Learning Tutorial. It provides starter code designed to help users engage with and practice the concepts taught in the official Stanford tutorial, available at ufldl.stanford.edu/tutorial. This resource is particularly useful for individuals looking to deepen their understanding and practical application of deep learning principles through hands-on coding. The repository includes various modules covering different aspects of deep learning, such as convolutional neural networks (CNN), principal component analysis (PCA), and self-taught learning (STL). It serves as a valuable, free educational tool for students and researchers alike.
TransformerEngine
Transformer Engine (TE) is an open-source library developed by NVIDIA for significantly accelerating Transformer models on NVIDIA GPUs. It achieves this by leveraging 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, including MXFP8 and NVFP4 formats on Blackwell. This results in improved performance and reduced memory utilization during both training and inference processes. TE provides highly optimized building blocks for popular Transformer architectures and an automatic mixed precision-like API that integrates seamlessly with existing framework-specific code. It also offers a framework-agnostic C++ API for broader integration, simplifying mixed-precision training for users by internally managing scaling factors.
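The scaling factors mentioned above exist because FP8 has a very narrow representable range. The sketch below shows the basic per-tensor idea only: pick a scale so the largest magnitude lands at the format's maximum, then divide it back out afterwards. It does not simulate rounding to 8 bits and is not Transformer Engine's implementation; the function names are hypothetical, and the only external fact used is that the largest finite E4M3 value is 448.

```python
E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3

def compute_scale(values, fp8_max=E4M3_MAX):
    """Scale that maps the largest absolute value onto the FP8 maximum."""
    amax = max(abs(v) for v in values)
    return fp8_max / amax if amax > 0 else 1.0

def scale_and_clamp(values, scale, fp8_max=E4M3_MAX):
    """Apply the scale and clamp to the representable range (no rounding simulated)."""
    return [max(-fp8_max, min(fp8_max, v * scale)) for v in values]

def unscale(scaled, scale):
    """Undo the scaling after the low-precision operation."""
    return [v / scale for v in scaled]

tensor = [0.002, -0.0005, 0.0011]   # small values that would underflow unscaled
scale = compute_scale(tensor)
scaled = scale_and_clamp(tensor, scale)
restored = unscale(scaled, scale)
```

A library that manages these factors internally spares users from tracking tensor magnitudes by hand, which is the convenience the description refers to.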
treequest
TreeQuest is an open-source Python library designed for advanced tree search algorithms, particularly useful for scaling Large Language Model (LLM) inference. It offers a flexible API that allows for customizable node generation and scoring logic, making it adaptable to various applications. The library implements the AB-MCTS-A (AB-MCTS with Node Aggregation) and AB-MCTS-M (AB-MCTS with Mixed Models) algorithms, as well as Multi-LLM AB-MCTS support. Key features include checkpointing and resuming searches, an ask-tell interface for batched sampling, and visualization utilities to render search trees. TreeQuest is ideal for developers and researchers working on optimizing LLM performance and exploring complex decision-making processes.
Trending-Deep-Learning
Trending-Deep-Learning is a GitHub repository that provides a curated list of the top 100 trending deep learning projects. This resource is updated regularly and sorts repositories based on the number of stars they gained on a specific day. It leverages the GitHub search API with a comprehensive query including terms like 'deep-learning', 'CNN', 'RNN', 'convolutional neural network', and 'recurrent neural network'. Repositories with 40,000 stars or more are excluded to focus on emerging trends. This tool is ideal for researchers, developers, and students looking to stay updated on the latest advancements and popular projects within the deep learning community, offering a quick overview of what's gaining traction.
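A query of the kind described above can be assembled against GitHub's repository search endpoint. This sketch only builds the URL and makes no network request; the exact query string the project uses is not given here, so the way the terms are combined with `OR`, the `per_page` value, and the function name are assumptions. The search API itself cannot sort by stars gained in a day, so that step would have to happen after fetching.

```python
from urllib.parse import urlencode

# Multi-word terms are quoted so they are searched as phrases.
SEARCH_TERMS = [
    "deep-learning",
    "CNN",
    "RNN",
    '"convolutional neural network"',
    '"recurrent neural network"',
]

def build_search_url(terms, max_stars=40000, per_page=100):
    """Build a GitHub repository search URL that excludes very popular repos."""
    # stars:<N keeps only repositories with fewer than N stars.
    query = " OR ".join(terms) + f" stars:<{max_stars}"
    params = {"q": query, "sort": "stars", "order": "desc", "per_page": per_page}
    return "https://api.github.com/search/repositories?" + urlencode(params)

url = build_search_url(SEARCH_TERMS)
```

The `stars:<40000` qualifier implements the exclusion rule from the description, so established projects do not crowd out emerging ones.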
Urban-Sound-Classification
Urban-Sound-Classification is an open-source deep learning project designed for the classification of urban sounds. It offers a comprehensive set of Jupyter notebooks demonstrating various neural network architectures, including feedforward, convolutional, and recurrent neural networks. The project is built using Python 3.5 (or above) and leverages popular libraries such as TensorFlow 2.x, NumPy, Matplotlib, and Librosa. It primarily uses the UrbanSound8k dataset for model training, with Google's AudioSet suggested as an alternative. This tool is ideal for researchers, students, and developers interested in deep learning applications for audio analysis and sound classification, providing a practical foundation for understanding and implementing these techniques.
Top-Deep-Learning
Top-Deep-Learning is an open-source project that compiles and ranks the top 200 deep learning GitHub repositories. The list is meticulously sorted by the number of stars each repository has received, offering a clear indicator of popularity and community engagement. This resource is invaluable for anyone looking to explore the most influential and actively developed projects within the deep learning domain. It is regularly updated to ensure the information remains current, reflecting the dynamic nature of deep learning research and development. The project's methodology involves querying the GitHub search API using terms like 'deep-learning', 'CNN', 'RNN', 'convolutional neural network', and 'recurrent neural network' to gather comprehensive results.
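Because several search terms can return the same repository, a ranking like the one described needs a merge step before sorting. The following is a minimal sketch of that step, assuming each result is a dictionary with the `full_name` and `stargazers_count` fields that GitHub's API returns for repositories; the function name and the sample data are hypothetical.

```python
def top_repositories(result_pages, limit=200):
    """Merge search result pages, drop duplicates, and rank by star count."""
    by_name = {}
    for page in result_pages:
        for repo in page:
            name = repo["full_name"]
            known = by_name.get(name)
            # Keep one entry per repository, preferring the highest star count seen.
            if known is None or repo["stargazers_count"] > known["stargazers_count"]:
                by_name[name] = repo
    ranked = sorted(by_name.values(), key=lambda r: r["stargazers_count"], reverse=True)
    return ranked[:limit]

# Made-up results from two different search terms.
pages = [
    [{"full_name": "a/x", "stargazers_count": 50},
     {"full_name": "b/y", "stargazers_count": 900}],
    [{"full_name": "b/y", "stargazers_count": 900},
     {"full_name": "c/z", "stargazers_count": 300}],
]
top_two = top_repositories(pages, limit=2)
top_names = [repo["full_name"] for repo in top_two]
```

With `limit=200` the same function would produce a list of the size the project publishes.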