Research & Education
Browsing page 18 of AI tools for Scientific Computing in Research & Education. Sorted by confidence score — our independent quality rating.
Compass Bioinformatics Inc.
Compass Bioinformatics Inc. presents InheriNext, an AI-powered platform designed to streamline next-generation sequencing (NGS) analysis for inherited diseases. InheriNext features a cutting-edge variant ranking algorithm, intuitive design, and validated accuracy, empowering scientists and clinicians in hospitals and research laboratories. The platform supports data from most automated sequencing instruments, offering secure and optimized pipelines for complex variant calling from FASTQ or VCF formats. Users can identify and rank causative variants based on clinical phenotypes, in silico gene panels, or customized filters. InheriNext also provides custom report functionality for inherited diseases, including summaries and support for categorized variants. It automates routine tasks with accuracy and efficiency, ensuring consistency through baked-in ACMG guidelines and offering transparency by revealing all supporting evidence for its conclusions.
SiClarity
SiClarity is enabling a new era of AI-accelerated Electronic Design Automation (EDA) tools, focusing on machine learning and generative design solutions. It provides capabilities for physical design and for design, device, and process co-optimization, including 3D visualization of design and process. The tool features AI-enabled predictive parasitics for layout and device optimization, and supports hybrid-bond and route-aware platforms. SiClarity is fluent in advanced architectures like GAA and CFET, allowing engineers to quickly assess design merit. It supports both SOW engagements running models on SiClarity hardware and deployed licenses for customer hardware. The technology generates 3D models from layouts to refine predictions on capacitance and resistance, achieving results 200x faster than industry tools. It also optimizes transistor-level netlists for high-performance place and route solutions.
Wildflow
Wildflow is an AI-powered platform dedicated to the protection and restoration of Earth's ecosystems, with a primary focus on coral reefs. The tool leverages AI to analyze petabytes of nature data, model complex ecosystem dynamics, and coordinate precise actions for conservation. Users can upload footage from various cameras, including GoPros and drones, to reconstruct photorealistic 3D models of coral reefs. These models offer sub-centimeter resolution for scientifically accurate orthomosaics, bathymetry maps, and permanent records to track changes over time. Wildflow also provides key health metrics like structural complexity, outplant survival, benthic cover, and species identification, enabling data-driven decisions for restoration efforts. It helps optimize coral outplanting, select restoration sites, and track impact for donors and partners.
Vyasa Analytics
Certara.AI is a secure, scalable, and specialized AI platform tailored for the life sciences industry, designed to break down data silos and enhance analytical capabilities. Unlike generalized AI platforms, Certara.AI focuses on biomedical research and development, providing researchers with the tools to make evidence-based decisions across drug discovery, clinical trials, and regulatory submissions. It offers an AI model-agnostic approach, allowing deployment with tailored generative AI models or custom implementations. The platform ensures real-time data access through a flexible data fabric, enabling simultaneous search and analysis from multiple sources. Its adaptable architecture supports scalability across various data environments without disrupting existing infrastructure, making it a robust solution for complex life science data challenges.
wearm.ai
wearm.ai offers an innovative optical-based wearable solution designed for deep muscle motion analysis. This advanced tool leverages big motion data to create a precise human body digital twin, providing detailed insights into movement patterns. It features an LLM AI interface, making complex analysis accessible and user-friendly. The technology is geared towards enhancing understanding of human motion, potentially benefiting fields such as sports science, rehabilitation, and fitness. By digitizing and analyzing muscle movements, wearm.ai aims to offer a comprehensive platform for motion capture and analysis, pushing the boundaries of wearable technology in health and wellness.
ImageProVision
ImageProVision specializes in advanced image processing and analytics, empowering pharmaceutical leaders, scientific researchers, and industrial giants to extract definitive insights from complex visual data. Their CLAIRITY™ Suite offers a range of AI/ML-powered solutions for tasks such as particle size and shape analysis (CLAIRITY™ PARTICLE, MORPHOWIZ), automated microscopy (CLAIRITY™ AUTO), nano-scale analysis (CLAIRITY™ NANO), and microbial colony counting (CLAIRITY™ MICROBE). The platform also supports capsule seam analysis, vial inspection, and cell analysis. ImageProVision's tools are designed to accelerate discovery, meet compliance standards, and enhance quality control across diverse applications.
llama3.np
llama3.np offers a pure NumPy implementation of the Llama 3 model, making it an excellent resource for researchers and developers interested in understanding the underlying architecture of large language models. The project was validated using the stories15M model trained by Andrej Karpathy, ensuring an accurate and reliable implementation. It provides a straightforward way to run the Llama 3 model using Python and NumPy, demonstrating the core mechanics without complex dependencies. This tool is particularly valuable for academic research and educational contexts, allowing for detailed exploration and experimentation with the Llama 3 model's components.
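To give a flavor of what a pure NumPy transformer implementation involves, here is a minimal sketch of two building blocks used by Llama-style models: RMS normalization and single-head causal attention. The function names (`rms_norm`, `softmax`, `causal_attention`) and tensor shapes are illustrative assumptions and are not taken from the llama3.np source; rotary embeddings, grouped-query attention, and the feed-forward layers are omitted.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-5):
    # Divide each vector by its root-mean-square, then apply a learned scale.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def softmax(x, axis=-1):
    # Subtract the row maximum before exponentiating for numerical stability.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

def causal_attention(q, k, v):
    # q, k, v: (seq_len, head_dim) arrays for a single attention head.
    seq_len, head_dim = q.shape
    scores = q @ k.T / np.sqrt(head_dim)
    # Mask future positions so token i only attends to tokens 0..i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v

# Toy input: 4 tokens with 8 features each (hypothetical sizes).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
h = rms_norm(x, np.ones(8))
out = causal_attention(h, h, h)
```

Because of the causal mask, the first output row equals the first value row: the first token can only attend to itself.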
LightCompress
LightCompress is an open-source toolkit designed for compressing large AI models such as Large Language Models (LLMs), Vision-Language Models (VLMs), and video generative models. It offers a comprehensive suite of state-of-the-art compression algorithms, including various quantization methods (integer, floating-point, mixed-precision) and sparsity techniques (structured, unstructured). The tool supports a wide array of popular models like LLaMA, Mistral, and DeepSeek-V2, and ensures compatibility with multiple inference backends such as vLLM, SGLang, and AutoAWQ. LightCompress aims to significantly reduce model size and improve inference efficiency while maintaining high accuracy, making it ideal for deploying large models on resource-constrained hardware.
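As a rough illustration of the integer quantization mentioned above, the sketch below applies symmetric per-tensor INT8 quantization to a weight matrix in NumPy. It is a generic textbook scheme, not LightCompress's API; the names `quantize_int8` and `dequantize` are hypothetical.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: map [-max|w|, +max|w|] onto [-127, 127].
    # The small floor avoids division by zero for an all-zero tensor.
    scale = max(float(np.max(np.abs(w))), 1e-12) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

max_err = float(np.max(np.abs(w - w_hat)))   # bounded by about scale / 2
compression = w.nbytes / q.nbytes            # float32 -> int8 storage ratio
```

Production toolkits typically use finer-grained scales (per channel or per group) and calibration data to keep accuracy high; the per-tensor scale here is the simplest possible variant.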
llm.pdf
llm.pdf is a proof-of-concept project showcasing the ability to run an entire Large Language Model (LLM) within a PDF file. This innovative approach leverages Emscripten to compile llama.cpp into asm.js, enabling the LLM to execute directly within the PDF environment through an old PDF JS injection method. The entire LLM file is embedded into the PDF using base64 encoding, allowing for self-contained LLM inference. While currently a proof-of-concept, it highlights the potential for highly portable and self-sufficient AI applications. Users can generate custom PDFs with compatible GGUF quantized models, with 135M parameter models taking approximately 5 seconds per token for input/output.
Matterport3DSimulator
Matterport3DSimulator is an AI research platform designed for deep reinforcement learning, computer vision, natural language processing, and robotics. It allows AI agents to interact with real 3D environments using visual information derived from panoramic RGB-D images. The simulator is based on the Matterport3D dataset, featuring 90 diverse indoor environments. Key capabilities include outputting real RGB and depth images, customizable image resolution and camera parameters, and support for off-screen rendering. It offers both C++ and Python APIs and is highly efficient, capable of around 1000 fps RGB-D off-screen rendering. The platform also includes the Room-to-Room (R2R) navigation dataset and task for training agents to follow natural language instructions.
Data-Science-and-Machine-Learning-Projects-Dojo
Data-Science-and-Machine-Learning-Projects-Dojo is an open-source GitHub repository offering a comprehensive collection of data science, machine learning, deep learning, and data visualization projects. It serves as a practical dojo for individuals to practice and enhance their skills in these areas, covering theories, probability, and statistics. The projects utilize popular libraries such as NumPy, Pandas, Scikit-learn, TensorFlow, Keras, NLTK, Matplotlib, Seaborn, and Plotly. It also includes examples of turning ML models into web applications using Streamlit and Flask, and explores Apache Spark for large-scale data processing. The repository features diverse projects like breast cancer tumor diagnostics, movie rating analysis, customer churn prediction, heart disease prediction, bulldozer sale price prediction, and dog breed classification.
micro_diffusion
micro_diffusion is an open-source repository from Sony Research that provides a minimalistic implementation for training large-scale diffusion models from scratch with an extremely low budget. Utilizing only 37 million publicly available real and synthetic images, it can train a 1.16 billion parameter sparse transformer for approximately $1,890, achieving a strong FID score on the COCO dataset. The repository includes training code, dataset code, and pre-trained model checkpoints for off-the-shelf generation. It supports progressive training from low to high resolution and incorporates patch masking for performance optimization and reduced training time.
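The patch masking idea can be sketched in a few lines of NumPy: an image is split into patch tokens, and only a random subset is passed on, so the transformer processes fewer tokens per training step. This is a simplified illustration under assumed names (`patchify`, `random_mask`) and an assumed 75% mask ratio; it is not the repository's code.

```python
import numpy as np

def patchify(img, patch):
    # img: (H, W, C) -> (num_patches, patch * patch * C), row-major patch order.
    h, w, c = img.shape
    gh, gw = h // patch, w // patch
    x = img[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * c)

def random_mask(tokens, mask_ratio, rng):
    # Keep a random subset of patch tokens and drop the rest.
    n = tokens.shape[0]
    n_keep = int(round(n * (1 - mask_ratio)))
    keep_idx = np.sort(rng.permutation(n)[:n_keep])
    return tokens[keep_idx], keep_idx

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))                 # toy image (hypothetical size)
tokens = patchify(img, 8)                     # 16 patches of 8 * 8 * 3 values
visible, keep_idx = random_mask(tokens, 0.75, rng)
```

With a 75% mask ratio only 4 of the 16 patch tokens remain, which is where the reduction in compute comes from.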
deepdrive
Deepdrive is an open-source simulator designed to facilitate experimentation and advancement in self-driving AI. It enables anyone with a PC to develop and test state-of-the-art autonomous driving systems within a realistic simulated environment. The simulator supports various AI agent types, including forward-agents, remote agents, and baseline agents like Mnet2 and C++ FSM/PID. Users can record training data for imitation learning, convert data to TFRecords, and train models using provided datasets or their own. Deepdrive offers detailed observation data, including vehicle dynamics, camera feeds (image, depth), and environmental information, all adhering to Unreal Engine conventions for units and rotations. It requires Linux, Python 3.6+, 10GB disk space, and 8GB RAM, with optional GPU requirements for baseline agents.
meshed-memory-transformer
Meshed-Memory Transformer (M²) is an open-source project that provides the reference code for the paper "Meshed-Memory Transformer for Image Captioning" presented at CVPR 2020. This tool is designed for researchers and developers working in computer vision and natural language processing. It allows users to set up a conda environment, download necessary data like COCO annotations and detection features, and then evaluate or train their own image captioning models. The repository includes scripts for both testing and training, with configurable arguments for batch size, number of memory vectors, and learning rate scheduling. It requires Python 3.6 and specific data preparation steps to function correctly.
DriveDreamer
DriveDreamer is a pioneering world model entirely derived from real-world driving scenarios, specifically designed for autonomous driving research. Unlike other models that focus on gaming or simulated environments, DriveDreamer addresses the critical limitation of lacking real-world representation. It leverages powerful diffusion models to construct comprehensive representations of complex driving environments and employs a two-stage training pipeline. This allows DriveDreamer to first acquire an understanding of structured traffic constraints and then anticipate future states. The tool empowers precise, controllable video generation that faithfully captures real-world traffic scenarios and enables the generation of realistic and reasonable driving policies, opening avenues for interaction and practical applications in autonomous driving.
DiffEqFlux.jl
DiffEqFlux.jl is a Julia library designed for scientific machine learning (SciML), specifically focusing on neural differential equations. It integrates differential equation solvers into neural networks, enabling the addition of physical information into traditional machine learning models. The library offers pre-built implicit layer architectures with memory-efficient O(1) backpropagation and GPU acceleration. It supports various types of neural differential equations, including Neural ODEs, Neural SDEs, Neural DAEs, and Neural DDEs, as well as Hamiltonian Neural Networks and Continuous Normalizing Flows. DiffEqFlux.jl is built upon DifferentialEquations.jl and Lux.jl, providing a robust framework for researchers and developers to explore advanced scientific machine learning methods.
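The core idea of a Neural ODE, a hidden state evolved by integrating dh/dt = f(h, t) where f is a small network, can be sketched outside Julia as well. The Python example below is a conceptual illustration only: it uses a fixed-step Euler integrator and a one-layer tanh network, both assumptions made for brevity, and it does not reflect the DiffEqFlux.jl API or its adaptive solvers.

```python
import numpy as np

def odeint_euler(f, h0, t0, t1, steps):
    # Fixed-step explicit Euler integration of dh/dt = f(h, t).
    h = np.array(h0, dtype=float)
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        h = h + dt * f(h, t)
        t += dt
    return h

def make_dynamics(W, b):
    # A one-layer network defining the vector field f(h, t) = tanh(W h + b).
    return lambda h, t: np.tanh(W @ h + b)

# Forward pass of a toy "neural ODE layer" (hypothetical sizes and weights).
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((3, 3))
b = np.zeros(3)
h1 = odeint_euler(make_dynamics(W, b), np.ones(3), 0.0, 1.0, 100)

# Sanity check on a known ODE: dh/dt = -h has the solution h(1) = exp(-1).
decay = odeint_euler(lambda h, t: -h, [1.0], 0.0, 1.0, 1000)
```

Training such a layer additionally requires gradients through the solver, which libraries in this space obtain with adjoint methods or automatic differentiation.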
nucleotide-transformer
nucleotide-transformer is an open-source repository from InstaDeep AI, dedicated to advancing genomics and transcriptomics through cutting-edge deep learning models. It features a collection of transformer-based genomic language models and innovative downstream applications, including the Nucleotide Transformer (NT), Agro Nucleotide Transformer (AgroNT), SegmentNT, and ChatNT. The platform provides powerful, reproducible, and accessible tools for unlocking new insights from biological sequences, offering pre-trained weights, inference code, and research contributions. It supports various tasks such as functional-track prediction, genome annotation, controllable sequence generation, and single-cell transcriptomics, making it a central hub for AI-driven genomic research.
ENACOM Group
ENACOM Group specializes in developing and regionalizing advanced computational methods for various engineering applications. The company's core expertise lies in areas such as multiobjective optimization, machine learning, and data mining. ENACOM is committed to fostering technological development within Brazil, actively supporting research initiatives, and facilitating the transfer of technology to practical applications. While specific features are not detailed on their website, their focus suggests a strong emphasis on scientific and statistical analysis, likely catering to technical professionals in engineering and research fields.
e3nn
e3nn is an open-source, modular framework designed to facilitate the development of neural networks with Euclidean symmetry. It provides fundamental mathematical operations such as tensor products and spherical harmonics, essential for building E(3) equivariant neural networks. The library is under active development, with breaking changes indicated by version number increments. It is recommended to install using pip, and users can contribute to its development or seek help through discussions and bug reports on GitHub. The framework is backed by research papers on Euclidean Neural Networks and e3nn itself, with BibTeX entries available for citation.
finetrainers
finetrainers is a work-in-progress library from Hugging Face designed for scalable and memory-optimized training of diffusion models. It supports several commonly used distributed training strategies, including DDP, FSDP-2, HSDP, and CP. Key features include LoRA and full-rank finetuning, conditional control training, and memory-efficient single-GPU training. The library also supports multiple attention backends like flash, flex, sage, and xformers, along with auto-detection of common dataset formats. It's built to handle combined image/video datasets, multi-resolution bucketing, and offers memory-efficient precomputation. finetrainers is recommended for use with PyTorch 2.5.1 or above for optimal performance and reproducibility.
exllamav3
ExLlamaV3 is an inference library specifically designed for running Large Language Models (LLMs) locally on modern consumer-class GPUs. Its headline feature is the new EXL3 quantization format, which is based on QTIP from Cornell RelaxML, allowing for efficient model conversion in a single step. The library supports flexible tensor-parallel and expert-parallel inference setups, and provides an OpenAI-compatible server via TabbyAPI for local or remote inference. It also includes features like continuous, dynamic batching, HF Transformers plugin support, speculative decoding, and 2-8 bit cache quantization. ExLlamaV3 aims to make advanced quantization techniques more accessible and less resource-intensive, enabling users to run large models like Llama-3.1-70B with minimal VRAM.
R-KV
R-KV is a novel method for redundancy-aware KV cache compression specifically designed for large language models (LLMs) that rely on chain-of-thought (CoT) or self-reflection for reasoning tasks. It addresses the issue of bloated key-value (KV) caches during inference by ranking tokens on-the-fly for both importance and non-redundancy, retaining only the most informative and diverse ones. This approach allows for significant memory savings, up to 90%, and improved throughput (up to 6.6x) during long CoT generation, often with no accuracy loss and in some cases a small accuracy gain. R-KV is a plug-and-play, training-free solution that acts as a lightweight wrapper for any autoregressive LLM, making it easy to integrate into existing inference pipelines or RL roll-outs.
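The ranking idea described above can be sketched as a toy scoring function: each cached token gets an importance score and a redundancy score, and only the top-scoring tokens within a budget are kept. The formulation below is an assumption made for illustration (attention mass as importance, maximum cosine similarity to another key as redundancy, a linear trade-off `lam`); it is not R-KV's actual scoring rule, and `select_kv` is a hypothetical name.

```python
import numpy as np

def select_kv(keys, attn, budget, lam=0.5):
    # keys: (n_tokens, dim) cached key vectors.
    # attn: (n_tokens,) attention mass each cached token received.
    unit = keys / (np.linalg.norm(keys, axis=1, keepdims=True) + 1e-8)
    sim = unit @ unit.T                      # pairwise cosine similarity
    np.fill_diagonal(sim, -np.inf)           # ignore self-similarity
    redundancy = sim.max(axis=1)             # closeness to the nearest other key
    importance = attn / (attn.sum() + 1e-8)
    # Reward important tokens, penalize tokens that duplicate another one.
    score = lam * importance - (1 - lam) * redundancy
    keep = np.argsort(score)[-budget:]
    return np.sort(keep)

# Tokens 0 and 1 share the same key; tokens 2 and 3 are distinct.
keys = np.array([[1.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
attn = np.array([0.1, 0.3, 0.3, 0.3])
kept = select_kv(keys, attn, budget=3)       # → [1, 2, 3]
```

In this toy case the duplicate with the lower attention mass (token 0) is the one evicted, while the distinct tokens survive.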
snake-ga
snake-ga is an AI agent designed to learn how to play the classic Snake game from scratch using Deep Reinforcement Learning. The project leverages Deep Q-Learning, where the system receives state parameters and rewards based on its actions, gradually developing a strategy to maximize its score without explicit game rules. This approach enables the AI to achieve scores up to 50 points with a solid strategy after only five minutes of training. The tool also supports Bayesian Optimization to fine-tune the parameters of the deep neural network and other Deep RL aspects. Implemented in PyTorch, it offers a robust platform for experimenting with AI in game environments.
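The update rule behind Q-learning can be shown with a small table instead of a neural network. The sketch below is a tabular simplification, assuming hypothetical state and action indices and hyperparameters (`lr`, `gamma`); in Deep Q-Learning a network replaces the table and the same target is used as a regression label.

```python
import numpy as np

def q_update(q_table, state, action, reward, next_state, done, lr=0.1, gamma=0.9):
    # Bellman target: immediate reward plus the discounted best future value.
    # Terminal transitions have no future value.
    target = reward if done else reward + gamma * np.max(q_table[next_state])
    # Move the current estimate a small step toward the target.
    q_table[state, action] += lr * (target - q_table[state, action])
    return q_table

q = np.zeros((3, 2))  # 3 states, 2 actions (toy sizes)
# Terminal transition with reward 1: Q[1, 0] moves from 0 to 0.1.
q = q_update(q, state=1, action=0, reward=1.0, next_state=2, done=True)
# Non-terminal transition: the reward seen in state 1 propagates back to state 0.
q = q_update(q, state=0, action=1, reward=0.0, next_state=1, done=False)
```

After these two updates, `q[1, 0]` is 0.1 and `q[0, 1]` is 0.009, showing how value information flows backward from rewarding states.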
sdxs
SDXS provides real-time one-step latent diffusion models with image conditions, enabling rapid image generation. It boasts impressive inference speeds, generating 512x512 images at 100 FPS and 1024x1024 images at 30 FPS on a single GPU, making it 30x faster than SD v1.5 and 60x faster than SDXL for comparable image quality within a one-second generation limit. The tool also supports training ControlNet, expanding its applications to image-conditioned control and efficient image-to-image translation. SDXS utilizes a lightweight image decoder and a block removal distillation strategy for model acceleration, alongside a feature matching loss for efficient one-step model finetuning.