Shypd.ai
Research & Education

Browsing page 91 of AI tools for Academic Research in Research & Education, sorted by confidence score (our independent quality rating).

Pose-Transfer

59%

Pose-Transfer is an open-source project providing the code for person image generation, implementing the Progressive Pose Attention method detailed in a CVPR 2019 paper. This tool allows users to transfer poses from one image to another, and also supports generating videos from a single input image. It offers functionalities for data preparation, including dataset splitting and keypoint annotation for datasets like Market1501 and DeepFashion. Users can train and test models, and evaluate performance using metrics such as SSIM, IS, DS, and PCKh. The project is built on PyTorch and provides pre-trained models for convenience.

GraphGPT

59%

GraphGPT is a research framework presented in a SIGIR'24 full paper, focusing on Graph Instruction Tuning for Large Language Models. It enhances LLMs' understanding of graph structural information by aligning graph encoding with natural language space through a text-graph grounding paradigm. The framework employs a dual-stage graph instruction tuning process to adapt language models for graph learning tasks and incorporates Chain-of-Thought (CoT) Distillation to improve reasoning and accuracy, especially with diverse graph data. The repository offers code, data, and model weights, including efficient training scripts for two Nvidia 3090 GPUs, making it a valuable resource for researchers in the field.

rq-vae-transformer

59%

rq-vae-transformer is the official open-source implementation of "Autoregressive Image Generation using Residual Quantization" (CVPR 2022). This framework, consisting of RQ-VAE and RQ-Transformer, is designed for autoregressive modeling of high-resolution images. It precisely approximates feature maps and represents images as stacks of discrete codes, facilitating the generation of high-quality images. The tool supports image generation using both class and text conditions, with pretrained checkpoints available for various datasets including FFHQ, LSUN, ImageNet, and CC-3M. It also includes a large-scale RQ-Transformer for text-to-image generation, trained on millions of text-image pairs. The repository provides code for training and evaluation pipelines, as well as Jupyter notebooks for easy text-to-image generation.
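The idea of representing a feature vector as a stack of discrete codes can be illustrated with a minimal residual quantization sketch. This is not the repository's code: the function names, the toy codebook, and the use of one shared codebook across all depths are assumptions made for illustration only.

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def residual_quantize(vector, codebook, depth):
    """Quantize `vector` into `depth` codes.

    Each step picks the code nearest to the current residual, adds it to the
    running approximation, and subtracts it from the residual.
    """
    residual = list(vector)
    approx = [0.0] * len(vector)
    codes = []
    for _ in range(depth):
        idx = nearest_code(residual, codebook)
        codes.append(idx)
        approx = [a + c for a, c in zip(approx, codebook[idx])]
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes, approx


# Hypothetical toy codebook (assumption: shared by every depth level).
TOY_CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.25, -0.25]]

codes, approx = residual_quantize([1.3, -0.2], TOY_CODEBOOK, depth=3)
print(codes, approx)
```

Each additional depth level refines the approximation, which is why a stack of a few codes per position can approximate a feature map more closely than a single code.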

StyleSwin

59%

StyleSwin is the official implementation of a transformer-based Generative Adversarial Network (GAN) designed for high-resolution image generation, as presented at CVPR 2022. It leverages a Swin transformer within a style-based architecture, incorporating local and shifted window attention for computational efficiency and modeling capacity. A key innovation is the double attention mechanism, which combines local and shifted window contexts to enhance generation quality. StyleSwin also addresses the challenge of spatial coherency in high-resolution synthesis by employing a wavelet discriminator to suppress blocking artifacts. The tool demonstrates superior performance over prior transformer-based GANs, particularly at resolutions like 1024x1024, achieving competitive results with StyleGAN on datasets such as CelebA-HQ and FFHQ.

tennis_analysis

59%

tennis_analysis is an open-source project that analyzes player and ball movement in tennis video footage. It uses YOLOv8 for player detection, a fine-tuned YOLO model for tennis ball detection, and Convolutional Neural Networks (CNNs) to extract court keypoints. From these detections it measures player speed, ball shot speed, and the total number of shots, offering insights for performance analysis. The project is suited to people who want to build machine learning and computer vision skills through a practical, hands-on application.

whisper-timestamped

59%

whisper-timestamped is an open-source extension of OpenAI's Whisper model, offering multilingual automatic speech recognition with enhanced word-level timestamps and confidence scores. Unlike the original Whisper, it provides more accurate start/end estimations for words and assigns confidence scores to each word and segment. The tool utilizes Dynamic Time Warping (DTW) applied to cross-attention weights for precise alignment, and it's designed to be memory-efficient, capable of processing long audio files. It also integrates Voice Activity Detection (VAD) to prevent hallucinations from silent audio and supports fine-tuned Whisper models from Hugging Face. This makes it ideal for developers and researchers requiring highly accurate and detailed audio transcription.
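As a rough illustration of how Dynamic Time Warping can turn attention weights into word timings, the sketch below aligns tokens to audio frames through a toy attention matrix. It is not the library's implementation: the function names, the attention values, and the choice of `1 - weight` as the alignment cost are all assumptions.

```python
def dtw_path(cost):
    """Return the minimum-cost monotonic path through a (tokens x frames) cost matrix."""
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    # acc[i][j] holds the best accumulated cost of aligning the first i tokens to the first j frames.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])
            acc[i][j] = cost[i - 1][j - 1] + best_prev
    # Backtrack from the last cell to the first, always moving to the cheapest predecessor.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        )
    path.reverse()
    return path


def token_frame_spans(path):
    """Collapse an alignment path into {token_index: (first_frame, last_frame)}."""
    spans = {}
    for token, frame in path:
        first, last = spans.get(token, (frame, frame))
        spans[token] = (min(first, frame), max(last, frame))
    return spans


# Hypothetical cross-attention weights: 3 tokens (rows) over 5 audio frames (columns).
attention = [
    [0.9, 0.8, 0.1, 0.0, 0.0],
    [0.1, 0.1, 0.9, 0.1, 0.0],
    [0.0, 0.1, 0.0, 0.9, 0.8],
]
cost = [[1.0 - w for w in row] for row in attention]
spans = token_frame_spans(dtw_path(cost))
print(spans)
```

Multiplying the frame indices by the frame duration would give start and end times per token; that conversion depends on the model's frame rate and is left out here.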

Archive Intel

59%

Archive Intel is an AI-powered platform designed for financial firms to ensure compliance with SEC and FINRA regulations. It offers two core solutions: AI Communications Archiving and AI Marketing Review. The communications archiving solution automatically captures and archives all digital client communications, including text (iMessage, Android SMS, WhatsApp), email, chat (Slack, Teams, Zoom, Bloomberg), social media (LinkedIn, YouTube, X/Twitter, Meta), and web content. This system reduces manual compliance workload by up to 95% and cuts false positives by 99%. A key differentiator is its ability to archive text messages from personal phones without requiring additional apps or devices, supporting BYOD policies. The AI Marketing Review solution simplifies content compliance by scanning documents for high-risk terms and providing compliant suggestions, streamlining approval workflows and ensuring audit readiness. Archive Intel offers instant reporting, full audit trails, and customizable pricing based on users and connectors, with no hidden export fees.

ZeroCostDL4Mic

59%

ZeroCostDL4Mic is a free and open-source toolbox designed to democratize deep learning in microscopy. It consists of a collection of self-explanatory Jupyter Notebooks, hosted on Google Colab, which provides the necessary computational resources at no cost. The tool features an easy-to-use graphical user interface, making it accessible for researchers with little or no coding expertise. Its primary goal is to allow users to quickly test, train, and use popular deep learning networks for processing microscopy data. This project originated from a collaboration between the Jacquemet and Henriques laboratories and has expanded with global contributions, as acknowledged in their Nature Communications paper.

multiagent-competition

59%

multiagent-competition offers the foundational code for environments detailed in the paper "Emergent Complexity via Multi-agent Competition." This tool is designed for researchers and academics focusing on multi-agent reinforcement learning, providing a platform to simulate and study emergent behaviors in competitive scenarios. It includes agent policies for various environments such as run-to-goal, you-shall-not-pass, sumo, and kick-and-defend tasks. The repository, though archived and read-only, serves as a valuable resource for understanding and replicating the experiments described in the associated paper, allowing for in-depth analysis of complex interactions between AI agents.

Lexroom

59%

Lexroom is an advanced AI platform specifically designed for legal professionals, including lawyers, law firms, and in-house legal teams. It transforms legal research, analysis, and document drafting into efficient processes by leveraging AI to provide verified, citable, and transparent answers. Key features include natural language search, specialized modules for various legal areas (e.g., Banking, Labor, Civil), and a private library for secure document management. Lexroom also offers custom clause drafting and immediate access to original source documents. The platform is built to eliminate AI hallucinations by working exclusively with verified and updated legal sources, ensuring accuracy and reliability for critical legal tasks.

Legora

59%

Legora is a collaborative AI platform designed to empower lawyers by streamlining routine tasks and enhancing legal work. It enables faster review of vast amounts of material, analyzing tens of thousands of documents simultaneously and suggesting well-crafted markup based on user preferences. The tool also facilitates smarter drafting by drawing on precedent to rewrite and refine content in Word, identifying substance and suggesting ready-to-use language. Furthermore, Legora deepens research capabilities by providing access to up-to-date information, legal databases, and DMS content through integrations with iManage and SharePoint. This allows lawyers to focus on strategic advising and complex problem-solving rather than administrative burdens.

Ordered-Neurons

59%

Ordered-Neurons is an open-source project offering the code used for word-level language model and unsupervised parsing experiments, as detailed in the paper "Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks." This repository, originally forked from the LSTM and QRNN Language Model Toolkit for PyTorch, requires Python 3.6, NLTK, and PyTorch 0.4. It enables researchers to train language models and perform unsupervised parsing using the Penn Treebank data, with specific scripts provided for each task. The default settings achieve competitive perplexity on the PTB test set and unlabeled F1 on the WSJ test set, making it a valuable resource for academic research in natural language processing.

rl4co

59%

rl4co is a comprehensive PyTorch library dedicated to Reinforcement Learning (RL) for Combinatorial Optimization (CO). It offers a unified and flexible framework for developing and benchmarking RL-based CO algorithms, aiming to decouple scientific research from engineering complexities. Built upon TorchRL, TensorDict, PyTorch Lightning, and Hydra, rl4co provides efficient implementations of various policies including constructive (autoregressive and non-autoregressive) and improvement methods. The library also features modular components like environment embeddings, allowing for easy adaptation to new problems. It supports installation via pip and offers clear examples for training models with default or custom configurations, making it accessible for researchers and developers in the field.

Sentiment-Analysis-in-Event-Driven-Stock-Price-Movement-Prediction

59%

Sentiment-Analysis-in-Event-Driven-Stock-Price-Movement-Prediction is an open-source project designed to predict stock price movements using natural language processing (NLP) on news headlines. Specifically, it leverages Reuters news data to build a connection between Bayesian Deep Neural Networks (DNN) and stock price prediction. The methodology involves collecting and preprocessing data, including crawling ticker lists, news from Reuters, and stock prices. It then performs feature engineering through tokenization, unifying word formats, and implementing one-hot encoding. The tool trains Bayesian Convolutional Neural Networks using Stochastic Gradient Langevin Dynamics for robust predictions, which can then be used to forecast stock reactions to news events. It provides scripts for data collection, tokenization, model training, and prediction, making it a comprehensive solution for event-driven stock analysis.

sequence_tagging

59%

sequence_tagging is an open-source project hosted on GitHub, providing a robust implementation of Named Entity Recognition (NER) using TensorFlow. This tool utilizes a combination of Long Short-Term Memory (LSTM) networks, Conditional Random Fields (CRF), and character embeddings to achieve state-of-the-art performance in sequence tagging tasks. It is particularly well-suited for researchers and NLP engineers focused on information extraction. The repository includes detailed instructions for setting up the environment, building training data, and evaluating the model, making it accessible for those looking to implement or experiment with advanced NER models. The project also provides guidance on data formatting, aligning with the CoNLL2003 dataset structure, and offers configuration options for integrating pre-trained word vectors like GloVe.

SRCNN-pytorch

59%

SRCNN-pytorch offers a PyTorch implementation of the 'Image Super-Resolution Using Deep Convolutional Networks' model (ECCV 2014). This tool is designed to enhance the resolution of images, providing a practical solution for super-resolution tasks. Key differences from the original implementation include the addition of zero-padding, the use of the Adam optimizer instead of SGD, and the removal of specific weight initialization. Users can train the model with custom datasets or utilize provided pre-trained weights for various scales. It supports datasets like 91-image and Set5, allowing for training and evaluation of image upscaling capabilities.

SRCNN-Tensorflow

59%

SRCNN-Tensorflow is an open-source implementation of Super-Resolution Convolutional Neural Networks (SRCNN) using TensorFlow. This tool is designed to enhance the resolution of images by applying deep learning techniques, specifically convolutional neural networks. It provides a practical way to reproduce the results described in the original research paper, offering a robust solution for image upscaling. The implementation requires TensorFlow, SciPy (version > 0.18), h5py, and matplotlib. Users can train the model with their own datasets or use the provided pre-trained model for testing. The project details the training process and provides example results, demonstrating its capability to produce super-resolved images comparable to reference papers.

suiron

59%

Suiron is an open-source project dedicated to applying machine learning principles to RC cars, offering a platform for developing and testing autonomous navigation and control systems. The project provides a comprehensive set of tools and scripts for collecting data, training neural networks, and visualizing predictions. It supports Python 2.7 and integrates with libraries like TensorFlow for model training. Users can collect data from their RC cars, train models based on this data, and then visualize how the trained models predict car behavior. This makes Suiron an excellent resource for robotics enthusiasts, machine learning students, and researchers interested in practical applications of AI in autonomous systems.

Stock-Price-Prediction-LSTM

59%

Stock-Price-Prediction-LSTM is an open-source project designed for predicting the OHLC average stock price of Apple Inc. utilizing a Long Short-Term Memory (LSTM) recurrent neural network. The tool processes historical stock data, specifically Open, High, Low, and Closing Prices from Yahoo Finance, dating from January 2011 to August 2017. It employs data pre-processing to convert the OHLC average into two-column time series data, with all values normalized between 0 and 1. The model, built using Keras, consists of two sequential LSTM layers and one dense layer, trained with 75% of the data using the Adagrad optimizer. It provides predictions for future stock values with a focus on quantitative trading decisions.
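The preprocessing steps described above (OHLC average, scaling to the 0 to 1 range, two-column series, 75% training split) can be sketched in plain Python. This is a hedged illustration rather than the project's code: the function names and the sample prices are invented, and reading "two-column" as (value at t, value at t+1) pairs is an assumption.

```python
def ohlc_average(rows):
    """Average the open, high, low, and close price of each trading day."""
    return [(o + h + l + c) / 4.0 for o, h, l, c in rows]


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def to_two_column(series):
    """Pair each value with its successor: (value at t, value at t+1)."""
    return [(series[t], series[t + 1]) for t in range(len(series) - 1)]


def split_train_test(pairs, train_fraction=0.75):
    """Keep chronological order: the first 75% trains, the rest tests."""
    cut = int(len(pairs) * train_fraction)
    return pairs[:cut], pairs[cut:]


# Invented sample rows in (open, high, low, close) order; not real market data.
sample_rows = [
    (10.0, 12.0, 9.0, 11.0),
    (11.0, 13.0, 10.0, 12.0),
    (12.0, 14.0, 11.0, 13.0),
    (13.0, 15.0, 12.0, 14.0),
    (14.0, 16.0, 13.0, 15.0),
]
scaled = min_max_scale(ohlc_average(sample_rows))
train, test = split_train_test(to_two_column(scaled))
print(scaled, train, test)
```

Splitting chronologically rather than at random keeps future prices out of the training set, which matters for any time-series forecast.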

text-summarization-tensorflow

59%

text-summarization-tensorflow is an open-source project providing a TensorFlow implementation of text summarization. It utilizes a seq2seq library with an encoder-decoder model, incorporating an attention mechanism for improved performance. The tool initializes word embeddings using GloVe pre-trained vectors and employs LSTM cells for both encoding and decoding processes. It supports training with custom datasets and offers options for configuring hyperparameters such as network size, depth, beam width, and learning rate. Users can also test the model with pre-trained weights and evaluate performance using ROUGE metrics. This tool is ideal for researchers and students looking to understand and experiment with text summarization techniques.

tensorforce

59%

Tensorforce is an open-source deep reinforcement learning framework built on TensorFlow, designed for both research and practical applications. It stands out for its modular, component-based design, allowing for highly configurable feature implementations. A key differentiator is the separation of the RL algorithm from the application, making algorithms agnostic to input and output structures. The entire reinforcement learning logic, including control flow, is implemented in TensorFlow, enabling portable computation graphs. It supports a wide range of features including various network layers, memory types, policy distributions, reward estimation, training objectives, and optimization algorithms. Tensorforce also offers extensive exploration techniques, preprocessing options, and regularization methods, making it a versatile tool for developing and training reinforcement learning agents.

trfl

59%

TRFL (pronounced "truffle") is an open-source library developed by Google DeepMind, designed to simplify the implementation of Reinforcement Learning (RL) agents using TensorFlow. It offers a collection of essential building blocks and loss functions, such as Q-learning, that are crucial for developing and experimenting with various RL algorithms. The library integrates seamlessly with existing TensorFlow environments, allowing developers to leverage its powerful computational graph capabilities. TRFL does not list TensorFlow as a direct requirement, giving users flexibility to install specific CPU or GPU versions, along with TensorFlow Probability, separately. This modular approach makes it a valuable resource for researchers and practitioners in the field of AI and machine learning.
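To make the Q-learning building block concrete, here is a minimal plain-Python sketch of a one-step Q-learning loss for a single transition. It does not call TRFL and does not reproduce its API: the function name, the argument names, and the 0.5 scaling of the squared error are assumptions chosen for illustration.

```python
def q_learning_loss(q_prev, action, reward, discount, q_next):
    """Return (loss, td_error) for one transition.

    q_prev:   action values in the state where `action` was taken
    q_next:   action values in the following state
    discount: discount factor (use 0.0 when the episode has ended)
    """
    # Bootstrap target: immediate reward plus discounted best next-state value.
    target = reward + discount * max(q_next)
    td_error = target - q_prev[action]
    # Assumption: half the squared TD error, a common convention.
    return 0.5 * td_error ** 2, td_error


loss, td_error = q_learning_loss(
    q_prev=[1.0, 2.0], action=1, reward=1.0, discount=0.5, q_next=[2.0, 4.0]
)
print(loss, td_error)
```

In a TensorFlow setting the same quantities would be computed on batched tensors, with the target treated as a constant so that gradients flow only through the value of the action actually taken.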

UniAnimate

59%

UniAnimate is an open-source framework designed to enable efficient and long-term human video generation using unified video diffusion models. It addresses limitations in existing techniques by mapping reference images, posture guidance, and noise video into a common feature space, reducing optimization burden and ensuring temporal coherence. The tool supports a unified noise input for random or first-frame conditioned input, enhancing long-term video generation capabilities. UniAnimate also explores an alternative temporal modeling architecture based on state-space models to replace computation-consuming temporal Transformers, allowing for the generation of highly consistent videos up to one minute in length by iteratively employing a first-frame conditioning strategy. It provides code and models for human image animation, including features for pose alignment and generating video clips at various resolutions.

vjepa2

59%

vjepa2 is an open-source project from Facebook AI Research (FAIR) providing PyTorch code and models for V-JEPA 2 and V-JEPA 2.1, self-supervised learning approaches for video. These models are pre-trained on internet-scale video data to achieve state-of-the-art performance in motion understanding and human action anticipation tasks. V-JEPA 2.1 further refines the training recipe to learn high-quality and temporally consistent dense features, leveraging dense predictive loss, deep self-supervision, and multi-modal tokenizers. The project also includes V-JEPA 2-AC, a latent action-conditioned world model for robot manipulation tasks, demonstrating capabilities like reaching, grasping, and pick-and-place without extensive environment-specific data. It offers pretrained checkpoints and easy integration via PyTorch Hub and HuggingFace.