Welcome to DataX Lab

DataX Lab, located in 3237 at SEB@UNLV, was established in 2015 by Dr. Mingon Kang. We aim to develop efficient algorithms that solve challenging computational problems in Bioinformatics and Big Data Analytics, and we devote ourselves to providing computational analysis tools for research use. More specifically, we are working on:

  • Interpretable and Integrative Deep Learning in Bioinformatics
  • Next Generation Sequencing data analysis
  • Multi-omics data analysis
  • Clinical and Translational Research
  • Intelligent Systems in Computer Vision

Research Projects


High-Dimensional LASSO (Hi-LASSO) improves on the LASSO model, providing better performance in both prediction and feature selection on extremely high-dimensional data. (Kim et al., IEEE Access, 2019)


Deep Learning for Histopathology (Deep-Hipo) accurately analyzes histopathological images by simultaneously learning multi-scale morphological patterns from patches at various magnification levels in a whole-slide image (WSI). (Kosaraju et al., 2020)


PAGE-Net integrates histopathological images and genomic data for survival prediction. PAGE-Net aims not only to improve survival prediction, but also to identify genetic and histopathological patterns that cause different survival rates in patients. (Hao and Kosaraju et al., PSB 2020)


The pathway-based neural network (Cox-PASNet) integrates high-dimensional gene expression data and clinical data in a sparse neural network architecture for survival analysis. (Hao et al., BMC Bioinformatics, 2019)


Document Layout Classification Using Texture-based CNN (DoT-Net) effectively classifies multiple classes of document blocks simultaneously. (Kosaraju et al., ICDAR, 2019)

RC4 ciphertext analysis

Detecting encrypted malware packets without decryption is extremely challenging. We develop efficient and effective schemes that detect malware packets encrypted with a Key-Scheduling Algorithm-based cipher (e.g., RC4) without decryption.
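For context on the cipher named above, here is a minimal sketch of textbook RC4: the key-scheduling algorithm (KSA) followed by keystream generation. It illustrates only the standard cipher, not the detection schemes developed in this project. The function names are illustrative choices.

```python
def rc4_ksa(key: bytes) -> list[int]:
    """Key-scheduling algorithm: permute a 256-entry state using the key."""
    if not key:
        raise ValueError("key must be non-empty")
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    return state


def rc4_keystream(key: bytes, length: int) -> bytes:
    """Generate `length` keystream bytes from the key-scheduled state."""
    state = rc4_ksa(key)
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(state[(state[i] + state[j]) % 256])
    return bytes(out)


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    stream = rc4_keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))
```

Because encryption is a plain XOR with the keystream, applying `rc4_crypt` twice with the same key returns the original bytes.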


Gene to Pathway (G2P) quantifies biological pathways directly from gene expression, rather than ranking biological pathways by comparing treatment and control groups. G2P can be used for ranking biological pathways, pathway-based network analysis, and pathway-based prediction of clinical outcomes, such as survival analysis.

Predicting EC number

Identifying an enzyme's function plays a critical role in several applications, such as enzyme-deficiency disease diagnosis and energy generation from biomass. Four-digit Enzyme Commission (EC) numbers indicate an enzyme's functionality and metabolism based on the chemical reaction it catalyzes. Predicting an enzyme's function from genome sequence with automated computational models has become prevalent because biological experiments are costly and time-consuming. In this project, we aim to develop an effective and efficient machine learning model that predicts four-digit EC numbers to determine the functions of an enzyme.
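To make the four-digit structure concrete, here is a minimal sketch that parses an EC number into its four hierarchical fields and lists the cumulative labels a level-by-level classifier might target. Treating prediction as level-by-level is an assumption for illustration, not a description of the model developed in this project. The names `ECNumber`, `parse_ec`, and `hierarchy_labels` are hypothetical.

```python
from typing import NamedTuple

# Top-level enzyme classes defined by the EC nomenclature.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


class ECNumber(NamedTuple):
    enzyme_class: int   # first field: main class
    subclass: int       # second field
    sub_subclass: int   # third field
    serial: int         # fourth field: serial number within the sub-subclass


def parse_ec(text: str) -> ECNumber:
    """Parse strings such as 'EC 3.4.21.4' or '3.4.21.4' (complete numbers only)."""
    cleaned = text.strip()
    if cleaned.upper().startswith("EC"):
        cleaned = cleaned[2:].strip()
    parts = cleaned.split(".")
    if len(parts) != 4 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a complete four-field EC number: {text!r}")
    return ECNumber(*(int(p) for p in parts))


def hierarchy_labels(ec: ECNumber) -> list[str]:
    """Cumulative labels from the main class down to the full EC number."""
    fields = [str(value) for value in ec]
    return [".".join(fields[:depth]) for depth in range(1, 5)]
```

For instance, `parse_ec("EC 3.4.21.4")` (trypsin, chosen here purely as an illustration) yields a main class of 3, which `EC_CLASSES` maps to hydrolases.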

Media Sync

Media Sync is a program that allows for frame-perfect syncing of video and image media formats across multiple devices. The system consists of a single server producing audio and keeping track of frame data, with any number of clients connecting to the server and displaying the current frame. Media Sync also allows for a multi-screen setup where different sections of a video frame may be displayed on separate screens.
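The two ideas in this description, a shared current frame and per-screen sections of a frame, can be sketched as follows. This is a minimal sketch under stated assumptions, not the Media Sync implementation: it assumes the server derives the frame index from elapsed playback time at a fixed frame rate, and that screens are arranged in a regular grid. All names (`current_frame`, `screen_region`, `Region`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Pixel rectangle of a video frame assigned to one screen."""
    left: int
    top: int
    width: int
    height: int


def current_frame(start_time: float, now: float, fps: float, frame_count: int) -> int:
    """Frame index every client should display at time `now` (in seconds)."""
    if fps <= 0 or frame_count <= 0:
        raise ValueError("fps and frame_count must be positive")
    elapsed = max(0.0, now - start_time)           # before start: show frame 0
    return min(int(elapsed * fps), frame_count - 1)  # after end: hold last frame


def screen_region(frame_width: int, frame_height: int,
                  rows: int, cols: int, row: int, col: int) -> Region:
    """Section of the frame for the screen at (row, col) in a rows x cols grid."""
    if not (0 <= row < rows and 0 <= col < cols):
        raise ValueError("screen position is outside the grid")
    # Integer boundaries ensure adjacent regions tile the frame with no gaps.
    left = col * frame_width // cols
    right = (col + 1) * frame_width // cols
    top = row * frame_height // rows
    bottom = (row + 1) * frame_height // rows
    return Region(left, top, right - left, bottom - top)
```

Deriving the frame index from a single server-side start time means a client that joins late or briefly stalls can recompute the correct frame without needing any history.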

This page was created by Hyeryeong Seo in April 2020. ©DataX Lab since 2015.