.:. Selected Research Projects & Publications .:.

.:. iNNfovis: Neural-Network-Augmented Information Visualization

Large amounts of high-dimensional biomedical data create a need not only for analysis of the data and interpretation of the results, but also for tools and methods that can handle such data. Many techniques commonly used to analyze such data are graphical in nature and have low intrinsic dimensionality, that is, a limited ability to unambiguously represent the dimensions of the data. They can represent only a small number of variables at a time, requiring users to discard potentially useful data. The algorithms we have developed are based on Kohonen’s self-organizing map algorithm, where the (dis)placement of records in a neural-network-augmented plot is directed by all (or a subset of) dimensional values. Users therefore no longer have to choose which dimensions to use when visualizing the data in a classic low-dimensional visualization. Moreover, these techniques reveal non-trivial patterns and correlations in data and organize large data sets of multi-dimensional records into meaningful groups. Records exhibiting similar patterns across multiple dimensions map in proximity to each other, while preserving their original dimensional values and positions along the primary mapping coordinates in classic, well-understood visualizations. Neural-network-augmented visualization techniques enhance knowledge extraction, targeting complex data with very little, if any, loss of information. Our user studies, in which users were tasked with finding relationships in different data sets, showed increased insight with minimal or no user training. Most users also indicated that they could clearly see patterns that were not intuitively obvious using the original low-dimensional visualization techniques.
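
To make the underlying idea concrete, the following is a minimal sketch of a generic Kohonen self-organizing map, not the iNNfovis implementation itself. The grid size, learning-rate schedule, Gaussian neighborhood and the function names train_som and best_matching_unit are all assumptions chosen for illustration. It shows how every dimension of a record contributes to where the record lands on the map.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, record):
    """Return the grid coordinate whose weight vector is closest to the record."""
    return min(weights, key=lambda node: sq_dist(weights[node], record))


def train_som(records, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(records[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)    # neighborhood shrinks over time
        for rec in records:
            bmu = best_matching_unit(weights, rec)
            for node, w in weights.items():
                # Nodes near the winner on the grid are pulled harder toward the record.
                influence = math.exp(-sq_dist(node, bmu) / (2.0 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * influence * (rec[i] - w[i])
    return weights


if __name__ == "__main__":
    data = [[0, 0, 0], [0.05, 0, 0.05], [1, 1, 1], [0.95, 1, 0.95]]
    som = train_som(data, rows=3, cols=3)
    for rec in data:
        print(rec, "->", best_matching_unit(som, rec))
```

In an augmented plot, the grid coordinate returned by best_matching_unit would be used only to displace a record locally, while its primary position still comes from the classic visualization.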

.:. Application of machine learning and visualization of heterogeneous data sets to uncover relationships between translation and developmental stage expression of C. elegans mRNAs

The relationships between genes in neighboring clusters in a self-organizing map (SOM) and the properties attributed to them are sometimes difficult to discern, especially when heterogeneous data sets are used. We have developed a novel approach to identify correlations between heterogeneous data sets. One data set, derived from microarray analysis of polysomal distribution, contained changes in the translational efficiency of Caenorhabditis elegans mRNAs resulting from loss of a specific eIF4E isoform. The second data set contained expression patterns of mRNAs across all developmental stages. Two techniques were applied to these data sets, a classical scatter plot and an SOM, and their outputs were linked using a two-dimensional color scale. This approach revealed that an mRNA’s eIF4E-dependent translational efficiency is strongly dependent on its expression during development. This correlation was not detectable with a traditional, widely utilized one-dimensional color scale.
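
A two-dimensional color scale can be sketched as bilinear interpolation between four corner colors. This is an illustrative assumption, not the scale used in the study: the corner colors and the function name color_2d are hypothetical. A record's normalized position in the scatter plot gives (u, v), and the resulting color is then reused for the same record in the SOM view.

```python
def color_2d(u, v, corners=((0, 0, 0), (255, 0, 0), (0, 0, 255), (255, 0, 255))):
    """Map a point (u, v) in the unit square to an RGB color.

    corners = (c00, c10, c01, c11), the colors at (u, v) = (0, 0), (1, 0),
    (0, 1) and (1, 1). Inputs outside [0, 1] are clamped.
    """
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    c00, c10, c01, c11 = corners
    rgb = []
    for ch in range(3):
        low = c00[ch] * (1 - u) + c10[ch] * u    # blend along u at v = 0
        high = c01[ch] * (1 - u) + c11[ch] * u   # blend along u at v = 1
        rgb.append(round(low * (1 - v) + high * v))
    return tuple(rgb)


if __name__ == "__main__":
    print(color_2d(0.2, 0.8))
```

With the default corners, the red channel encodes the first axis and the blue channel the second, so two records share a color only when they are close along both axes.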

.:. Identification, Tracking and Visualization of Platelets in Intravital Microscopy

Intravital microscopy permits observation of live events in intact tissues and is used to study a variety of questions, including quantification of cell-vessel wall interactions. Analysis of the resulting parameters is labor-intensive, subjective and limited to broad categories of blood cell-vessel wall interactions. We have developed an algorithmic approach that aids the analysis, automatically and objectively detects and tracks platelets, and expands the information derived from such videos. We integrate computer vision and break the identification, tracking and visualization into well-defined steps. In addition, we provide a computationally efficient means of eliminating movement within a video, based on positional shifting of an identified feature.
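
The movement-elimination step can be illustrated with a small sketch. It assumes that a stable feature has already been located in every frame (the feature detection itself is not shown) and that frames are plain 2D lists of pixel values; the function names shift and stabilize are hypothetical. Each frame is translated so that the feature returns to its position in the first frame.

```python
def shift(frame, dr, dc, fill=0):
    """Translate a 2D frame by (dr, dc) pixels; uncovered pixels get `fill`."""
    h, w = len(frame), len(frame[0])
    result = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                result[nr][nc] = frame[r][c]
    return result


def stabilize(frames, feature_positions):
    """Shift every frame so the tracked feature stays where it was in frame 0.

    feature_positions[i] is the (row, col) of the feature in frames[i].
    """
    ref_r, ref_c = feature_positions[0]
    return [shift(frame, ref_r - r, ref_c - c)
            for frame, (r, c) in zip(frames, feature_positions)]


if __name__ == "__main__":
    f0 = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
    f1 = [[0, 5, 0], [0, 0, 9], [0, 0, 0]]   # whole scene drifted one pixel right
    for row in stabilize([f0, f1], [(1, 1), (1, 2)])[1]:
        print(row)
```

Because only an integer translation per frame is applied, the cost is linear in the number of pixels; rotation and scaling are deliberately not handled in this sketch.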

.:. Visual and analytical tools for biomedical informatics

We have been applying visualization, data mining and statistical techniques to large multivariate or high-dimensional data sets in the life sciences, from gene expression arrays and protein structure models to miRNAs and motifs. Chemists and biologists are visually oriented and find it easier to analyze and memorize graphical presentations of structures and compounds. They benefit greatly from visualization tools that help them not only present, but also analyze and process results using appropriate computational methods coupled with effective and efficient visualizations. Under the NCRR/IDEA program, we investigated clustering and existing cluster analysis methods in order to devise new techniques for the analysis of multiple clustering results. Many optimization problems in bioinformatics have been identified as computationally intractable, including determining the number of clusters for a clustering algorithm, which remains NP-hard to compute. The visual tools were developed as part of a suite called Castle and include new analytical and visual tools and techniques that provide insights into single and multiple clustering algorithm results. This work facilitates the analysis of high-dimensional data utilizing visual techniques for data presentation. The new approaches for measuring cluster differences provide information on proximity and a means for visual projection of the records, including the largest number of common cluster memberships among the records. This work is to be published within the next six months, and the techniques developed are used as part of SBIR proposals. This project alone provided research experience for 21 undergraduate and 9 graduate/master’s students in the past 5 years. We will utilize the suite as part of the available tools in the NCRR/IDEA BioComputing/Bioinformatics Core (2010-2015).
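
One simple way to compare multiple clustering results is to count, for every pair of records, how many clusterings place the two records in the same cluster. The sketch below illustrates that idea only; it is not the Castle implementation, and the function names co_membership and proximity are hypothetical.

```python
from itertools import combinations


def co_membership(clusterings):
    """Count shared cluster memberships for every pair of records.

    clusterings is a list of label lists, one list per clustering result,
    each giving one cluster label per record (same record order throughout).
    Returns {(i, j): number of clusterings in which i and j share a cluster}.
    """
    n = len(clusterings[0])
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for labels in clusterings:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return counts


def proximity(counts, n_clusterings):
    """Turn counts into a 0..1 distance: 0 means always clustered together."""
    return {pair: 1.0 - c / n_clusterings for pair, c in counts.items()}


if __name__ == "__main__":
    results = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 2, 2]]
    counts = co_membership(results)
    print(counts)
    print(proximity(counts, len(results)))
```

The resulting pairwise distances could then feed any projection method to lay the records out visually; only the labels within each clustering are compared, so label numbering may differ between results.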

.:. Identification of markers for bladder premalignancy

This work includes DNA microarray analysis of the UPII-SV40Tag mouse model for invasive bladder transitional cell carcinoma (TCC), which we worked on as part of a multidisciplinary team with Dr. John Clifford (LSUHSC-S, Biochemistry), Dr. Anita L. Sabichi (MD Anderson Cancer Center) and others. We combined a transgenic mouse model for invasive bladder cancer (UPII-SV40Tag mice) with DNA microarray technology to determine molecular mechanisms involved in early TCC development and to identify new biomarkers for detection, diagnosis, and prognosis of TCC. We identified genes that are differentially expressed between the bladders of UPII-SV40Tag mice and their age-matched wild-type littermates at 3, 6, 20, and 30 weeks of age. These ages correspond to premalignant, carcinoma in situ, early-stage invasive and later-stage invasive TCC, respectively.

  • The biomedical laboratory provided the animal models and created the molecular and cell biological studies based on a hypothesis, and its results were computationally processed using different algorithms for microarray data analysis and visualization, and combined with techniques to enrich the data with additional information from existing databases.

  • The computation identified an initial set of differentially expressed genes, which we further filtered in order to obtain a list of genes that were potentially interesting to explore further.

  • These genes are currently in clinical trials (Phase III Chemoprevention) and under investigation as biomarkers by MD Anderson Cancer Center.

  • The results of the previous step will be used to generate additional hypotheses and will feed back into the biomedical laboratory for further studies that will determine the mechanisms through which these genes operate.

The first manuscript of this work was published in 2009 in Cancer Prevention Research, and we expect to publish a second study within the next year. Future goals include extending this study to human samples (bladder biopsy samples and urine samples), further narrowing the field, identifying correlations with these samples, and making advances in prevention and non-invasive early diagnosis of TCC that lead to better outcomes for a cancer that ranks 4th in incidence among all cancers in the developed world.

.:. Analysis of transcriptional and translational characteristics of bone marrow stem cells

Together with Dr. Nadejda Korneeva (LSUHSC-S, Biochemistry), we’re currently working on a project of polysomal profiling of expressed mRNA in mesenchymal stem cells (MSC). MSC are a unique stem cell population capable of undergoing both self-renewal and differentiation into different cell lineages including hepatic, neural/glial and cardiovascular lineages (i.e. immune response, apoptosis, adhesion, etc.). Adult stem cells have a potential advantage in cell therapies in that one can in principle overcome immunological hurdles by using the patient’s own cells, expanded in culture and then reintroduced into the tissue to be regenerated. Little is known, however, about the molecular programs and signaling genes involved in pluripotency and plasticity.
The hypothesis of this project is that we can identify the inefficiently translated mRNAs that respond to environmental stimuli. Computational analysis is based on Affymetrix exon array data (with approximately 6 million probes) from three donors. Dr. Korneeva has created three separate sets from each donor’s stem cells: total RNA, restricted translation and high translation samples (using centrifugation in sucrose), resulting in a total of 9 exon arrays. After a thorough evaluation of the computational approaches available for exon data, I selected the most appropriate approach and, combining it with a limited statistical analysis, obtained a set of 12,375 genes that are active in all three samples: a set of translationally activated genes (potential stem cell signatures), a set of translationally repressed genes (potential differentiation markers) and a set of translationally nonactive genes (which could be shifted either way under certain conditions).
The first step in future research will be to enrich the data with information from existing databases and published work in order to narrow the set. The second step is to use the bone marrow MSC signatures to confirm mRNA expression and translation by real-time PCR and to confirm protein expression by Western blot analysis. The third step is to expand the analysis to a larger number of donors in order to confirm the results, and possibly to expand it into an R01 proposal.

.:. Application of Head Tracking for Interactive Platform-Independent Data Visualization

The utilization of head tracking in gaming and animation applications has increased due to greater availability of relevant libraries and hardware. However, its application to interactive data visualization has not followed the same trajectory. As part of this project, we explore the application of head tracking as a human-computer interaction technique to make data visualization more intuitive and to provide natural interaction with high-dimensional data models. Head tracking is achieved through a combination of face and eye detection. Using the eyes as reference points allows us to determine head distance as well as horizontal and vertical position relative to the capture device. We utilize Haar classifier functions for face and eye detection, combined with Kalman filtering to remove any resulting transient input. We extend these approaches by calculating the user movement vector, which further reduces the search space by decreasing the base search margin and extending it in the direction of the movement vector.
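
The geometry and filtering described above can be sketched without any vision library, assuming the eye positions have already been detected (the Haar-based detection is not shown). Everything here is an illustrative assumption rather than the project's code: the pinhole-camera model, the focal length in pixels, the 63 mm interpupillary distance, the scalar constant-position Kalman filter, and the names head_pose, Kalman1D and search_window.

```python
import math


def head_pose(left_eye, right_eye, frame_size, focal_px=600.0, ipd_mm=63.0):
    """Estimate (x_mm, y_mm, distance_mm) of the head relative to the camera axis."""
    pixel_ipd = math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])
    distance = focal_px * ipd_mm / pixel_ipd            # similar triangles
    cx = (left_eye[0] + right_eye[0]) / 2.0 - frame_size[0] / 2.0
    cy = (left_eye[1] + right_eye[1]) / 2.0 - frame_size[1] / 2.0
    return cx * distance / focal_px, cy * distance / focal_px, distance


class Kalman1D:
    """Scalar Kalman filter with a constant-position model, to damp jitter."""

    def __init__(self, process_var=1e-3, measurement_var=1e-1):
        self.q, self.r = process_var, measurement_var
        self.x, self.p = None, 1.0

    def update(self, z):
        if self.x is None:            # first measurement initializes the state
            self.x = z
            return self.x
        self.p += self.q              # predict: uncertainty grows
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)    # correct toward the measurement
        self.p *= 1.0 - k
        return self.x


def search_window(box, velocity, margin):
    """Grow the last face box (x, y, w, h) by a base margin, then extend it
    further only in the direction of motion. Returns (left, top, right, bottom)."""
    x, y, w, h = box
    vx, vy = velocity
    return (x - margin + min(vx, 0), y - margin + min(vy, 0),
            x + w + margin + max(vx, 0), y + h + margin + max(vy, 0))


if __name__ == "__main__":
    print(head_pose((300, 240), (360, 240), (640, 480)))
    print(search_window((100, 100, 50, 50), (20, 0), 10))
```

Extending the window only along the movement vector lets the base margin stay small, which is what keeps the region handed to the detector compact.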

.:. The Storage, Retrieval, and Visualization of Biological Pathway Data

Biological pathways are key to the existence of all life forms. They illustrate how an organism can take one or more molecules and transform them into products that can be crucial to sustaining its life. Pathways are rarely one- or two-step procedures or strictly linear mechanisms. Typically, they are multi-step processes with many key intermediates. This complexity can make them difficult to read and properly understand. We present a system in which we store and utilize pathway data to programmatically create highly informative visualizations of biomedical data.
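
As a rough illustration of how pathway data might be stored and queried, the sketch below keeps reactions as (substrates, products, enzyme) records and walks them as a directed graph. The Pathway class and its methods are hypothetical and are not the system described here; the example data are the first three steps of glycolysis, with cofactors such as ATP omitted for brevity.

```python
from collections import deque


class Pathway:
    """Stores reactions and answers simple reachability queries."""

    def __init__(self):
        self.reactions = []   # list of (substrates, products, enzyme)

    def add_reaction(self, substrates, products, enzyme):
        self.reactions.append((tuple(substrates), tuple(products), enzyme))

    def route(self, start, goal):
        """Breadth-first search; returns the molecules from start to goal,
        including intermediates, or None if goal is unreachable."""
        edges = {}
        for substrates, products, _ in self.reactions:
            for s in substrates:
                edges.setdefault(s, []).extend(products)
        queue, seen = deque([[start]]), {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in edges.get(path[-1], []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None


if __name__ == "__main__":
    p = Pathway()
    p.add_reaction(["glucose"], ["glucose-6-phosphate"], "hexokinase")
    p.add_reaction(["glucose-6-phosphate"], ["fructose-6-phosphate"],
                   "phosphoglucose isomerase")
    p.add_reaction(["fructose-6-phosphate"], ["fructose-1,6-bisphosphate"],
                   "phosphofructokinase")
    print(p.route("glucose", "fructose-1,6-bisphosphate"))
```

A visualization layer could consume the same records directly, drawing molecules as nodes and labeling each edge with its enzyme.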

.:. Selected Publications

Substitution with AlternatiVe Anti-TNFa Therapy (SaVANT) - Outcomes of a Crohn's Disease Cohort Undergoing Substitution Therapy with Certolizumab. M. Boktor, A. Motlis, A. Aravantagi, A. Sheth, P. Jordan, J. Morris, K. Manas, U. Cvek, M. Trutschl, J.S. Alexander. Inflammatory Bowel Diseases (LWW Journal) 2016 Jun; 22(6):1353-61. doi: 10.1097/MIB.0000000000000765.

Gastric cancer in women: A regional health-center seven year retrospective study. K. Suryawala, D. Soliman, M. Mutyala, S. Nageeb, M. Boktor, A. Seth, A. Aravantagi, A. Sheth, J. Morris, P. Jordan, K. Manas, U. Cvek, M. Trutschl, F. Becker, and J.S. Alexander. World Journal of Gastroenterology: WJG. 2015; 21(25): 7805-7813, doi:10.3748/wjg.v21.i25.7805.

Blood circulating microparticle species in relapsing–remitting and secondary progressive multiple sclerosis. A case–control, cross sectional study with conventional MRI and advanced iron content imaging outcomes. J.S. Alexander, R. Chervenak, B. Weinstock-Guttman, I. Tsunoda, M. Ramanathan, N. Martinez, S. Omura, F. Sato, G.V. Chaitanya, A. Minagar, J. McGee, M.H. Jennings, C. Monceaux, F. Becker, U. Cvek, M. Trutschl, R. Zivadinov. J. Neurological Sciences (2015), 355(1): 84-89.

Both MC1 and MC3 Receptors Provide Protection From Cerebral Ischemia-Reperfusion–Induced Neutrophil Recruitment. P.M. Holloway, P.F. Durrenberger, M. Trutschl, U. Cvek, D. Cooper, A.W. Orr, M. Perretti, S.J. Getting, F. N.E. Gavins. Arteriosclerosis, Thrombosis, and Vascular Biology, June 25, 2015, doi:10.1161/ATVBAHA.115.305348.

MotifMutator: A Combinatoric Tool for Modeling Binding-Site Preferences, P.C.S.R. Kilgore, U. Cvek, M. Trutschl, B. Praslicka, C. Gissendanner. 7th Int. Conf. on Bioinformatics and Computational Biology (BICoB), March 2015.

Detection and Employment of Biological Sequence Motifs. M. Trutschl, P.C.S.R. Kilgore, R. Scott, C.E. Birdwell, U. Cvek. Big Data Analytics in Bioinformatics and Healthcare. B. Wang, R. Li, and W. Perrizo (Editors). IGI Global, 2015.

Bioinformatics Multivariate Analysis Determined a Set of Phase-Specific Biomarker Candidates in a Novel Mouse Model for Viral Myocarditis. S. Omura, E. Kawai, F. Sato, N.E. Martinez, G.V. Chaitanya, P.A. Rollyson, U. Cvek, M. Trutschl, J.S. Alexander, I. Tsunoda. Circulation: Cardiovascular Genetics (2014) 7: 444-454.

Genome-wide DNA methylation as an epigenetic consequence of Epstein-Barr virus infection of immortalized keratinocytes. C.E. Birdwell, K.J. Queen, P. Kilgore, P. Rollyson, M. Trutschl, U. Cvek, R.S. Scott. Journal of Virology (2014): JVI-00972.

Multidimensional Visualization of Microarray Data (Chapter). U. Cvek, M. Trutschl. Microarray Image and Data Analysis: Theory and Practice, Cat/ISBN: K20311/9781466586826, Edited by L. Rueda, CRC Press, March 2014.

Scalable Genome-Wide Discovery and Presentation of Motifs. M. Trutschl, P. Kilgore, U. Cvek, R. Scott. 6th Int. Conf. on Bioinformatics and Computational Biology (BICoB), March 2014.

Panoramic Interaction with Interval Data Based on the Slider Metaphor. P. Kilgore, M. Trutschl, U. Cvek. 7th Int. Conf. on Advances in Computer-Human Interaction, March 2014.

Self-organization in parallel coordinates. M. Trutschl, P.C.S.R. Kilgore, U. Cvek. 23rd International Conference on Artificial Neural Networks, Sofia, Bulgaria, September 2013.

Parallel execution of self-organized visualization. P.C.S.R. Kilgore, M. Trutschl, U. Cvek. Proc. of Modeling, Identification and Control – Advances in Computer Science Conference, DOI: 10.2316/P.2013.801-009, 2013.

Exploration of Gene Expression Data Via Constrained Clustering, P. Kilgore, M. Trutschl, U. Cvek, R. Rhoads. 5th Int. Conf. on Bioinformatics and Computational Biology (BICoB), March 2013.

Application of Head Tracking for Interactive Data Visualization, P. Kilgore, C. McCarthy, U. Cvek, M. Trutschl, 7th Int. Multi-Conference on Computing in the Global Information Technology, 2012.

High-Performance Visualization of Multi-Dimensional Gene Expression Data, M. Trutschl, P.C.S.R. Kilgore, U. Cvek. Third International Conference on Networking and Computing, Okinawa, Japan, December 5-7, 2012.

Multidimensional Visualization Techniques for Microarray Data, U. Cvek, M. Trutschl, P. C. Kilgore, R. Stone II, J. L. Clifford, Symposium on Information visualization in Biomedical Informatics, 8th International Conference on Biomedical Visualization, IEEE Computer Society, London, Great Britain, 2011.

Identification, Tracking and Visualization of Platelets, M. Trutschl, U. Cvek, K. Stokes, P. Kilgore, J. Smith, J. Slack, J. Doss, R. Holloway, Symposium on Information visualization in Biomedical Informatics, 7th Int. Conference on Biomedical Visualization, IEEE Computer Society, London, Great Britain, 2010.

Neural-network enhanced visualization of high-dimensional data, U. Cvek, M. Trutschl, J.L. Clifford, Chapter in: Self-Organizing Maps. ISBN 978-953-7619, 2009.

Identification of genes involved in early stage bladder cancer progression. R. Stone, A.L. Sabichi, J. Gill, I-L Lee, R. Loganantharaj, M. Trutschl, U. Cvek, J.L. Clifford, Cancer Prevention Research (In print, CAPR-09-0189R1). (peer reviewed manuscript)

Identification of the B-Raf/Mek/Erk MAP kinase pathway as a target for all-trans retinoic acid during skin cancer promotion. S.B. Cheepala, W. Yin, Z. Syed, J.N. Gill, A. McMillian, H.E. Kleiner, M. Lynch, R. Loganantharaj, M. Trutschl, U. Cvek, J.L. Clifford, Molecular Cancer, 8:27, May 11, 2009. (peer reviewed manuscript)

The Storage, Retrieval, and Visualization of Biological Pathway Data, E.P. Boswell, J.T. Wessler, U. Cvek, M. Trutschl, Symposium on Information visualization in Biomedical Informatics, 6th International Conference on Biomedical Visualization, 2009. (peer reviewed manuscript)

Multidimensional Visualization Tools for Analysis of Expression Data. U. Cvek, M. Trutschl, R. Stone II, Z. Syed, J.L. Clifford, A.L. Sabichi, International Conference on Bioinformatics and Computational Biology, World Academy of Science, Engineering and Technology, Paris, 2009. (peer reviewed manuscript)

Global Transcriptional Profiling Reveals Streptococcus Agalactiae Genes Controlled by the MtaR transcription factor. J.D. Bryan, R. Liles, U. Cvek, M. Trutschl, D. Shelver. BMC Genomics 2008, 9:607 (16 December 2008). (peer reviewed manuscript)

From Microarrays to Promoters: the Visual Story of Stat3. U. Cvek, M. Trutschl, Z. Syed, J. Clifford. Symposium on Information visualization in Biomedical Informatics, 5th International Conference on Biomedical Visualization, 2008. First two authors equally contributed to this publication. (peer reviewed manuscript)

2D and 3D Neural-Network Based Visualization of High-Dimensional Biomedical Data, U. Cvek, M. Trutschl, J.C. Cannon, R.S. Scott, R.E. Rhoads, Proc. of the Symposium on Information Visualization in Biomedical Informatics, 11th International Conference on Information Visualization, Zurich, Switzerland, July, 2007. First two authors equally contributed to this publication. (peer reviewed manuscript)

Retinoids and skin: microarrays shed new light on chemopreventive action of all-trans retinoic acid. S. B. Cheepala, Z. Syed, M. Trutschl, U. Cvek, J. L. Clifford. Molecular Carcinogenesis, 2007.

Translation initiation factor eIF4G-1 binds to eIF3 through the eIF3e subunit, A.K. LeFebvre, N.L. Korneeva, M. Trutschl, U. Cvek, Roy D. Duzan, C.A. Bradley, J.W. B. Hershey, R.E. Rhoads, J. Biol. Chem. Vol.281, No.32, pp.22917-22932, 2006.

Application of Machine Learning and Visualization of Heterogeneous Datasets to Uncover Relationships between Translation and Developmental Stage Expression of C. elegans mRNAs, M. Trutschl, T. Dinkova, R. Rhoads, Physiol. Genomics, 21:264-273, 2005.

Interpolating Analytic Visualizations, M. Trutschl, G. Grinstein, U. Cvek, Proc. of the SPIE 2004 Conference on Visualization and Data Analysis, San Jose, CA, January, 2004. (peer reviewed manuscript)

Intelligently Resolving Point Occlusion, M. Trutschl, G. Grinstein, U. Cvek, Proc. of the IEEE Symposium on Information Visualization, October, 2003. (peer reviewed manuscript)