DNA copy-number alterations (CNAs) have been recognized as a genomic hallmark of cancer for over a century. However, the connection between CNAs and a patient's prognosis is not well understood. Modern microarray and sequencing technologies have led to databases of genome-scale CNA profiles from patient-matched tumor and normal tissue samples. Datasets such as these, which measure multiple aspects of a single phenomenon, comprise multiple data tensors (multidimensional arrays) that are matched in every dimension except one. They contain fundamental patterns that describe relations among the variables and encode a system's behavior. Our research lab previously introduced the concept of comparative spectral decompositions, which can find, compare, and contrast those fundamental patterns to create a single coherent model. Other methods, by contrast, often lose essential information by flattening the data tensors into matrices.
In this dissertation, we generalize the existing comparative spectral decompositions to enable simultaneous analysis of multiple data tensors while making use of their multidimensional structure. First, we develop the tensor generalized singular value decomposition (GSVD) for modeling two data tensors. We demonstrate the tensor GSVD by analyzing CNAs in ovarian cancer patients. For many cancers, including lung, uterine, and ovarian adenocarcinomas, the best predictor of patient survival has remained the tumor's stage. Physicians have sought genomic prognostic and diagnostic indicators as less subjective metrics with the potential to provide a molecular basis of the disease. The tensor GSVD uncovered three tumor-exclusive, chromosome arm-wide patterns of CNAs that are correlated with patient survival, in both the general and the platinum-based chemotherapy-treated populations, independent of the tumor's stage and throughout the course of the disease. These results demonstrate the ability of comparative spectral decompositions to uncover meaningful patterns in data tensors that can, for example, inform patient treatment. Second, we extend the utility of these patterns by mapping them to different measurement platforms and technologies, and we find that one of the patterns is also prognostic for lung and uterine adenocarcinomas. Third, we extend the tensor GSVD to the tensor higher-order (HO) GSVD, which can analyze more than two data tensors simultaneously. We show that the tensor GSVD and the tensor HO GSVD generalize the definition and interpretation of the GSVD to multiple tensors.