Long non-coding RNAs (lncRNAs) have been implicated in normal cellular homeostasis as well as pathophysiological conditions, including cancer. Here we performed global gene expression profiling of mammary epithelial cells transformed by oncogenic v-Src, and identified a large subset of uncharacterized lncRNAs potentially involved in breast cancer development. Specifically, our analysis revealed a novel lncRNA, LINC00520, that is upregulated upon ectopic expression of oncogenic v-Src, in a manner that is dependent on the transcription factor STAT3. Similarly, LINC00520 is also increased in mammary epithelial cells transformed by oncogenic PI3K and its expression is decreased upon knockdown of mutant PIK3CA. Additional expression profiling highlights that LINC00520 is elevated in a subset of human breast carcinomas, with preferential enrichment in the basal-like molecular subtype. ShRNA-mediated depletion of LINC00520 results in decreased cell migration and loss of invasive structures in 3D. RNA sequencing analysis uncovers several genes that are differentially expressed upon ectopic expression of LINC00520, a significant subset of which are also induced in v-Src-transformed MCF10A cells. Together, these findings characterize LINC00520 as a lncRNA that is regulated by oncogenic Src, PIK3CA and STAT3, and which may contribute to the molecular etiology of breast cancer.
The high rate of metastasis and recurrence among melanoma patients indicates the existence of cells within melanoma that have the ability to both initiate metastatic programs and bypass immune recognition. Here, we identify CD47 as a regulator of melanoma tumor metastasis and immune evasion. Protein and gene expression analysis of clinical melanoma samples reveals that CD47, an anti-phagocytic signal, correlates with melanoma metastasis. Antibody-mediated blockade of CD47 coupled with targeting of CD271(+) melanoma cells strongly inhibits tumor metastasis in patient-derived xenografts. This therapeutic effect is mediated by drastic changes in the tumor and metastatic site immune microenvironments, both of which exhibit greatly increased density of differentiated macrophages and significantly fewer inflammatory monocytes, pro-metastatic macrophages (CCR2(+)/VEGFR1(+)), and neutrophils, all of which are associated with disease progression. Thus, antibody therapy that activates the innate immune response in combination with selective targeting of CD271(+) melanoma cells represents a powerful therapeutic approach against metastatic melanoma.
The assessment of protein expression in immunohistochemistry (IHC) images provides important diagnostic, prognostic and predictive information for guiding cancer diagnosis and therapy. Manual scoring of IHC images represents a logistical challenge, as the process is labor intensive and time consuming. Over the last decade, computational methods have been developed to enable the application of quantitative methods for the analysis and interpretation of protein expression in IHC images. These methods have not yet replaced manual scoring for the assessment of IHC in the majority of diagnostic laboratories and in many large-scale research studies. An alternative approach is crowdsourcing the quantification of IHC images to an undefined crowd. The aim of this study is to quantify IHC images for labeling of ER status with two different crowdsourcing approaches, image labeling and nuclei labeling, and compare their performance with automated methods. Crowdsourcing-derived scores obtained greater concordance with the pathologist interpretations for both image labeling and nuclei labeling tasks (83% and 87%), as compared to the pathologist concordance achieved by the automated method (81%) on 5,483 TMA images from 1,909 breast cancer patients. This analysis shows that crowdsourcing the scoring of protein expression in IHC images is a promising new approach for large-scale cancer molecular pathology studies.
The International Symposium on Biomedical Imaging (ISBI) held a grand challenge to evaluate computational systems for the automated detection of metastatic breast cancer in whole slide images of sentinel lymph node biopsies. Our team won both competitions in the grand challenge, obtaining an area under the receiver operating characteristic curve (AUC) of 0.925 for the task of whole slide image classification and a score of 0.7051 for the tumor localization task. A pathologist independently reviewed the same images, obtaining a whole slide image classification AUC of 0.966 and a tumor localization score of 0.733. Combining our deep learning system’s predictions with the human pathologist’s diagnoses increased the pathologist’s AUC to 0.995, representing an approximately 85 percent reduction in human error rate. These results demonstrate the power of using deep learning to produce significant improvements in the accuracy of pathological diagnoses.
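The "approximately 85 percent" figure can be reproduced with a short calculation. The sketch below assumes that the error rate is read as 1 - AUC, which is the interpretation that matches the reported numbers; the function name is illustrative only.

```python
def relative_error_reduction(auc_before: float, auc_after: float) -> float:
    """Relative drop in error, treating error as 1 - AUC (an assumption)."""
    error_before = 1.0 - auc_before
    error_after = 1.0 - auc_after
    return (error_before - error_after) / error_before


# AUC values reported in the abstract: pathologist alone vs. pathologist
# combined with the deep learning system.
reduction = relative_error_reduction(0.966, 0.995)
print(f"Relative error reduction: {reduction:.1%}")
```

Under this reading the error falls from 0.034 to 0.005, a relative reduction of roughly 85 percent.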
A pathologist's accurate interpretation relies on identifying relevant histopathological features. Little is known about the precise relationship between feature identification and diagnostic decision making. We hypothesized that greater overlap between a pathologist's selected diagnostic region of interest (ROI) and a consensus derived ROI is associated with higher diagnostic accuracy. We developed breast biopsy test cases that included atypical ductal hyperplasia (n=80); ductal carcinoma in situ (n=78); and invasive breast cancer (n=22). Benign cases were excluded due to the absence of specific abnormalities. Three experienced breast pathologists conducted an independent review of the 180 digital whole slide images, established a reference consensus diagnosis and marked one or more diagnostic ROIs for each case. Forty-four participating pathologists independently diagnosed and marked ROIs on the images. Participant diagnoses and ROI were compared with consensus reference diagnoses and ROI. Regression models tested whether percent overlap between participant ROI and consensus reference ROI predicted diagnostic accuracy. Each of the 44 participants interpreted 39-50 cases for a total of 1972 individual diagnoses. Percent ROI overlap with the expert reference ROI was higher in pathologists who self-reported academic affiliation (69 vs 65%, P=0.002). Percent overlap between participants' ROI and consensus reference ROI was then classified into ordinal categories: 0, 1-33, 34-65, 66-99 and 100% overlap. For each incremental change in the ordinal percent ROI overlap, the odds of diagnostic agreement increased by 60% (OR 1.6, 95% CI (1.5-1.7), P<0.001) and the association remained significant even after adjustment for other covariates. The magnitude of the association between ROI overlap and diagnostic agreement increased with increasing diagnostic severity.
The findings indicate that pathologists are more likely to converge with an expert reference diagnosis when they identify an overlapping diagnostic image region, suggesting that future computer-aided detection systems that highlight potential diagnostic regions could be a helpful tool to improve accuracy and education.
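As an illustration of the ordinal coding described above, the following minimal sketch maps a percent overlap to the five categories (0, 1-33, 34-65, 66-99, 100%) and shows how a constant odds ratio per category compounds across several steps. The function names, the rounding of fractional percentages to whole numbers, and the constant-OR reading are assumptions made for this example, not details taken from the study.

```python
def overlap_category(percent_overlap: float) -> int:
    """Map percent ROI overlap to an ordinal category 0..4.

    Categories follow the abstract: 0, 1-33, 34-65, 66-99, 100.
    Rounding to a whole percent first is an assumption of this sketch.
    """
    pct = round(percent_overlap)
    if not 0 <= pct <= 100:
        raise ValueError("percent overlap must lie between 0 and 100")
    if pct == 0:
        return 0
    if pct <= 33:
        return 1
    if pct <= 65:
        return 2
    if pct <= 99:
        return 3
    return 4


def odds_multiplier(from_category: int, to_category: int,
                    or_per_step: float = 1.6) -> float:
    """Odds multiplier between two categories, assuming a constant
    odds ratio per one-category step (OR 1.6 in the abstract)."""
    return or_per_step ** (to_category - from_category)


# Moving from 20% overlap (category 1) to 80% overlap (category 3)
# spans two steps, so the odds of agreement scale by 1.6 ** 2.
steps_low = overlap_category(20)
steps_high = overlap_category(80)
print(odds_multiplier(steps_low, steps_high))
```

Note that the multiplier applies to odds, not to the probability of agreement itself.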
Inflammatory cytokines, like tumor necrosis factor-alpha (TNF-α) and interleukin-6 (IL-6), are elevated in ovarian cancer. Differences in cytokine expression by histologic subtype or ovarian cancer risk factors can provide useful insight into ovarian cancer risk and etiology. We used ribonucleic acid in situ hybridization to assess TNF-α and IL-6 expression on tissue microarray slides from 78 epithelial ovarian carcinomas (51 serous, 12 endometrioid, 7 clear cell, 2 mucinous, 6 other) from a population-based case-control study. Cytokine expression was scored semiquantitatively, and odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using polytomous logistic regression. TNF-α was expressed in 46% of the tumors, whereas sparse IL-6 expression was seen in only 18% of the tumors. For both markers, expression was most common in high-grade serous carcinomas followed by endometrioid carcinomas. Parity was associated with a reduced risk of TNF-α-positive (OR, 0.3; 95% CI, 0.1-0.7 for 3 or more children versus none) but not TNF-α-negative tumors (P heterogeneity = .02). In contrast, current smoking was associated with a nearly 3-fold increase in risk of TNF-α-negative (OR, 2.8; 95% CI, 1.2-6.6) but not TNF-α-positive tumors (P heterogeneity = .06). Our data suggest that TNF-α expression in ovarian carcinoma varies by histologic subtype and provides some support for the role of inflammation in ovarian carcinogenesis. The novel associations detected in our study need to be validated in a larger cohort of patients in future studies.
Chromosomal translocations encode oncogenic fusion proteins that have been proven to be causally involved in tumorigenesis. Our understanding of whether such genomic alterations also affect non-coding RNAs is limited, and their impact on circular RNAs (circRNAs) has not been explored. Here, we show that well-established cancer-associated chromosomal translocations give rise to fusion circRNAs (f-circRNA) that are produced from transcribed exons of distinct genes affected by the translocations. F-circRNAs contribute to cellular transformation, promote cell viability and resistance upon therapy, and have tumor-promoting properties in in vivo models. Our work expands the current knowledge regarding molecular mechanisms involved in cancer onset and progression, with potential diagnostic and therapeutic implications.
We examined associations between dietary quality indices and breast cancer risk by molecular subtype among 100,643 women in the prospective Nurses' Health Study (NHS) cohort, followed from 1984 to 2006. Dietary quality scores for the Alternative Healthy Eating Index (AHEI), alternate Mediterranean diet (aMED), and Dietary Approaches to Stop Hypertension (DASH) dietary patterns were calculated from semi-quantitative food frequency questionnaires collected every 2-4 years. Breast cancer molecular subtypes were defined according to estrogen receptor (ER), progesterone receptor, human epidermal growth factor 2 (HER2), cytokeratin 5/6 (CK5/6), and epidermal growth factor receptor status from immunostained tumor microarrays in combination with histologic grade. Cox proportional hazards models, adjusted for age and breast cancer risk factors, were used to estimate hazard ratios (HRs) and 95 % confidence intervals (CIs). Competing risk analyses were used to assess heterogeneity by subtype. We did not observe any significant associations between the AHEI or aMED dietary patterns and risk of breast cancer by molecular subtype. However, a significantly reduced risk of HER2-type breast cancer was observed among women in the 5th versus the 1st quintile of the DASH dietary pattern [n = 134 cases, Q5 vs. Q1 HR (95 % CI) = 0.44 (0.25-0.77)], and the inverse trend across quintiles was significant (p trend = 0.02). We did not observe any heterogeneity in associations between AHEI (p het = 0.25), aMED (p het = 0.71), and DASH (p het = 0.12) dietary patterns and breast cancer by subtype. Adherence to the AHEI, aMED, and DASH dietary patterns was not strongly associated with breast cancer molecular subtypes.
Though patient sex influences response to cancer treatments, little is known of the molecular causes, and cancer therapies are generally given irrespective of patient sex. We assessed transcriptomic differences in tumors from men and women spanning 17 cancer types, and we assessed differential expression between tumor and normal samples stratified by sex across 7 cancers. We used the LincsCloud platform to perform Connectivity Map analyses to link transcriptomic signatures identified in male and female tumors with chemical and genetic perturbagens, and we performed permutation testing to identify perturbagens that showed significantly differential connectivity with male and female tumors. Our analyses predicted that females are sensitive and males are resistant to tamoxifen treatment of lung adenocarcinoma, a finding which is consistent with known male-female differences in lung cancer. We made several novel predictions, including that CDK1 and PTPN1 knockdown would be more effective in males with hepatocellular carcinoma, and SMAD3 and HSPA4 knockdown would be more effective in females with head and neck squamous cell carcinoma. Our results provide a new resource for researchers studying male-female biological and treatment response differences in human cancer. The complete results of our analyses are provided at the website accompanying this manuscript (http://becklab.github.io/SexLinked).
Several intrinsic breast cancer subtypes, possibly representing unique etiologic processes, have been identified by gene expression profiles. Evidence suggests that associations with reproductive risk factors may vary by breast cancer subtype. In the Nurses' Health Studies, we prospectively examined associations of reproductive factors with breast cancer subtypes defined using immunohistochemical staining of tissue microarrays. Multivariate-adjusted Cox proportional hazard models were used to estimate hazard ratios (HRs) and 95% confidence intervals (CIs). Over follow-up, we identified 2,063 luminal A, 1,008 luminal B, 209 HER2-enriched, 378 basal-like and 110 unclassified tumors. Many factors appeared associated with luminal A tumors, including ages at menarche (p(heterogeneity) = 0.65) and menopause (p(heterogeneity) = 0.05), and current HT use (p(heterogeneity) = 0.33). Increasing parity was not associated with any subtype (p(heterogeneity) = 0.76), though age at first birth was associated with luminal A tumors only (per 1-year increase HR = 1.03, 95%CI (1.02-1.05), p(heterogeneity) = 0.04). Though heterogeneity was not observed, duration of lactation was inversely associated with risk of basal-like tumors only (7+ months vs. never HR = 0.65, 95%CI (0.49-0.87), p(trend) = 0.02, p(heterogeneity) = 0.27). Years between menarche and first birth was strongly positively associated with luminal A and non-luminal subtypes (e.g. 22-year interval vs. nulliparous HR = 1.80, 95%CI (1.08-3.00) for basal-like tumors; p(heterogeneity) = 0.003), and evidence of effect modification by breastfeeding was observed. In summary, many reproductive risk factors for breast cancer appeared most strongly associated with the luminal A subtype. Our results support previous reports that lactation is protective against basal-like tumors, representing a potential modifiable risk factor for this aggressive subtype.
Pharmacogenomics holds great promise for the development of biomarkers of drug response and the design of new therapeutic options, which are key challenges in precision medicine. However, such data are scattered and lack standards for efficient access and analysis, consequently preventing the realization of the full potential of pharmacogenomics. To address these issues, we implemented PharmacoGx, an easy-to-use, open source package for integrative analysis of multiple pharmacogenomic datasets. We demonstrate the utility of our package in comparing large drug sensitivity datasets, such as the Genomics of Drug Sensitivity in Cancer and the Cancer Cell Line Encyclopedia. Moreover, we show how to use our package to easily perform Connectivity Map analysis. With increasing availability of drug-related data, our package will open new avenues of research for meta-analysis of pharmacogenomic data.
AVAILABILITY AND IMPLEMENTATION: PharmacoGx is implemented in R and can be easily installed on any system. The package is available from CRAN and its source code is available from GitHub.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
MOTIVATION: A major goal of biomedical research is to identify molecular features associated with a biological or clinical class of interest. Differential expression analysis has long been used for this purpose; however, conventional methods perform poorly when applied to data with high within-class heterogeneity.
RESULTS: To address this challenge, we developed EMDomics, a new method that uses the Earth mover's distance to measure the overall difference between the distributions of a gene's expression in two classes of samples and uses permutations to obtain q-values for each gene. We applied EMDomics to the challenging problem of identifying genes associated with drug resistance in ovarian cancer. We also used simulated data to evaluate the performance of EMDomics, in terms of sensitivity and specificity for identifying differentially expressed genes in classes with high within-class heterogeneity. In both the simulated and real biological data, EMDomics outperformed competing approaches for the identification of differentially expressed genes, and EMDomics was significantly more powerful than conventional methods for the identification of drug resistance-associated gene sets. EMDomics represents a new approach for the identification of genes differentially expressed between heterogeneous classes and has utility in a wide range of complex biomedical conditions in which sample classes show within-class heterogeneity.
AVAILABILITY AND IMPLEMENTATION: The R package is available at http://www.bioconductor.org/packages/release/bioc/html/EMDomics.html.
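The general idea behind the method, an Earth mover's distance between two expression distributions with significance assessed by permuting class labels, can be sketched in a few lines. This is not the EMDomics implementation (which is an R package); it is a minimal, standard-library Python sketch for a single gene, using the exact one-dimensional EMD between empirical distributions. The function names, the sample values, and the choice of 200 permutations are assumptions for illustration. It returns a permutation p-value; turning per-gene results into q-values would additionally need a multiple-testing step across genes, which is omitted here.

```python
import random
from bisect import bisect_right


def emd_1d(a, b):
    """Earth mover's distance between two 1-D empirical distributions.

    Computed as the area between the two empirical CDFs.
    """
    a_sorted, b_sorted = sorted(a), sorted(b)
    points = sorted(set(a_sorted) | set(b_sorted))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a_sorted, left) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, left) / len(b_sorted)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def permutation_p_value(a, b, n_perm=200, seed=0):
    """Permutation p-value for the observed EMD between classes a and b."""
    rng = random.Random(seed)
    observed = emd_1d(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign class labels at random
        permuted = emd_1d(pooled[:len(a)], pooled[len(a):])
        if permuted >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical expression values for one gene in two sample classes.
sensitive = [float(x) for x in range(10)]
resistant = [float(x) for x in range(100, 110)]
distance, p_value = permutation_p_value(sensitive, resistant)
print(distance, p_value)
```

Because the distance compares whole distributions rather than means, it can pick up a gene that is shifted in only a subgroup of one class, which is the within-class heterogeneity setting the abstract describes.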
Alcohol consumption is a consistent risk factor for breast cancer, although it is unclear whether the association varies by breast cancer molecular subtype. We investigated associations between cumulative average alcohol intake and risk of breast cancer by molecular subtype among 105,972 women in the prospective Nurses' Health Study cohort, followed from 1980 to 2006. Breast cancer molecular subtypes were defined according to estrogen receptor (ER), progesterone receptor, human epidermal growth factor 2 (HER2), cytokeratin 5/6, and epidermal growth factor status from immunostained tumor microarrays in combination with histologic grade. Multivariable Cox proportional hazards models were used to estimate hazard ratios (HR) and 95% confidence intervals (CI). Competing risk analyses were used to assess heterogeneity by subtype. We observed suggestive heterogeneity in associations between alcohol and breast cancer by subtype (phet = 0.06). Alcohol consumers had an increased risk of luminal A breast cancers [n = 1,628 cases, per 10 g/day increment HR (95%CI) = 1.10(1.05-1.15)], and an increased risk that was suggestively stronger for HER2-type breast cancer [n = 160 cases, HR (95%CI) = 1.16(1.02-1.33)]. We did not observe statistically significant associations between alcohol and risk of luminal B [n = 631 cases, HR (95%CI) = 1.08(0.99-1.16)], basal-like [n = 254 cases, HR (95%CI) = 0.90(0.77-1.04)], or unclassified [n = 87 cases, HR (95%CI) = 0.90(0.71-1.14)] breast cancer. Alcohol consumption was associated with increased risk of luminal A and HER2-type breast cancer, but not significantly associated with other subtypes. Given that ERs are expressed in luminal A but not in HER2-type tumors, our findings suggest that other mechanisms may play a role in the association between alcohol and breast cancer.
Traditional markers mammaglobin and GCDFP15 show good specificity but lack sensitivity and can be difficult to interpret in small tissue samples. We undertook a comparative study of the novel nuclear marker GATA3 (expression typically restricted to breast and urothelial carcinomas) against GCDFP15 and mammaglobin. We first compared quantitative mRNA expression levels of these 3 markers across a diverse set of over 6000 tumors and 500 normal samples from The Cancer Genome Atlas. This comparison showed dramatically higher GATA3 expression (>10-fold higher) in breast cancer as compared with GCDFP15 or mammaglobin (both P<2.2e-16), suggesting that GATA3 may represent a more sensitive marker of breast cancer than GCDFP15 or mammaglobin. We next examined protein expression by immunohistochemistry in 166 cases (including surgical and cytology specimens) of metastatic breast carcinoma and 54 cases with available matched primaries. One whole-slide section from each case was stained for monoclonal GATA3 (L50-823), monoclonal mammaglobin (31A5), and monoclonal GCDFP15 (EP1582Y). Staining intensity (0 to 3+) and extent (0% to 100%) were scored, and an H-score was calculated (range, 0 to 300). Sensitivities by varying H-score cutoffs for a positive result in metastatic breast carcinoma among GATA3/GCDFP15/mammaglobin, respectively, were as follows: any H-score=95%/65%/78%, H-score>50=93%/37%/47%, H-score>100=90%/25%/27%, H-score>150=86%/21%/19%, H-score>200=73%/18%/9%, H-score>250=66%/14%/6%. Significant staining differences by specimen type, tumor subtype/grade, or ER/PR/HER2 status were not identified. Significantly stronger correlation was observed between primary/metastatic GATA3 expression [Pearson's correlation=0.81 (0.68-0.89)] as compared with the primary/metastatic correlations of GCDFP15 [Pearson's correlation=0.57 (0.33-0.74)] and mammaglobin [Pearson's correlation=0.50 (0.24-0.70)] (both P<0.05).
In conclusion, the novel marker GATA3 stains a significantly higher proportion of both primary and metastatic breast carcinomas than GCDFP15 or mammaglobin with stronger and more diffuse staining, helpful in cases with small tissue samples. The matched primary/metastatic expression of GATA3 is also more consistent. We propose that GATA3 be included among a panel of confirmatory markers for metastatic breast carcinoma.
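The H-score and the cutoff-based sensitivities reported above can be illustrated with a small sketch. The abstract does not spell out the exact formula used; the code assumes the commonly used definition, the sum over intensity levels of intensity times the percentage of cells staining at that level, which yields the stated 0 to 300 range. The function names and the example scores are hypothetical and are not data from the study.

```python
def h_score(percent_by_intensity):
    """H-score from a mapping of staining intensity (0..3) to the
    percentage of tumor cells at that intensity.

    Assumes the common definition sum(intensity * percent); the
    percentages must not add up to more than 100.
    """
    total_percent = sum(percent_by_intensity.values())
    if total_percent > 100:
        raise ValueError("percentages must sum to at most 100")
    for intensity in percent_by_intensity:
        if intensity not in (0, 1, 2, 3):
            raise ValueError("intensity must be 0, 1, 2 or 3")
    return sum(i * pct for i, pct in percent_by_intensity.items())


def sensitivity_at_cutoff(scores, cutoff):
    """Fraction of known-positive cases whose H-score exceeds the cutoff.

    A cutoff of 0 corresponds to "any H-score" counting as positive.
    """
    positives = sum(1 for score in scores if score > cutoff)
    return positives / len(scores)


# Hypothetical H-scores for four metastatic cases (not study data).
example_scores = [
    h_score({0: 100}),               # no staining
    h_score({1: 40}),                # weak, focal
    h_score({2: 60}),                # moderate
    h_score({3: 80, 2: 10}),         # strong, diffuse
]
for cutoff in (0, 100, 250):
    print(cutoff, sensitivity_at_cutoff(example_scores, cutoff))
```

Raising the cutoff trades sensitivity for stricter positivity, which is why the reported sensitivities fall as the cutoff moves from any staining to H-score>250.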
In the year-end editorial, the PLOS Medicine editors ask 11 researchers and clinicians about the most relevant challenges, promising research, and important initiatives in their fields as we head into 2016.