Below is a collection of our research findings and publications. Our group’s publications date from 2019 onward; Biswa’s earlier research precedes that.

Sahu Lab publications

Publications from post-doc training

Publications from PhD training


Liangru Fei, Kaiyang Zhang, Nikita Poddar, Sampsa Hautaniemi, Biswajyoti Sahu


Cell fate can be reprogrammed by ectopic expression of lineage-specific transcription factors (TFs). However, the exact cell state transitions during transdifferentiation are still poorly understood.

Here, we have generated pancreatic exocrine cells of ductal epithelial identity from human fibroblasts using a set of six TFs. We mapped the molecular determinants of lineage dynamics using a factor-indexing method based on single-nuclei multiome sequencing (FI-snMultiome-seq) that enables dissecting the role of each individual TF and pool of TFs in cell fate conversion.

We show that transition from mesenchymal fibroblast identity to epithelial pancreatic exocrine fate involves two deterministic steps: an endodermal progenitor state defined by activation of HHEX with FOXA2 and SOX17 and a temporal GATA4 activation essential for the maintenance of pancreatic cell fate program.

Collectively, our data suggest that transdifferentiation—although being considered a direct cell fate conversion method—occurs through transient progenitor states orchestrated by stepwise activation of distinct TFs.

Konsta Karttunen*, Divyesh Patel*, Jihan Xia, Liangru Fei, Kimmo Palin, Lauri Aaltonen and Biswajyoti Sahu


Transposable elements (TEs) are repetitive genomic elements that harbor binding sites for human transcription factors (TFs). A regulatory role for TEs has been suggested in embryonal development and diseases such as cancer, but systematic investigation of their functions has been limited by their widespread silencing in the genome.

Here, we utilize unbiased massively parallel reporter assay data using a whole human genome library to identify TEs with functional enhancer activity in two human cancer types of endodermal lineage, colorectal and liver cancers. We show that the identified TE enhancers are characterized by genomic features associated with active enhancers, such as epigenetic marks and TF binding. Importantly, we identify distinct TE subfamilies that function as tissue-specific enhancers, namely MER11- and LTR12-elements in colon and liver cancers, respectively.

These elements are bound by distinct TFs in each cell type, and they have predicted associations to differentially expressed genes. In conclusion, these data demonstrate how different cancer types can utilize distinct TEs as tissue-specific enhancers, paving the way for comprehensive understanding of the role of TEs as bona fide enhancers in the cancer genomes.

Lisa Gawriyski, Eeva-Mari Jouhilahti, Masahito Yoshihara, Liangru Fei, Jere Weltner, Tomi T Airenne, Ras Trokovic, Shruti Bhagat, Mari H Tervaniemi, Yasuhiro Murakawa, Kari Salokas, Xiaonan Liu, Sini Miettinen, Thomas R Bürglin, Biswajyoti Sahu, Timo Otonkoski, Mark S Johnson, Shintaro Katayama, Markku Varjosalo, Juha Kere


The paired-like homeobox transcription factor LEUTX is expressed in human preimplantation embryos between the 4- and 8-cell stages, and then silenced in somatic tissues. To characterize the function of LEUTX, we performed a multiomic characterization of LEUTX using two proteomics methods and three genome-wide sequencing approaches.

Our results show that LEUTX stably interacts with the EP300 and CBP histone acetyltransferases through its 9 amino acid transactivation domain (9aaTAD), as mutation of this domain abolishes the interactions. LEUTX targets genomic cis-regulatory sequences that overlap with repetitive elements, and through these elements it is suggested to regulate the expression of its downstream genes.

We find LEUTX to be a transcriptional activator, upregulating several genes linked to preimplantation development as well as 8-cell-like markers, such as DPPA3 and ZNF280A. Our results support a role for LEUTX in preimplantation development as an enhancer binding protein and as a potent transcriptional activator.


Päivi Pihlajamaa, Otto Kauko, Biswajyoti Sahu, Teemu Kivioja, Jussi Taipale


Here we describe a competitive genome editing method that measures the effect of mutations on molecular functions, based on precision CRISPR editing using template libraries with either the original or altered sequence, and a sequence tag, enabling direct comparison between original and mutated cells.

Using the example of the MYC oncogene, we identify important transcriptional targets and show that E-box mutations at MYC target gene promoters reduce cellular fitness.

Biswajyoti Sahu, Tuomo Hartonen, Päivi Pihlajamaa, Bei Wei, Kashyap Dave, Fangjie Zhu, Eevi Kaasinen, Katja Lidschreiber, Michael Lidschreiber, Carsten O Daub, Patrick Cramer, Teemu Kivioja, Jussi Taipale


DNA can determine where and when genes are expressed, but the full set of sequence determinants that control gene expression is unknown. Here, we measured the transcriptional activity of DNA sequences that represent an ~100 times larger sequence space than the human genome using massively parallel reporter assays (MPRAs).

Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar and that most enhancers increase expression from a promoter by a mechanism that does not appear to involve specific TF-TF interactions. The enhancers themselves can be classified into three types: classical, closed chromatin and chromatin dependent.

We also show that few TFs are strongly active in a cell, with most activities being similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening and enhancing, promoting and determining transcription start site (TSS) activity, consistent with the view that the TF binding motif is the key atomic unit of gene expression.


Pauliina M Munne, Lahja Martikainen, Iiris Räty, Kia Bertula, Nonappa, Janika Ruuska, Hanna Ala-Hongisto, Aino Peura, Babette Hollmann, Lilya Euro, Kerim Yavuz, Linda Patrikainen, Maria Salmela, Juho Pokki, Mikko Kivento, Juho Väänänen, Tomi Suomi, Liina Nevalaita, Minna Mutka, Panu Kovanen, Marjut Leidenius, Tuomo Meretoja, Katja Hukkinen, Outi Monni, Jeroen Pouwels, Biswajyoti Sahu, Johanna Mattson, Heikki Joensuu, Päivi Heikkilä, Laura L Elo, Ciara Metcalfe, Melissa R Junttila, Olli Ikkala, Juha Klefström


Breast cancer is now globally the most frequent cancer and leading cause of women’s death. Two thirds of breast cancers express the luminal estrogen receptor-positive (ERα+) phenotype that is initially responsive to antihormonal therapies, but drug resistance emerges.

A major barrier to the understanding of the ERα-pathway biology and therapeutic discoveries is the restricted repertoire of luminal ERα+ breast cancer models. The ERα+ phenotype is not stable in cultured cells for reasons not fully understood.

We examine 400 patient-derived breast epithelial and breast cancer explant cultures (PDECs) grown in various three-dimensional matrix scaffolds, finding that ERα is primarily regulated by the matrix stiffness. Matrix stiffness upregulates the ERα signaling via stress-mediated p38 activation and H3K27me3-mediated epigenetic regulation.

The finding that the matrix stiffness is a central cue to the ERα phenotype reveals a mechanobiological component in breast tissue hormonal signaling and enables the development of novel therapeutic interventions. Subject terms: ER-positive (ER+), breast cancer, ex vivo model, preclinical model, PDEC, stiffness, p38 SAPK.

Davide G Berta, Heli Kuisma, Niko Välimäki, Maritta Räisänen, Maija Jäntti, Annukka Pasanen, Auli Karhu, Jaana Kaukomaa, Aurora Taira, Tatiana Cajuso, Sanna Nieminen, Rosa-Maria Penttinen, Saija Ahonen, Rainer Lehtonen, Miika Mehine, Pia Vahteristo, Jyrki Jalkanen, Biswajyoti Sahu, Janne Ravantti, Netta Mäkinen, Kristiina Rajamäki, Kimmo Palin, Jussi Taipale, Oskari Heikinheimo, Ralf Bützow, Eevi Kaasinen, Lauri A Aaltonen


One in four women suffers from uterine leiomyomas (ULs), benign tumours of the uterine wall also known as uterine fibroids, at some point in premenopausal life. ULs can cause excessive bleeding, pain and infertility, and are a common cause of hysterectomy. They emerge through at least three distinct genetic drivers: mutations in MED12 or FH, or genomic rearrangement of HMGA2.

Here we created genome-wide datasets, using DNA, RNA, assay for transposase-accessible chromatin (ATAC), chromatin immunoprecipitation (ChIP) and HiC chromatin immunoprecipitation (HiChIP) sequencing of primary tissues to profoundly understand the genesis of UL.

We identified somatic mutations in genes encoding six members of the SRCAP histone-loading complex, and found that germline mutations in the SRCAP members YEATS4 and ZNHIT1 predispose women to UL. Tumours bearing these mutations showed defective deposition of the histone variant H2A.Z.

In ULs, H2A.Z occupancy correlated positively with chromatin accessibility and gene expression, and negatively with DNA methylation, but these correlations were weak in tumours bearing SRCAP complex mutations. In these tumours, open chromatin emerged at transcription start sites where H2A.Z was lost, which was associated with upregulation of genes.

Furthermore, YEATS4 defects were associated with abnormal upregulation of bivalent embryonic stem cell genes, as previously shown in mice. Our work describes a potential mechanism of tumorigenesis (epigenetic instability caused by deficient H2A.Z deposition) and suggests that ULs arise through an aberrant differentiation program driven by deranged chromatin, emanating from a small number of mutually exclusive driver mutations.

Biswajyoti Sahu, Päivi Pihlajamaa, Kaiyang Zhang, Kimmo Palin, Saija Ahonen, Alejandra Cervera, Ari Ristimäki, Lauri A. Aaltonen, Sampsa Hautaniemi & Jussi Taipale


Cancer is the most complex genetic disease known, with mutations implicated in more than 250 genes. However, it is still elusive which specific mutations found in human patients lead to tumorigenesis.

Here we show that a combination of oncogenes that is characteristic of liver cancer (CTNNB1, TERT, MYC) induces senescence in human fibroblasts and primary hepatocytes. However, reprogramming fibroblasts to a liver progenitor fate, induced hepatocytes (iHeps), makes them sensitive to transformation by the same oncogenes. The transformed iHeps are highly proliferative, tumorigenic in nude mice, and bear gene expression signatures of liver cancer.

These results show that tumorigenesis is triggered by a combination of three elements: the set of driver mutations, the cellular lineage, and the state of differentiation of the cells along the lineage. Our results provide direct support for the role of cell identity as a key determinant in transformation and establish a paradigm for studying the dynamic role of oncogenic drivers in human tumorigenesis.


Helka Göös, Christopher L Fogarty, Biswajyoti Sahu, Vincent Plagnol, Kristiina Rajamäki, Katariina Nurmi, Xiaonan Liu, Elisabet Einarsdottir, Annukka Jouppila, Tom Pettersson, Helena Vihinen, Kaarel Krjutskov, Päivi Saavalainen, Asko Järvinen, Mari Muurinen, Dario Greco, Giovanni Scala, James Curtis, Dan Nordström, Robert Flaumenhaft, Outi Vaarala, Panu E Kovanen, Salla Keskitalo, Annamari Ranki, Juha Kere, Markku Lehto, Luigi D Notarangelo, Sergey Nejentsev, Kari K Eklund, Markku Varjosalo, Jussi Taipale, Mikko R J Seppänen


Background: CCAAT enhancer-binding protein epsilon (C/EBPε) is a transcription factor involved in late myeloid lineage differentiation and cellular function. The only previously known disorder linked to C/EBPε is autosomal recessive neutrophil-specific granule deficiency leading to severely impaired neutrophil function and early mortality.

Objective: The aim of this study was to molecularly characterize the effects of C/EBPε transcription factor Arg219His mutation identified in a Finnish family with previously genetically uncharacterized autoinflammatory and immunodeficiency syndrome.

Methods: Genetic analysis, proteomics, genome-wide transcriptional profiling by means of RNA-sequencing, chromatin immunoprecipitation (ChIP) sequencing, and assessment of the inflammasome function of primary macrophages were performed.

Results: Studies revealed a novel mechanism of genome-wide gain-of-function that dysregulated transcription of 464 genes. Mechanisms involved dysregulated noncanonical inflammasome activation caused by decreased association with transcriptional repressors, leading to increased chromatin occupancy and considerable changes in transcriptional activity, including increased expression of NLR family, pyrin domain-containing 3 protein (NLRP3) and constitutively expressed caspase-5 in macrophages.

Conclusion: We describe a novel autoinflammatory disease with defective neutrophil function caused by a homozygous Arg219His mutation in the transcription factor C/EBPε. Mutated C/EBPε acts as a regulator of both the inflammasome and interferome, and the Arg219His mutation causes the first human monogenic neomorphic and noncanonical inflammasomopathy/immunodeficiency. The mechanism, including widely dysregulated transcription, is likely not unique for C/EBPε. Similar multiomics approaches should also be used in studying other transcription factor-associated diseases.


Fangjie Zhu, Lucas Farnung, Eevi Kaasinen, Biswajyoti Sahu, Yimeng Yin, Bei Wei, Svetlana O Dodonova, Kazuhiro R Nitta, Ekaterina Morgunova, Minna Taipale, Patrick Cramer, Jussi Taipale


Nucleosomes cover most of the genome and are thought to be displaced by transcription factors in regions that direct gene expression. However, the modes of interaction between transcription factors and nucleosomal DNA remain largely unknown.

Here we systematically explore interactions between the nucleosome and 220 transcription factors representing diverse structural families. Consistent with earlier observations, we find that the majority of the studied transcription factors have less access to nucleosomal DNA than to free DNA.

The motifs recovered from transcription factors bound to nucleosomal and free DNA are generally similar. However, steric hindrance and scaffolding by the nucleosome result in specific positioning and orientation of the motifs. Many transcription factors preferentially bind close to the end of nucleosomal DNA, or to periodic positions on the solvent-exposed side of the DNA.

In addition, several transcription factors usually bind to nucleosomal DNA in a particular orientation. Some transcription factors specifically interact with DNA located at the dyad position at which only one DNA gyre is wound, whereas other transcription factors prefer sites spanning two DNA gyres and bind specifically to each of them.

Our work reveals notable differences in the binding of transcription factors to free and nucleosomal DNA, and uncovers a diverse interaction landscape between transcription factors and the nucleosome.

Kimmo Palin, Esa Pitkänen, Mikko Turunen, Biswajyoti Sahu, Päivi Pihlajamaa, Teemu Kivioja, Eevi Kaasinen, Niko Välimäki, Ulrika A Hänninen, Tatiana Cajuso, Mervi Aavikko, Sari Tuupanen, Outi Kilpivaara, Linda van den Berg, Johanna Kondelin, Tomas Tanskanen, Riku Katainen, Marta Grau, Heli Rauanheimo, Roosa-Maria Plaketti, Aurora Taira, Päivi Sulo, Tuomo Hartonen, Kashyap Dave, Bernhard Schmierer, Sandeep Botla, Maria Sokolova, Anna Vähärautio, Kornelia Gladysz, Halit Ongen, Emmanouil Dermitzakis, Jesper Bertram Bramsen, Torben Falck Ørntoft, Claus Lindbjerg Andersen, Ari Ristimäki, Anna Lepistö, Laura Renkonen-Sinisalo, Jukka-Pekka Mecklin, Jussi Taipale, Lauri A Aaltonen


Point mutations in cancer have been extensively studied but chromosomal gains and losses have been more challenging to interpret due to their unspecific nature. Here we examine high-resolution allelic imbalance (AI) landscape in 1699 colorectal cancers, 256 of which have been whole-genome sequenced (WGSed).

The imbalances pinpoint 38 genes as plausible AI targets based on previous knowledge. Unbiased CRISPR-Cas9 knockout and activation screens identified in total 79 genes within AI peaks regulating cell growth.

Genetic and functional data implicate loss of TP53 as a sufficient driver of AI. The WGS highlights an influence of copy number aberrations on the rate of detected somatic point mutations.

Importantly, the data reveal several associations between AI target genes, suggesting a role for a network of lineage-determining transcription factors in colorectal tumorigenesis. Overall, the results unravel the contribution of AI in colorectal cancer and provide a plausible explanation why so few genes are commonly affected by point mutations in cancers.

Bei Wei, Arttu Jolma, Biswajyoti Sahu, Lukas M Orre, Fan Zhong, Fangjie Zhu, Teemu Kivioja, Inderpreet Sur, Janne Lehtiö, Minna Taipale, Jussi Taipale


No existing method to characterize transcription factor (TF) binding to DNA allows genome-wide measurement of all TF-binding activity in cells. Here we present a massively parallel protein activity assay, active TF identification (ATI), that measures the DNA-binding activity of all TFs in cell or tissue extracts.

ATI is based on electrophoretic separation of protein-bound DNA sequences from a highly complex DNA library and subsequent mass-spectrometric identification of the DNA-bound proteins. We applied ATI to four mouse tissues and mouse embryonic stem cells and found that, in a given tissue or cell type, a small set of TFs, which bound to only ∼10 distinct motifs, displayed strong DNA-binding activity.

Some of these TFs were found in all cell types, whereas others were specific TFs known to determine cell fate in the analyzed tissue or cell type. We also show that a small number of TFs determined the accessible chromatin landscape of a cell, suggesting that gene regulatory logic may be simpler than previously appreciated.


Kristian M Silander, Päivi Pihlajamaa, Biswajyoti Sahu, Olli A Jänne, Leif C Andersson


We have investigated and characterized a novel ornithine decarboxylase (ODC) related protein (ODCrp) also annotated as gm853. ODCrp shows 41% amino acid sequence identity with ODC and 38% with ODC antizyme inhibitor 1 (AZIN1).

The Odcrp gene is selectively expressed in the epithelium of proximal tubuli of mouse kidney with higher expression in males than in females. Like Odc in mouse kidney, Odcrp is also androgen responsive with androgen receptor (AR)-binding loci within its regulatory region.

ODCrp forms homodimers but does not heterodimerize with ODC. Although ODCrp contains 20 amino acid residues known to be necessary for the catalytic activity of ODC, no decarboxylase activity could be found with ornithine, lysine or arginine as substrates.

ODCrp does not function as an AZIN, as it neither binds ODC antizyme 1 (OAZ1) nor prevents OAZ-mediated inactivation and degradation of ODC. ODCrp itself is degraded via ubiquitination, and mutation of Cys363 (corresponding to Cys360 of ODC) appears to destabilize the protein. Evidence for a function of ODCrp was found in ODC assays on lysates from transfected Cos-7 cells, where ODCrp repressed the activity of endogenous ODC while Cys363Ala mutated ODCrp increased the enzymatic activity of endogenous ODC.

Yimeng Yin, Ekaterina Morgunova, Arttu Jolma, Eevi Kaasinen, Biswajyoti Sahu, Syed Khund-Sayeed, Pratyush K Das, Teemu Kivioja, Kashyap Dave, Fan Zhong, Kazuhiro R Nitta, Minna Taipale, Alexander Popov, Paul A Ginno, Silvia Domcke, Jian Yan, Dirk Schübeler, Charles Vinson, Jussi Taipale


The majority of CpG dinucleotides in the human genome are methylated at cytosine bases. However, active gene regulatory elements are generally hypomethylated relative to their flanking regions, and the binding of some transcription factors (TFs) is diminished by methylation of their target sequences.

By analysis of 542 human TFs with methylation-sensitive SELEX (systematic evolution of ligands by exponential enrichment), we found that there are also many TFs that prefer CpG-methylated sequences. Most of these are in the extended homeodomain family.

Structural analysis showed that homeodomain specificity for methylcytosine depends on direct hydrophobic interactions with the methylcytosine 5-methyl group.

This study provides a systematic examination of the effect of an epigenetic DNA modification on human TF binding specificity and reveals that many developmentally important proteins display preference for mCpG-containing sequences.

Meri Kaustio, Emma Haapaniemi, Helka Göös, Timo Hautala, Giljun Park, Jaana Syrjänen, Elisabet Einarsdottir, Biswajyoti Sahu, Sanna Kilpinen, Samuli Rounioja, Christopher L Fogarty, Virpi Glumoff, Petri Kulmala, Shintaro Katayama, Fitsum Tamene, Luca Trotta, Ekaterina Morgunova, Kaarel Krjutškov, Katariina Nurmi, Kari Eklund, Anssi Lagerstedt, Merja Helminen, Timi Martelius, Satu Mustjoki, Jussi Taipale, Janna Saarela, Juha Kere, Markku Varjosalo, Mikko Seppänen


Background: The nuclear factor κ light-chain enhancer of activated B cells (NF-κB) signaling pathway is a key regulator of immune responses. Accordingly, mutations in several NF-κB pathway genes cause immunodeficiency.

Objective: We sought to identify the cause of disease in 3 unrelated Finnish kindreds with variable symptoms of immunodeficiency and autoinflammation.

Methods: We applied genetic linkage analysis and next-generation sequencing and functional analyses of NFKB1 and its mutated alleles.

Results: In all affected subjects we detected novel heterozygous variants in NFKB1, encoding for p50/p105. Symptoms in variant carriers differed depending on the mutation. Patients harboring a p.I553M variant presented with antibody deficiency, infection susceptibility, and multiorgan autoimmunity. Patients with a p.H67R substitution had antibody deficiency and experienced autoinflammatory episodes, including aphthae, gastrointestinal disease, febrile attacks, and small-vessel vasculitis characteristic of Behçet disease. Patients with a p.R157X stop-gain experienced hyperinflammatory responses to surgery and showed enhanced inflammasome activation.

In functional analyses the p.R157X variant caused proteasome-dependent degradation of both the truncated and wild-type proteins, leading to a dramatic loss of p50/p105. The p.H67R variant reduced nuclear entry of p50 and showed decreased transcriptional activity in luciferase reporter assays. The p.I553M mutation in turn showed no change in p50 function but exhibited reduced p105 phosphorylation and stability. Affinity purification mass spectrometry also demonstrated that both missense variants led to altered protein-protein interactions.

Conclusion: Our findings broaden the scope of phenotypes caused by mutations in NFKB1 and suggest that a subset of autoinflammatory diseases, such as Behçet disease, can be caused by rare monogenic variants in genes of the NF-κB pathway.


Tuomo Hartonen, Biswajyoti Sahu, Kashyap Dave, Teemu Kivioja, Jussi Taipale


Motivation: Transcription factor (TF) binding can be studied accurately in vivo with ChIP-exo and ChIP-Nexus experiments. Only a fraction of TF binding mechanisms are yet fully understood and accurate knowledge of binding locations and patterns of TFs is key to understanding binding that is not explained by simple positional weight matrix models.

ChIP-exo/Nexus experiments can also offer insight into the effect of single nucleotide polymorphisms (SNPs) at TF binding sites on expression of the target genes. This is an important mechanism of action for disease-causing SNPs at non-coding genomic regions.

Results: We describe a peak caller PeakXus that is specifically designed to leverage the increased resolution of ChIP-exo/Nexus and developed with the aim of making as few assumptions of the data as possible to allow discoveries of novel binding patterns. We apply PeakXus to ChIP-Nexus and ChIP-exo experiments performed both in Homo sapiens and in Drosophila melanogaster cell lines.

We show that PeakXus consistently finds more peaks overlapping with a TF-specific recognition sequence than published methods. As an application example we demonstrate how PeakXus can be coupled with unique molecular identifiers (UMIs) to measure the effect of a SNP overlapping with a TF binding site on the in vivo binding of the TF.

Availability and implementation: Source code of PeakXus is available at


Päivi Pihlajamaa, Biswajyoti Sahu, Olli A Jänne


The physiological androgens testosterone and 5α-dihydrotestosterone regulate the development and maintenance of primary and secondary male sexual characteristics through binding to the androgen receptor (AR), a ligand-dependent transcription factor. In addition, a number of nonreproductive tissues of both genders are subject to androgen regulation. AR is also a central target in the treatment of prostate cancer.

A large number of studies over the last decade have characterized many regulatory aspects of the AR pathway, such as androgen-dependent transcription programs, AR cistromes, and coregulatory proteins, mostly in cultured cells of prostate cancer origin.

Moreover, recent work has revealed the presence of pioneer/licensing factors and chromatin modifications that are important to guide receptor recruitment onto appropriate chromatin loci in cell lines and in tissues under physiological conditions. Despite these advances, current knowledge related to the mechanisms responsible for receptor- and tissue-specific actions of androgens is still relatively limited.

Here, we review topics that pertain to these specificity issues at different levels, both in cultured cells and tissues in vivo, with a particular emphasis on the nature of the steroid, the response element sequence, the AR cistromes, pioneer/licensing factors, and coregulatory proteins.

We conclude that liganded AR and its DNA-response elements are required but are not sufficient for establishment of tissue-specific transcription programs in vivo, and that AR-selective actions over other steroid receptors rely on relaxed rather than increased stringency of cis-elements on chromatin.

Marjo Malinen, Sari Toropainen, Tiina Jääskeläinen, Biswajyoti Sahu, Olli A Jänne, Jorma J Palvimo


We have analyzed androgen receptor (AR) chromatin binding sites (ARBs) and androgen-regulated transcriptome in estrogen receptor negative molecular apocrine breast cancer cells.

These analyses revealed that 42% of ARBs and 39% of androgen-regulated transcripts in MDA-MB453 cells have counterparts in VCaP prostate cancer cells.

Pathway analyses showed a similar enrichment of molecular and cellular functions among AR targets in both breast and prostate cancer cells, with cellular growth and proliferation being among the most enriched functions.

Silencing of the coregulator SUMO ligase PIAS1 in MDA-MB453 cells influenced AR function in a target-selective fashion. An anti-apoptotic effect of the silencing suggests involvement of PIAS1 in the regulation of cell death and survival pathways.

In sum, apocrine breast cancer and prostate cancer cells share a core AR cistrome and target gene signature linked to cancer cell growth, and PIAS1 plays a similar coregulatory role for AR in both cancer cell types.

Henna Heinonen, Tatiana Lepikhova, Biswajyoti Sahu, Henna Pehkonen, Päivi Pihlajamaa, Riku Louhimo, Ping Gao, Gong-Hong Wei, Sampsa Hautaniemi, Olli A Jänne, Outi Monni


HOXB7 encodes a transcription factor that is overexpressed in a number of cancers and encompasses many oncogenic functions. Previous results have shown it to promote cell proliferation, angiogenesis, epithelial-mesenchymal transition, DNA repair and cell survival.

Because of its role in many cancers and tumorigenic processes, HOXB7 has been suggested to be a potential drug target. However, HOXB7 binding sites on chromatin and its targets are poorly known.

The aim of our study was to identify HOXB7 binding sites on breast cancer cell chromatin and to delineate direct target genes located near these binding sites. We found 1,504 HOXB7 chromatin binding sites in the BT-474 breast cancer cell line, which overexpresses HOXB7. Seventeen selected binding sites were validated by ChIP-qPCR in several breast cancer cell lines.

Furthermore, we analyzed expression of a large number of genes located near HOXB7 binding sites and found several new direct targets, such as CTNND2 and SCGB1D2. Identification of HOXB7 chromatin binding sites and target genes is essential to better understand the role of HOXB7 in breast cancer and the mechanisms by which it regulates tumorigenic processes.


Sari Toropainen, Marjo Malinen, Sanna Kaikkonen, Miia Rytinki, Tiina Jääskeläinen, Biswajyoti Sahu, Olli A Jänne, Jorma J Palvimo


Androgen receptor (AR) is a ligand-activated transcription factor that plays a central role in the development and growth of prostate carcinoma. PIAS1 is an AR- and SUMO-interacting protein and a putative transcriptional coregulator overexpressed in prostate cancer.

To study the importance of PIAS1 for the androgen-regulated transcriptome of VCaP prostate cancer cells, we silenced its expression by RNAi. Transcriptome analyses revealed that a subset of the AR-regulated genes is significantly influenced, either activated or repressed, by PIAS1 depletion. Interestingly, PIAS1 depletion also exposed a new set of genes to androgen regulation, suggesting that PIAS1 can mask distinct genomic loci from AR access.

In keeping with gene expression data, silencing of PIAS1 attenuated VCaP cell proliferation. ChIP-seq analyses showed that PIAS1 interacts with AR at chromatin sites that also harbor SUMO2/3 and are surrounded by H3K4me2; androgen exposure increased the number of PIAS1-occupied sites, resulting in nearly complete overlap with AR chromatin binding events. PIAS1 also interacted with the pioneer factor FOXA1. Of note, PIAS1 depletion affected AR chromatin occupancy at binding sites enriched for HOXD13 and GATA motifs.

Taken together, PIAS1 is a genuine chromatin-bound AR coregulator that functions in a target gene selective fashion to regulate prostate cancer cell growth.

Päivi Pihlajamaa, Biswajyoti Sahu, Lauri Lyly, Viljami Aittomäki, Sampsa Hautaniemi, Olli A Jänne


Androgen receptor (AR) binds male sex steroids and mediates physiological androgen actions in target tissues. ChIP-seq analyses of AR-binding events in murine prostate, kidney and epididymis show that in vivo AR cistromes and their respective androgen-dependent transcription programs are highly tissue specific, mediating distinct biological pathways.

This high order of tissue specificity is achieved by the use of exclusive collaborating factors in the three androgen-responsive tissues. We find two novel collaborating factors for AR signaling in vivo, Hnf4α (hepatocyte nuclear factor 4α) in mouse kidney and AP-2α (activating enhancer binding protein 2α) in mouse epididymis, that define tissue-specific AR recruitment.

In mouse prostate, FoxA1 serves the same purpose. FoxA1, Hnf4α and AP-2α motifs are over-represented within unique AR-binding loci, and the cistromes of these factors show substantial overlap with AR-binding events distinct to each tissue type. These licensing or pioneering factors are constitutively bound to chromatin and guide AR to specific genomic loci upon hormone exposure.

Collectively, liganded receptor and its DNA-response elements are required but not sufficient for establishment of tissue-specific transcription programs.

Biswajyoti Sahu, Päivi Pihlajamaa, Vanessa Dubois, Stefanie Kerkhofs, Frank Claessens, Olli A Jänne


The DNA-binding domains (DBDs) of class I steroid receptors (androgen, glucocorticoid, progesterone and mineralocorticoid receptors) recognize a similar cis-element, an inverted repeat of 5′-AGAACA-3′ with a 3-nt spacer. However, these receptors regulate transcription programs that are largely receptor-specific.

To address the role of the DBD in and of itself in ensuring specificity of androgen receptor (AR) binding to chromatin in vivo, we used SPARKI knock-in mice whose AR DBD has the second zinc finger replaced by that of the glucocorticoid receptor.

Comparison of AR-binding events in epididymides and prostates of wild-type (wt) and SPARKI mice revealed that AR achieves selective chromatin binding through a less stringent sequence requirement for the 3′-hexamer. In particular, a T at position 12 in the second hexamer is dispensable for wt AR but mandatory for SPARKI AR binding, and only a G at position 11 is highly conserved among wt AR-preferred response elements.
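The position numbering used above can be made concrete with a small sketch. It assumes the 15-nt element is numbered 1 to 15 from the 5′ end (first hexamer at positions 1-6, spacer at 7-9, second hexamer at 10-15), so that the inverted repeat of 5′-AGAACA-3′ gives 5′-TGTTCT-3′ as the second hexamer, with G at position 11 and T at position 12. The function name and the example sequences are hypothetical; this is an illustration of the numbering, not the analysis used in the paper.

```python
# Illustrative only: build the consensus inverted repeat described above and
# inspect positions 11 and 12 of a 15-nt candidate response element.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


HALF_SITE = "AGAACA"                            # first hexamer, positions 1-6
SPACER_LEN = 3                                  # positions 7-9
SECOND_HEXAMER = reverse_complement(HALF_SITE)  # "TGTTCT", positions 10-15


def describe_site(site: str) -> dict:
    """Summarize the second hexamer of a 15-nt candidate element.

    Positions are 1-based in the text, so position 11 is index 10
    and position 12 is index 11.
    """
    site = site.upper()
    if len(site) != 2 * len(HALF_SITE) + SPACER_LEN:
        raise ValueError("expected a 15-nt element (6 + 3 + 6)")
    second = site[9:15]
    return {
        "second_hexamer": second,
        "g_at_11": site[10] == "G",
        "t_at_12": site[11] == "T",
        "second_hexamer_mismatches": sum(
            a != b for a, b in zip(second, SECOND_HEXAMER)
        ),
    }


if __name__ == "__main__":
    # Hypothetical sequences: a perfect inverted repeat, and a variant that
    # keeps G at position 11 but lacks T at position 12.
    print(describe_site("AGAACAnnnTGTTCT"))
    print(describe_site("AGAACATTTTGATCT"))
```

Under the finding reported above, the second example (G at 11 retained, T at 12 lost) is the kind of relaxed element that wt AR tolerates but SPARKI AR does not.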

Genome-wide AR-binding events agree with the respective transcriptome profiles, in that attenuated AR binding in SPARKI mouse epididymis correlates with blunted androgen response in vivo. Collectively, AR-selective actions in vivo rely on relaxed rather than increased stringency of cis-elements on chromatin. These elements are, in turn, poorly recognized by other class I steroid receptors.


Biswajyoti Sahu, Marko Laakso, Päivi Pihlajamaa, Kristian Ovaska, Ievgenii Sinielnikov, Sampsa Hautaniemi, Olli A Jänne


The forkhead protein FoxA1 has functions beyond that of a pioneer factor, in that its depletion brings about a significant redistribution in the androgen receptor (AR) and glucocorticoid receptor (GR) cistromes.

In this study, we found a novel function for FoxA1 in defining the cell-type specificity of AR- and GR-binding events in a distinct fashion, namely, for AR in LNCaP-1F5 cells and for GR in VCaP cells. We also found different, cell-type and receptor-specific compilations of cis-elements enriched adjacent to the AR- and GR-binding sites.

The AR pathway is central in prostate cancer biology, but the role of GR is poorly known. We find that AR and GR cistromes and transcription programs exhibit significant overlap, and GR regulates a large number of genes considered to be AR pathway-specific. This raises questions about the role of GR in maintaining the AR pathway under androgen-deprived conditions in castration-resistant prostate cancer patients.

However, in the presence of androgen, ligand-occupied GR acts as a partial antiandrogen and attenuates the AR-dependent transcription program.

Kristian Ovaska, Lauri Lyly, Biswajyoti Sahu, Olli A Jänne, Sampsa Hautaniemi


Computational analysis of data produced in deep sequencing (DS) experiments is challenging due to large data volumes and requirements for flexible analysis approaches. Here, we present a mathematical formalism based on set algebra for frequently performed operations in DS data analysis to facilitate translation of biomedical research questions to language amenable for computational analysis.

With the help of this formalism, we implemented the Genomic Region Operation Kit (GROK), which supports various DS-related operations such as preprocessing, filtering, file conversion, and sample comparison.

GROK provides high-level interfaces for R, Python, Lua, and command line, as well as an extension C++ API. It supports major genomic file formats and allows storing custom genomic regions in efficient data structures such as red-black trees and SQL databases.

To demonstrate the utility of GROK, we have characterized the roles of two major transcription factors (TFs) in prostate cancer using data from 10 DS experiments.
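The set-algebra view of genomic regions described in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration only and does not use or reproduce GROK's actual API; the function names, the half-open coordinate convention and the example coordinates are all assumptions made for the sketch.

```python
# Illustrative only: union, intersection and difference of genomic regions,
# treating each sample as a set of intervals. Not GROK's API.
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start, end), half-open [start, end)


def normalize(regions: List[Region]) -> List[Region]:
    """Sort regions and merge those that overlap or touch on a chromosome."""
    merged: List[Region] = []
    for chrom, start, end in sorted(regions):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def union(a: List[Region], b: List[Region]) -> List[Region]:
    """Regions covered by a or b."""
    return normalize(list(a) + list(b))


def intersection(a: List[Region], b: List[Region]) -> List[Region]:
    """Regions covered by both a and b."""
    a, b = normalize(a), normalize(b)
    out: List[Region] = []
    i = j = 0
    while i < len(a) and j < len(b):
        ca, sa, ea = a[i]
        cb, sb, eb = b[j]
        if ca == cb:
            start, end = max(sa, sb), min(ea, eb)
            if start < end:
                out.append((ca, start, end))
        # Advance whichever region ends first in (chromosome, end) order.
        if (ca, ea) <= (cb, eb):
            i += 1
        else:
            j += 1
    return out


def difference(a: List[Region], b: List[Region]) -> List[Region]:
    """Regions covered by a but not by b."""
    b = normalize(b)
    out: List[Region] = []
    for chrom, start, end in normalize(a):
        cursor = start
        for cb, sb, eb in b:
            if cb != chrom or eb <= cursor or sb >= end:
                continue
            if sb > cursor:
                out.append((chrom, cursor, sb))
            cursor = max(cursor, eb)
        if cursor < end:
            out.append((chrom, cursor, end))
    return out


if __name__ == "__main__":
    # Hypothetical peak coordinates for two transcription factors.
    tf_a = [("chr1", 100, 200), ("chr1", 150, 300), ("chr2", 50, 80)]
    tf_b = [("chr1", 250, 400), ("chr2", 10, 60)]
    print("shared:", intersection(tf_a, tf_b))
    print("only A:", difference(tf_a, tf_b))
    print("either:", union(tf_a, tf_b))
```

Expressing a question such as "which sites are bound by one factor but not the other" as a difference of two region sets is the kind of translation the formalism is meant to support; a production tool would use indexed data structures rather than the simple scans shown here.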

Click here for free access to GROK and a related user guide.


S E Jalava, A Urbanucci, L Latonen, K K Waltering, B Sahu, O A Jänne, J Seppälä, H Lähdesmäki, T L J Tammela, T Visakorpi


The androgen receptor (AR) signaling pathway is involved in the emergence of castration-resistant prostate cancer (CRPC). Here, we identified several androgen-regulated microRNAs (miRNAs) that may contribute to the development of CRPC. Seven miRNAs, miR-21, miR-32, miR-99a, miR-99b, miR-148a, miR-221 and miR-590-5p, were found to be differentially expressed in CRPC compared with benign prostate hyperplasia (BPH) according to microarray analyses.

Significant growth advantage for LNCaP cells transfected with pre-miR-32 and pre-miR-148a was found. miR-32 was demonstrated to reduce apoptosis, whereas miR-148a enhanced proliferation. Androgen regulation of miR-32 and miR-148a was confirmed by androgen stimulation of the LNCaP cells followed by expression analyses. The AR-binding sites in proximity of these miRNAs were demonstrated with chromatin immunoprecipitation (ChIP).

To identify target genes for the miRNAs, mRNA microarray analyses were performed with LNCaP cells transfected with pre-miR-32 and pre-miR-148a. Expression of BTG2 and PIK3IP1 was reduced in the cells transfected with pre-miR-32 and pre-miR-148a, respectively. Protein expression was also reduced according to western blot analysis. BTG2 and PIK3IP1 were confirmed to be targets by 3′UTR-luciferase assays.

Finally, immunostainings showed a statistically significant (P<0.0001) reduction of BTG2 protein in CRPCs compared with untreated prostate cancer (PC). The lack of BTG2 staining was also associated (P<0.01) with a short progression-free time in patients who underwent prostatectomy. In conclusion, androgen-regulated miR-32 is overexpressed in CRPC, leading to reduced expression of BTG2. Thus, miR-32 is a potential marker for aggressive disease and is a putative drug target in PC.


Biswajyoti Sahu, Marko Laakso, Kristian Ovaska, Tuomas Mirtti, Johan Lundin, Antti Rannikko, Anna Sankila, Juha-Pekka Turunen, Mikael Lundin, Juho Konsti, Tiina Vesterinen, Stig Nordling, Olli Kallioniemi, Sampsa Hautaniemi, Olli A Jänne


High androgen receptor (AR) level in primary tumour predicts increased prostate cancer-specific mortality. However, the mechanisms that regulate AR function in prostate cancer are poorly known.

We report here a new paradigm for the forkhead protein FoxA1 action in androgen signalling. FoxA1 not only pioneers the AR pathway; its depletion also elicited extensive redistribution of AR-binding sites (ARBs) on LNCaP-1F5 cell chromatin that was commensurate with changes in the androgen-dependent gene expression signature.

We identified three distinct classes of ARBs and androgen-responsive genes: (i) independent of FoxA1, (ii) pioneered by FoxA1 and (iii) masked by FoxA1 and functional upon FoxA1 depletion. FoxA1 depletion also reprogrammed AR binding in VCaP cells, and glucocorticoid receptor binding and glucocorticoid-dependent signalling in LNCaP-1F5 cells.

Importantly, FoxA1 protein level in primary prostate tumour was significantly associated with disease outcome; high FoxA1 level was associated with poor prognosis, whereas low FoxA1 level, even in the presence of high AR expression, predicted good prognosis. The role of FoxA1 in androgen signalling and prostate cancer is distinctly different from that in oestrogen signalling and breast cancer.

A Urbanucci, B Sahu, J Seppälä, A Larjo, L M Latonen, K K Waltering, T L J Tammela, R L Vessella, H Lähdesmäki, O A Jänne, T Visakorpi


Androgen receptor (AR) is overexpressed in the majority of castration-resistant prostate cancers (CRPCs). Our goal was to study the effect of AR overexpression on the chromatin binding of the receptor and to identify AR target genes that may be important in the emergence of CRPC.

We have established two sublines of the LNCaP prostate cancer (PC) cell line, one overexpressing AR 2- to 3-fold and the other 4- to 5-fold compared with the control cells. We used chromatin immunoprecipitation (ChIP) and deep-sequencing (seq) to identify AR-binding sites (ARBSs).

We found that the number of ARBSs and the AR-binding strength were positively associated with the level of AR when cells were stimulated with low concentrations of androgens. In cells overexpressing AR, the chromatin binding of the receptor took place at a 100-fold lower concentration of the ligand than in control cells.

We confirmed the association of AR level and chromatin binding in two PC xenografts, one containing AR gene amplification with high AR expression, and the other with low expression.

By combining the ChIP-seq and expression profiling, we identified AR target genes that are upregulated in PC. Of them, the expression of ZWINT, SKP2 (S-phase kinase-associated protein 2 (p45)) and FEN1 (flap structure-specific endonuclease 1) was demonstrated to be increased in CRPC, while the expression of SNAI2 was decreased in both PC and CRPC. FEN1 protein expression was also associated with poor prognosis in prostatectomy-treated patients.

Finally, the knock-down of FEN1 with small interfering RNA inhibited the growth of LNCaP cells. Our data demonstrate that the overexpression of AR sensitizes the receptor binding to chromatin, thus explaining how the AR signaling pathway is reactivated in CRPC cells.

Eui-Ju Hong, Biswajyoti Sahu, Olli A Jänne, Geoffrey L Hammond


Human sex hormone-binding globulin (SHBG) accumulates within the cytoplasm of epithelial cells lining the proximal convoluted tubules of mice expressing human SHBG transgenes. The main ligands of SHBG, testosterone and its metabolite, 5α-dihydrotestosterone (DHT), alter expression of androgen-responsive genes in the kidney.

To determine how intracellular SHBG might influence androgen action, we used a mouse proximal convoluted tubule (PCT) cell line with characteristics of S1/S2 epithelial cells in which human SHBG accumulates. Western blotting revealed that SHBG extracted from PCT cells expressing a human SHBG cDNA (PCT-SHBG) is 5-8 kDa smaller than the SHBG secreted by these cells, due to incomplete N-glycosylation and absence of O-linked oligosaccharides.

PCT-SHBG cells sequester [³H]DHT more effectively from culture medium than parental PCT cells, and the presence of SHBG accentuates androgen-dependent activation of a luciferase reporter gene, as well as the endogenous kidney androgen-regulated protein (Kap) gene.

After androgen withdrawal, androgen-induced Kap mRNA levels in PCT-SHBG cells are maintained for more than 2 wk vs 2 d in parental PCT cells. Transcriptome profiling after testosterone or DHT pretreatments, followed by 3 d of steroid withdrawal, also demonstrated that intracellular SHBG enhances androgen-dependent stimulation (e.g. Adh7, Vcam1, Areg, Tnfaip2) or repression (e.g. Cldn2 and Osr2) of many other genes in PCT cells.

In addition, nuclear localization of the androgen receptor is enhanced and retained longer after steroid withdrawal in PCT cells containing functional SHBG. Thus, intracellular SHBG accentuates the uptake of androgens and sustains androgen access to the androgen receptor, especially under conditions of limited androgen supply.


Laura Mikkonen, Päivi Pihlajamaa, Biswajyoti Sahu, Fu-Ping Zhang, Olli A Jänne


The androgen receptor (AR) mediates the effects of male sex steroids. There are major sex differences in lung development and pathologies, including lung cancer. In this report, we show that Ar is mainly expressed in type II pneumocytes and the bronchial epithelium of murine lung and that androgen treatment increases AR protein levels in lung cells.

Androgen administration significantly altered murine lung gene expression profiles; for example, by up-regulating transcripts involved in oxygen transport and down-regulating those in DNA repair and DNA recombination. Androgen exposure also affected the gene expression profile in a human lung adenocarcinoma-derived cell line, A549, by significantly up- or down-regulating some 200 transcripts, including down-regulation of genes involved in cell respiration.

Dexamethasone treatment of A549 cells augmented expression of transcript sets that overlapped in part with those up-regulated by androgen in these cells. Moreover, a human lung cancer tissue array revealed that different lung cancer types are all AR-positive. Our results indicate that adult lung is an AR target tissue and suggest that AR plays a role in lung cancer biology.


Kati K Waltering, Merja A Helenius, Biswajyoti Sahu, Visa Manni, Marika J Linja, Olli A Jänne, Tapio Visakorpi


Androgen receptor (AR) is known to be overexpressed in castration-resistant prostate cancer. To interrogate the functional significance of the AR level, we established two LNCaP cell sublines expressing in a stable fashion two to four times (LNCaP-ARmo) and four to six times (LNCaP-ARhi) higher level of AR than the parental cell line expressing the empty vector (LNCaP-pcDNA3.1).

The LNCaP-ARhi cell line grew faster than the control line in low concentrations of 5α-dihydrotestosterone (DHT), especially at 1 nmol/L. Microarray-based transcript profiling and subsequent unsupervised hierarchical clustering showed that LNCaP-ARhi cells clustered together with VCaP cells, which contain endogenous AR gene amplification and overexpression, indicating the central role of AR in the overall regulation of gene expression in prostate cancer cells.

Two hundred forty genes showed >2-fold changes on DHT treatment in LNCaP-ARhi at the 4 h time point, whereas only 164 and 52 showed changes in LNCaP-ARmo and LNCaP-pcDNA3.1, respectively. Many androgen-regulated genes were upregulated in LNCaP-ARhi at a 10-fold lower concentration of DHT than in control cells.

DHT (1 nmol/L) increased expression of several cell cycle-associated genes in LNCaP-ARhi cells. ChIP-on-chip assay revealed the presence of chromatin binding sites for AR within ±200 kb of most of these genes. The growth of LNCaP-ARhi cells was also highly sensitive to the cyclin-dependent kinase inhibitor roscovitine at 1 nmol/L DHT.

In conclusion, our results show that overexpression of AR sensitizes castration-resistant prostate cancer cells to low levels of androgens. The activity of the AR signaling pathway is regulated by the levels of both the ligand and the receptor.