Using gene map science to evaluate the genetic map and eliminate disease

Genetic News

The persistence of hereditary traits over many generations testifies to the stability of the genetic material. Although the Watson–Crick structure for DNA provided a simple and elegant mechanism for replication, some elementary calculations implied that mistakes due to tautomeric shifts would introduce too many errors to permit this stability. It seemed evident that some additional mechanism(s) to correct such errors must be required. This essay traces the early development of our understanding of such mechanisms. Their key feature is the cutting out of a section of the strand of DNA in which the errors or damage resided, and its replacement by a localized synthesis using the undamaged strand as a template. To the surprise of some of the founders of molecular biology, this understanding derives in large part from studies in radiation biology, a field then considered by many to be irrelevant to studies of gene structure and function. Furthermore, genetic studies suggesting mechanisms of mismatch correction were ignored for almost a decade by biochemists unacquainted or uneasy with the power of such analysis. The collective body of results shows that the double-stranded structure of DNA is critical not only for replication but also as a scaffold for the correction of errors and the removal of damage to DNA. As additional discoveries were made, it became clear that the mechanisms for the repair of damage were involved not only in maintaining the stability of the genetic material but also in a variety of biological phenomena for increasing diversity, from genetic recombination to the immune response.

The tracheal system of insects is a network of epithelial tubules that functions as a respiratory organ to supply oxygen to various target organs. Target-derived signaling inputs regulate stereotyped modes of cell specification, branching morphogenesis, and collective cell migration in the embryonic stage. In the postembryonic stages, the same set of signaling pathways controls highly plastic regulation of size increase and pattern elaboration during larval stages, and cell proliferation and reprogramming during metamorphosis. Tracheal tube morphogenesis is also regulated by physicochemical interaction of the cell and apical extracellular matrix, which produces a tube geometry suited to air flow. The tracheal system senses both the external oxygen level and the metabolic activity of internal organs, and helps the organism adapt to changes in environmental oxygen level. Cellular and molecular mechanisms underlying the high plasticity of tracheal development and physiology uncovered through research on Drosophila are discussed.

Controlling the expression of genes using a binary system involving the yeast GAL4 transcription factor has been a mainstay of Drosophila developmental genetics for nearly 30 years. However, most existing GAL4 expression constructs function effectively only in somatic cells, and not in germ cells during oogenesis, for unknown reasons. A special upstream activation sequence (UAS) promoter, UASp, was created that does express during oogenesis, but the need to use different constructs for somatic and female germline cells has remained a significant technical limitation. Here, we show that the expression problem of UASt and many other Drosophila molecular tools in germline cells is caused by their core Hsp70 promoter sequences, which are targeted in female germ cells by Hsp70-directed Piwi-interacting RNAs (piRNAs) generated from endogenous Hsp70 gene sequences. In a genetic background lacking genomic Hsp70 genes and associated piRNAs, UASt-based constructs function effectively during oogenesis. By reducing Hsp70 sequences targeted by piRNAs, we created UASz, which functions better than UASp in the germline and like UASt in somatic cells.

High-throughput sequencing methods that multiplex a large number of individuals have provided a cost-effective approach for discovering genome-wide genetic variation in large populations. These sequencing methods are increasingly being utilized in population genetic studies across a diverse range of species. Two side-effects of these methods, however, are (1) sequencing errors and (2) heterozygous genotypes called as homozygous due to only one allele at a particular locus being sequenced, which occurs when the sequencing depth is insufficient. Both of these errors have a profound effect on the estimation of linkage disequilibrium (LD) and, if not taken into account, lead to inaccurate estimates. We developed a new likelihood method, GUS-LD, to estimate pairwise linkage disequilibrium using low coverage sequencing data that accounts for undercalled heterozygous genotypes and sequencing errors. Our findings show that accurate estimates were obtained using GUS-LD, whereas underestimation of LD results if no adjustment is made for the errors.
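The undercalling problem described above can be illustrated with a short calculation. This is a minimal sketch, not the GUS-LD likelihood itself: it assumes each read samples one of the two alleles of a heterozygote independently with probability 1/2 and ignores sequencing errors. The function name `prob_het_undercalled` is hypothetical and chosen only for this illustration.

```python
def prob_het_undercalled(depth):
    """Probability that a true heterozygote shows reads from only one allele.

    Assumption (not from the abstract): each read draws either allele with
    probability 1/2, independently, and there are no sequencing errors.
    All `depth` reads come from the same allele with probability
    2 * (1/2)**depth, which simplifies to (1/2)**(depth - 1).
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return 0.5 ** (depth - 1)


if __name__ == "__main__":
    # Show how quickly the undercalling risk falls as read depth increases.
    for depth in (1, 2, 5, 10):
        print(depth, prob_het_undercalled(depth))
```

Under these assumptions a heterozygote covered by a single read is always called homozygous, and the risk halves with every additional read, which is why low-coverage data need an explicit correction.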

Due to issues of practicality and confidentiality of genomic data sharing on a large scale, typically only meta- or mega-analyzed genome-wide association study (GWAS) summary data, not individual-level data, are publicly available. Reanalyses of such GWAS summary data for a wide range of applications have become increasingly common and useful, and they often require an external reference panel with individual-level genotypic data to infer linkage disequilibrium (LD) among genetic variants. However, with a small sample size of only hundreds, as in the most popular 1000 Genomes Project European sample, estimation errors for LD are not negligible, often leading to dramatically increased numbers of false positives in subsequent analyses of GWAS summary data. To alleviate the problem in the context of association testing for a group of SNPs, we propose an alternative estimator of the covariance matrix with an idea similar to multiple imputation. We use numerical examples based on both simulated and real data to demonstrate the severe problem with the use of the 1000 Genomes Project reference panels, and the improved performance of our new approach.
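The claim that LD estimates from a panel of only hundreds carry non-negligible error can be checked with a small simulation. This is a rough sketch under stated assumptions, not the estimator proposed in the abstract: two biallelic SNPs both have allele frequency 0.5, haplotypes are sampled independently, and the true correlation is 0.2. The function names, sample sizes, and seed are all hypothetical.

```python
import random
import statistics


def sample_ld_r(n_haplotypes, true_r, rng):
    """Estimate the LD correlation r between two SNPs from sampled haplotypes.

    Assumption: both SNPs have allele frequency 0.5, so D = true_r / 4.
    """
    d = true_r / 4.0
    # Haplotype order: 0 = AB, 1 = Ab, 2 = aB, 3 = ab
    weights = [0.25 + d, 0.25 - d, 0.25 - d, 0.25 + d]
    draws = rng.choices(range(4), weights=weights, k=n_haplotypes)
    freq_AB = draws.count(0) / n_haplotypes
    freq_A = (draws.count(0) + draws.count(1)) / n_haplotypes
    freq_B = (draws.count(0) + draws.count(2)) / n_haplotypes
    denom = (freq_A * (1 - freq_A) * freq_B * (1 - freq_B)) ** 0.5
    if denom == 0:
        return 0.0  # a SNP was monomorphic in the sample; r is undefined
    return (freq_AB - freq_A * freq_B) / denom


def spread_of_estimates(n_haplotypes, true_r=0.2, replicates=100, seed=1):
    """Standard deviation of the estimated r across repeated panels."""
    rng = random.Random(seed)
    estimates = [sample_ld_r(n_haplotypes, true_r, rng) for _ in range(replicates)]
    return statistics.stdev(estimates)
```

Comparing `spread_of_estimates(200)` with `spread_of_estimates(5000)` shows how much noisier the estimate is for a panel in the hundreds than for one in the thousands.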

The histone demethylase LSD1 was originally discovered as an enzyme that removes methyl groups from di- and monomethylated histone H3 lysine 4 (H3K4me2/1). Several studies suggest that LSD1 plays roles in meiosis as well as in the epigenetic regulation of fertility given that, in its absence, there is evidence of a progressive accumulation of H3K4me2 and increased sterility through generations. In addition to the progressive sterility phenotype observed in the mutants, growing evidence for the importance of histone methylation in the regulation of DNA damage repair has attracted more attention to the field in recent years. However, we are still far from understanding the mechanisms by which histone methylation is involved in DNA damage repair, and only a few studies have focused on the roles of histone demethylases in germline maintenance. Here, we show that the histone demethylase LSD1/CeSPR-5 interacts with the Fanconi anemia (FA) protein FANCM/CeFNCM-1 using biochemical, cytological, and genetic analyses. LSD1/CeSPR-5 is required for replication stress-induced S phase-checkpoint activation, and its absence suppresses the embryonic lethality and larval arrest observed in fncm-1 mutants. FANCM/CeFNCM-1 relocalizes upon hydroxyurea exposure and colocalizes with FANCD2/CeFCD-2 and LSD1/CeSPR-5, suggesting coordination between this histone demethylase and FA components to resolve replication stress. Surprisingly, the FA pathway is required for H3K4me2 maintenance, regardless of the presence of replication stress. Our study reveals a connection between FA and epigenetic maintenance and therefore provides new mechanistic insight into the regulation of histone methylation in DNA repair.

In many organisms, telomeric sequences can be located internally on the chromosome in addition to their usual positions at the ends of the chromosome. In humans, such interstitial telomeric sequences (ITSs) are nonrandomly associated with translocation breakpoints in tumor cells and with chromosome fragile sites (regions of the chromosome that break in response to perturbed DNA replication). We previously showed that ITSs in yeast generated several different types of instability, including terminal inversions (recombination between the ITS and the "true" chromosome telomere) and point mutations in DNA sequences adjacent to the ITS. In the current study, we examine the genetic control of these events. We show that the terminal inversions occur by the single-strand annealing pathway of DNA repair following the formation of a double-stranded DNA break within the ITS. The point mutations induced by the ITS require the error-prone DNA polymerase ζ. Unlike the terminal inversions, these events are not initiated by a double-stranded DNA break, but likely result from the error-prone repair of a single-stranded DNA gap or recruitment of DNA polymerase ζ in the absence of DNA damage.

Mismatch repair (MMR) proteins act in spellchecker roles to excise misincorporation errors that occur during DNA replication. Curiously, large-scale analyses of a variety of cancers showed that increased expression of MMR proteins often correlated with tumor aggressiveness, metastasis, and early recurrence. To better understand these observations, we used The Cancer Genome Atlas and Gene Expression across Normal and Tumor tissue databases to analyze MMR protein expression in cancers. We found that the MMR genes MSH2 and MSH6 are overexpressed more frequently than MSH3, and that MSH2 and MSH6 are often cooverexpressed as a result of copy number amplifications of these genes. These observations encouraged us to test the effects of upregulating MMR protein levels in baker’s yeast, where we can sensitively monitor genome instability phenotypes associated with cancer initiation and progression. Msh6 overexpression (two- to fourfold) almost completely disrupted mechanisms that prevent recombination between divergent DNA sequences by interacting with the DNA polymerase processivity clamp PCNA and by sequestering the Sgs1 helicase. Importantly, cooverexpression of Msh2 and Msh6 (~eightfold) conferred, in a PCNA interaction-dependent manner, several genome instability phenotypes including increased mutation rate, increased sensitivity to the DNA replication inhibitor HU and the DNA-damaging agents MMS and 4-nitroquinoline N-oxide, and elevated loss-of-heterozygosity. Msh2 and Msh6 cooverexpression also altered the cell cycle distribution of exponentially growing cells, resulting in an increased fraction of unbudded cells, consistent with a larger percentage of cells in G1. These novel observations suggested that overexpression of MSH factors affected the integrity of the DNA replication fork, causing genome instability phenotypes that could be important for promoting cancer progression.

The mevalonate pathway is the primary target of the cholesterol-lowering drugs statins, some of the most widely prescribed medicines of all time. The pathway’s enzymes not only catalyze the synthesis of cholesterol but also of diverse metabolites such as mitochondrial electron carriers and isoprenyls. Recently, it has been shown that one type of mitochondrial stress response, the UPRmt, can protect yeast, Caenorhabditis elegans, and cultured human cells from the deleterious effects of mevalonate pathway inhibition by statins. The mechanistic basis for this protection, however, remains unknown. Using C. elegans, we found that the UPRmt does not directly affect the levels of the statin target HMG-CoA reductase, the rate-controlling enzyme of the mevalonate pathway in mammals. Instead, in C. elegans the UPRmt upregulates the first dedicated enzyme of the pathway, HMG-CoA synthase (HMGS-1). A targeted RNA interference (RNAi) screen identified two UPRmt transcription factors, ATFS-1 and DVE-1, as regulators of HMGS-1. A comprehensive analysis of the pathway’s enzymes found that, in addition to HMGS-1, the UPRmt upregulates enzymes involved with the biosynthesis of electron carriers and geranylgeranylation intermediates. Geranylgeranylation, in turn, is requisite for the full execution of the UPRmt response. Thus, the UPRmt acts in at least three coordinated, compensatory arms to upregulate specific branches of the mevalonate pathway, thereby alleviating mitochondrial stress. We propose that statin-mediated inhibition of the mevalonate pathway blocks this compensatory system of the UPRmt and consequentially impedes mitochondrial homeostasis. This effect is likely one of the principal bases for the adverse side effects of statins.

Homologous recombination is required for proper segregation of homologous chromosomes during meiosis. It occurs predominantly at recombination hotspots that are defined by the DNA binding specificity of the PRDM9 protein. PRDM9 contains three conserved domains typically involved in regulation of transcription; yet, the role of PRDM9 in gene expression control is not clear. Here, we analyze the germline transcriptome of Prdm9–/– male mice in comparison to Prdm9+/+ males and find no apparent differences in the mRNA and miRNA profiles. We further explore the role of PRDM9 in meiosis by analyzing the effect of the KRAB, SSXRD, and post-SET zinc finger deletions in a cell culture expression system and the KRAB domain deletion in mice. We found that although the post-SET zinc finger and the KRAB domains are not essential for the methyltransferase activity of PRDM9 in cell culture, the KRAB domain mutant mice show only residual PRDM9 methyltransferase activity and undergo meiotic arrest. In aggregate, our data indicate that domains typically involved in regulation of gene expression do not serve that role in PRDM9, but are likely involved in setting the proper chromatin environment for initiation and completion of homologous recombination.

Maintenance of cell integrity and cell-to-cell communication are fundamental biological processes. Filamentous fungi, such as Neurospora crassa, depend on communication to locate compatible cells, coordinate cell fusion, and establish a robust hyphal network. Two MAP kinase (MAPK) pathways are essential for communication and cell fusion in N. crassa: the cell wall integrity/MAK-1 pathway and the MAK-2 (signal response) pathway. Previous studies have demonstrated several points of cross-talk between the MAK-1 and MAK-2 pathways, which is likely necessary for coordinating chemotropic growth toward an extracellular signal, and then mediating cell fusion. Canonical MAPK pathways begin with signal reception and end with a transcriptional response. Two transcription factors, ADV-1 and PP-1, are essential for communication and cell fusion. PP-1 is the conserved target of MAK-2, but it is unclear what targets ADV-1. We did RNA sequencing on adv-1, pp-1, and wild-type cells and found that ADV-1 and PP-1 have a shared regulon including many genes required for communication, cell fusion, growth, development, and stress response. We identified ADV-1 and PP-1 binding sites across the genome by adapting the in vitro method of DNA-affinity purification sequencing for N. crassa. To elucidate the regulatory network, we misexpressed each transcription factor in each upstream MAPK deletion mutant. Misexpression of adv-1 was sufficient to fully suppress the phenotype of the pp-1 mutant and partially suppress the phenotype of the mak-1 mutant. Collectively, our data demonstrate that the MAK-1/ADV-1 and MAK-2/PP-1 pathways form a tight regulatory network that maintains cell integrity and mediates communication and cell fusion.

Sterility in hybrid animals is widely known to be due to a cytological mechanism of aberrant homologous chromosome pairing during meiosis in hybrid germ cells. In this study, the gametes of four marine fish species belonging to the Sciaenid family were artificially fertilized, and germ cell development was examined at the cellular and molecular levels. One of the intergeneric hybrids had gonads that were testis-like in structure, small in size, and lacked germ cells. Specification of primordial germ cells (PGCs) and their migration toward genital ridges occurred normally in hybrid embryos, but these PGCs did not proliferate in the hybrid gonads. By germ cell transplantation assay, we showed that the gonadal microenvironment in hybrid recipients produced functional donor-derived gametes, suggesting that the germ cell-less phenotype was caused by cell autonomous proliferative defects of hybrid PGCs. This is the first evidence of mitotic arrest of germ cells causing hybrid sterility in animals.

The heterotrimeric G protein Gq regulates neuronal activity through distinct downstream effector pathways. In addition to the canonical Gq effector phospholipase Cβ, the small GTPase Rho was recently identified as a conserved effector of Gq. To identify additional molecules important for Gq signaling in neurons, we performed a forward genetic screen in the nematode Caenorhabditis elegans for suppressors of the hyperactivity and exaggerated waveform of an activated Gq mutant. We isolated two mutations affecting the MAP kinase scaffold protein KSR-1 and found that KSR-1 modulates locomotion downstream of, or in parallel to, the Gq-Rho pathway. Through epistasis experiments, we found that the core ERK MAPK cascade is required for Gq-Rho regulation of locomotion, but that the canonical ERK activator LET-60/Ras may not be required. Through neuron-specific rescue experiments, we found that the ERK pathway functions in head acetylcholine neurons to control Gq-dependent locomotion. Additionally, expression of activated LIN-45/Raf in head acetylcholine neurons is sufficient to cause an exaggerated waveform phenotype and hypersensitivity to the acetylcholinesterase inhibitor aldicarb, similar to an activated Gq mutant. Taken together, our results suggest that the ERK MAPK pathway modulates the output of Gq-Rho signaling to control locomotion behavior in C. elegans.

Adult stem cells reside in specialized microenvironments called niches, which provide signals for stem cells to maintain their undifferentiated and self-renewing state. To maintain stem cell quality, several types of stem cells are known to be regularly replaced by progenitor cells through niche competition. However, the cellular and molecular bases for stem cell competition for niche occupancy are largely unknown. Here, we show that two Drosophila members of the glypican family of heparan sulfate proteoglycans (HSPGs), Dally and Dally-like (Dlp), differentially regulate follicle stem cell (FSC) maintenance and competitiveness for niche occupancy. Lineage analyses of glypican mutant FSC clones showed that dally is essential for normal FSC maintenance. In contrast, dlp is a hypercompetitive mutation: dlp mutant FSC progenitors often eventually occupy the entire epithelial sheet. RNA interference knockdown experiments showed that Dally and Dlp play both partially redundant and distinct roles in regulating Jak/Stat, Wg, and Hh signaling in FSCs. The Drosophila FSC system offers a powerful genetic model to study the mechanisms by which HSPGs exert specific functions in stem cell replacement and competition.

Replication-independent variant histones replace canonical histones in nucleosomes and act as important regulators of chromatin function. H3.3 is a major variant of histone H3 that is remarkably conserved across taxa and is distinguished from canonical H3 by just four key amino acids. Most genomes contain two or more genes expressing H3.3, and complete loss of the protein usually causes sterility or embryonic lethality. Here, we investigate the developmental expression patterns of the five Caenorhabditis elegans H3.3 homologs and identify two previously uncharacterized homologs to be restricted to the germ line. Despite these specific expression patterns, we find that neither loss of individual H3.3 homologs nor the knockout of all five H3.3-coding genes causes sterility or lethality. However, we demonstrate an essential role for the conserved histone chaperone HIRA in the nucleosomal loading of all H3.3 variants. This requirement can be bypassed by mutation of the H3.3-specific residues to those found in H3. While even removal of all H3.3 homologs does not result in lethality, it leads to reduced fertility and viability in response to high-temperature stress. Thus, our results show that H3.3 is nonessential in C. elegans but is critical for ensuring adequate response to stress.

Multiple species within the basidiomycete genus Cryptococcus cause cryptococcal disease. These species are estimated to affect nearly a quarter of a million people annually, leading to ~180,000 deaths. Sexual reproduction, which can occur between haploid yeasts of the same or opposite mating type, is a potentially important contributor to pathogenesis as recombination can generate novel genotypes and transgressive phenotypes. However, our quantitative understanding of recombination in this clinically important yeast is limited. Here, we describe genome-wide estimates of recombination rates in Cryptococcus deneoformans and compare recombination between progeny from α–α unisexual and a–α bisexual crosses. We find that offspring from bisexual crosses have modestly higher average rates of recombination than those derived from unisexual crosses. Recombination hot and cold spots across the C. deneoformans genome are also identified and are associated with increased GC content. Finally, we observed regions genome-wide with allele frequencies deviating from the expected parental ratio. These findings and observations advance our quantitative understanding of the genetic events that occur during sexual reproduction in C. deneoformans, and the impact that different forms of sexual reproduction are likely to have on genetic diversity in this important fungal pathogen.

It has been challenging to determine the disease-causing variant(s) for most major histocompatibility complex (MHC)-associated diseases. However, it is becoming increasingly clear that regulatory variation is pervasive and a fundamentally important mechanism governing phenotypic diversity and disease susceptibility. We gathered DNase I data from 136 human cells to characterize the regulatory landscape of the MHC region, including 4867 DNase I hypersensitive sites (DHSs). We identified thousands of regulatory elements that have been gained or lost in the human or chimpanzee genomes since their evolutionary divergence. We compared alignments of the DHSs across six primates and found 149 DHSs with convincing evidence of positive and/or purifying selection. Of these DHSs, compared to neutral sequences, 24 evolved rapidly in the human lineage. Using FIMO (Find Individual Motif Occurrences) and UCSC (University of California, Santa Cruz) ChIP-seq data, we identified 15 instances of transcription-factor-binding motif gains, including USF, MYC, MAX, MAFK, STAT1, and PBX3, and observed 16 GWAS (genome-wide association study) SNPs associated with diseases within these 24 DHSs. Combining eQTL and Hi-C data, our results indicated that five SNPs located in human-gained motifs affect the expression of the corresponding genes, two of which closely matched DHS target genes. In addition, a significant SNP, rs7756521, at the genome-wide significance level likely affects DDR expression and represents a causal genetic variant for HIV-1 control. These results indicated that species-specific motif gains or losses of rapidly evolving DHSs in the primate genomes might play a role during adaptive evolution and provided some new evidence for a potentially causal role for these GWAS SNPs.

In nature, multiple adaptive phenotypes often coevolve and can be controlled by tightly linked genetic loci known as supergenes. Dissecting the genetic basis of these linked phenotypes is a major challenge in evolutionary genetics. Multiple freshwater populations of threespine stickleback fish (Gasterosteus aculeatus) have convergently evolved two constructive craniofacial traits, longer branchial bones and increased pharyngeal tooth number, likely as adaptations to dietary differences between marine and freshwater environments. Prior QTL mapping showed that both traits are partially controlled by overlapping genomic regions on chromosome 21 and that a regulatory change in Bmp6 likely underlies the tooth number QTL. Here, we mapped the branchial bone length QTL to a 155 kb, eight-gene interval tightly linked to, but excluding the coding regions of Bmp6 and containing the candidate gene Tfap2a. Further recombinant mapping revealed this bone length QTL is separable into at least two loci. During embryonic and larval development, Tfap2a was expressed in the branchial bone primordia, where allele specific expression assays revealed the freshwater allele of Tfap2a was expressed at lower levels relative to the marine allele in hybrid fish. Induced loss-of-function mutations in Tfap2a revealed an essential role in stickleback craniofacial development and show that bone length is sensitive to Tfap2a dosage in heterozygotes. Combined, these results suggest that closely linked but genetically separable changes in Bmp6 and Tfap2a contribute to a supergene underlying evolved skeletal gain in multiple freshwater stickleback populations.

Small molecule lipid-related metabolites are important components of fatty acid and steroid metabolism, two important contributors to human health. This study investigated the extent to which rare and common genetic variants spanning the human genome influence the lipid-related metabolome. Sequence data from 1552 European-Americans (EA) and 1872 African-Americans (AA) were analyzed to examine the impact of common and rare variants on the levels of 102 circulating lipid-related metabolites measured by a combination of chromatography and mass spectroscopy. We conducted single variant tests [minor allele frequency (MAF) > 5%, statistical significance P-value ≤ 2.45 × 10^-10] and tests aggregating rare variants (MAF ≤ 5%) across multiple genomic motifs, such as coding regions and regulatory domains, and sliding windows. Multiethnic meta-analyses detected 53 lipid-related metabolite-locus pairs, which were inspected for evidence of consistent signal between the two ethnic groups. Thirty-eight lipid-related metabolite-genomic region associations were consistent across ethnicities, among which seven were novel. The regions contain genes that are related to metabolite transport (SLC10A1) and metabolism (SCD, FDX1, UGT2B15, and FADS2). Six of the seven novel findings lie in expression quantitative trait loci affecting the expression levels of 14 surrounding genes in multiple tissues. Imputed expression levels of 10 of the affected genes were associated with four corresponding lipid-related traits in at least one tissue. Our findings offer valuable insight into circulating lipid-related metabolite regulation in a multiethnic population.

Genome comparisons provide information on the nature of genetic change, but such comparisons are challenged to differentiate the importance of the actual sequence change processes relative to the role of selection. This problem can be overcome by identifying changes that have not yet had the time to undergo millions of years of natural selection. We describe a strategy to discover accession-specific changes in the rice genome using an abundant resource routinely provided for many genome analyses, resequencing data. The sequence of the fully sequenced rice genome from variety Nipponbare was compared to the pooled (~114x) resequencing data from 126 japonica rice accessions to discover "Nipponbare-specific" sequences. Analyzing nonrepetitive sequences, 8504 "candidate" Nipponbare-specific changes were detected, of which around two-thirds are true novel sequence changes and the rest are predicted genome sequencing errors. Base substitutions outnumbered indels in this data set by > 28:1, with ~8:5 bias toward transversions over transitions, and no transposable element insertions or excisions were observed. These results indicate that the strategy employed is effective for finding recent sequence changes, sequencing errors, and rare alleles in any organism that has both a reference genome sequence and a wealth of resequencing data.



Genetic Ethics

The advances in genetic mapping have made very real what seemed so improbable twenty years ago. ... Genetic mapping is a powerful tool ... but it is also vulnerable to abuse. Many ethical, legal and societal issues are beginning to emerge...



Genome: All of the genes carried by a single gamete; the DNA content of an individual, which in humans includes all 44 autosomes, 2 sex chromosomes, and the mitochondrial DNA.