Using gene map science to evaluate the genetic map and eliminate disease

Genetic News

The insect excretory system contains two organ systems acting in concert: the Malpighian tubules and the hindgut perform essential roles in excretion and in ionic and osmotic homeostasis. For over 350 years, these two organs have fascinated biologists as a model of organ structure and function. As part of a recent surge in interest, research on the Malpighian tubules and hindgut of Drosophila has uncovered important paradigms of organ physiology and development. Further, many human disease processes can be modeled in these organs. Here, focusing on discoveries in the past 10 years, we provide an overview of the anatomy and physiology of the Drosophila excretory system. We describe the major developmental events that build these organs during embryogenesis, remodel them during metamorphosis, and repair them following injury. Finally, we highlight the use of the Malpighian tubules and hindgut as accessible models of human disease biology. The Malpighian tubule is an excellent model for studying rapid fluid transport and neuroendocrine control of renal function, and for modeling numerous human renal conditions such as kidney stones, while the hindgut provides an outstanding model for processes such as the role of cell chirality in development, nonstem cell–based injury repair, cancer-promoting processes, and communication between the intestine and nervous system.

Gastrulation is fundamental to the development of multicellular animals. Along with neurulation, gastrulation is one of the major processes of morphogenesis in which cells or whole tissues move from the surface of an embryo to its interior. Cell internalization mechanisms that have been discovered to date in Caenorhabditis elegans gastrulation bear some similarity to internalization mechanisms of other systems including Drosophila, Xenopus, and mouse, suggesting that ancient and conserved mechanisms internalize cells in diverse organisms. C. elegans gastrulation occurs at an early stage, beginning when the embryo is composed of just 26 cells, suggesting some promise for connecting the rich array of developmental mechanisms that establish polarity and pattern in embryos to the force-producing mechanisms that change cell shapes and move cells interiorly. Here, we review our current understanding of C. elegans gastrulation mechanisms. We address how cells determine which direction is the interior and polarize with respect to that direction, how cells change shape by apical constriction and internalize, and how the embryo specifies which cells will internalize and when. We summarize future prospects for using this system to discover some of the general principles by which animal cells change shape and internalize during development.

The emergence of large gene expression datasets has revealed the need for improved tools to identify enriched gene categories and visualize enrichment patterns. While Gene Ontology (GO) provides a valuable tool for gene set enrichment analysis, it has several limitations. First, it is difficult to graph multiple GO analyses for comparison. Second, genes from some model systems are not well represented. For example, ~30% of Caenorhabditis elegans genes are missing from the analysis in commonly used databases. To allow categorization and visualization of enriched C. elegans gene sets in different types of genome-scale data, we developed WormCat, a web-based tool that uses a near-complete annotation of the C. elegans genome to identify coexpressed gene sets and a scaled heat map for enrichment visualization. We tested the performance of WormCat using a variety of published transcriptomic datasets, and show that it reproduces major categories identified by GO. Importantly, we also found previously unidentified categories that are informative for interpreting phenotypes or predicting biological function. For example, we analyzed published RNA-seq data from C. elegans treated with combinations of lifespan-extending drugs, where one combination paradoxically shortened lifespan. Using WormCat, we identified sterol metabolism as a category that was not enriched in the single or double combinations, but emerged in a triple combination along with the lifespan shortening. Thus, WormCat identified a gene set with potential phenotypic relevance not found with previous GO analysis. In conclusion, WormCat provides a powerful tool for the analysis and visualization of gene set enrichment in different types of C. elegans datasets.
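
As a rough illustration of what "enriched gene category" means in practice, the sketch below computes a one-sided hypergeometric tail probability for the overlap between a selected gene list and an annotation category. The abstract above does not say which statistical test WormCat applies, so the choice of test is an assumption, and the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, category_genes, selected_genes, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    total_genes:     number of annotated genes in the background
    category_genes:  number of background genes in the category of interest
    selected_genes:  number of genes in the user's list (e.g. upregulated genes)
    overlap:         number of selected genes that fall in the category
    """
    max_overlap = min(category_genes, selected_genes)
    denominator = comb(total_genes, selected_genes)
    # Sum the probability of every overlap at least as large as the one observed.
    tail = sum(
        comb(category_genes, k) * comb(total_genes - category_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call.
    p = hypergeom_enrichment_p(
        total_genes=20000, category_genes=200, selected_genes=500, overlap=15
    )
    print(f"enrichment p-value: {p:.3g}")
```

In a real analysis the p-values for all categories would also need a multiple-testing correction; that step is left out here to keep the example short.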

Standard methods for case-control association studies of rare variation often treat disease outcome as a dichotomous phenotype. However, both theoretical and experimental studies have demonstrated that subjects with a family history of disease can be enriched for risk variation relative to subjects without such history. Assuming family history information is available, this observation motivates the idea of replacing the standard dichotomous outcome variable used in case-control studies with a more informative ordinal outcome variable that distinguishes controls (0), sporadic cases (1), and cases with a family history (2), with the expectation that we should observe an increasing number of risk variants with increasing category of the ordinal variable. To leverage this expectation, we propose a novel rare-variant association test that incorporates family history information based on our previous GAMuT framework for rare-variant association testing of multivariate phenotypes. We use simulated data to show that, when family history information is available, our new method outperforms standard rare-variant association methods, like burden and SKAT tests, that ignore family history. We further illustrate our method using a rare-variant study of cleft lip and palate.
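
The ordinal coding described above can be sketched in a few lines. This shows only the recoding of the outcome variable (controls 0, sporadic cases 1, cases with a family history 2), not the association test itself; the function name and the example subjects are hypothetical.

```python
def ordinal_outcome(is_case: bool, has_family_history: bool) -> int:
    """Map case status and family history to the three-level ordinal outcome."""
    if not is_case:
        return 0  # control, as in the abstract's coding
    return 2 if has_family_history else 1  # familial case vs. sporadic case


if __name__ == "__main__":
    # Hypothetical subjects: (is_case, has_family_history)
    subjects = [(False, False), (True, False), (True, True), (True, False)]
    print([ordinal_outcome(case, fam) for case, fam in subjects])
```

A standard case-control analysis would collapse categories 1 and 2 into a single "case" label, which is the information the ordinal variable is meant to retain.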

A multiple-trait Bayesian LASSO (MBL) for genome-based analysis and prediction of quantitative traits is presented and applied to two real data sets. The data-generating model is a multivariate linear Bayesian regression on a possibly huge number of molecular markers, with a Gaussian residual distribution. Each (one per marker) of the T × 1 vectors of regression coefficients (T: number of traits) is assigned the same T-variate Laplace prior distribution, with a null mean vector and unknown scale matrix Σ. The multivariate prior reduces to that of the standard univariate Bayesian LASSO when T = 1. The covariance matrix of the residual distribution is assigned a multivariate Jeffreys prior, and Σ is given an inverse-Wishart prior. The unknown quantities in the model are learned using a Markov chain Monte Carlo sampling scheme constructed using a scale-mixture of normal distributions representation. MBL is demonstrated in a bivariate context employing two publicly available data sets, using a bivariate genomic best linear unbiased prediction model (GBLUP) for benchmarking results. The first data set is one where wheat grain yields in two different environments are treated as distinct traits. The second data set comes from genotyped Pinus trees, with each individual measured for two traits: rust bin and gall volume. In MBL, the bivariate marker effects are shrunk differentially, i.e., "short" vectors are more strongly shrunk toward the origin than in GBLUP; conversely, "long" vectors are shrunk less. A predictive comparison was carried out as well in wheat, where the comparators of MBL were bivariate GBLUP and bivariate Bayes C, a variable selection procedure. A training-testing layout was used, with 100 random reconstructions of training and testing sets. For the wheat data, all methods produced similar predictions. In Pinus, MBL gave better predictions than either a Bayesian bivariate GBLUP or the single-trait Bayesian LASSO. MBL has been implemented in the Julia language package JWAS, and is now available for the scientific community to explore with different traits, species, and environments. It is well known that there is no universally best prediction machine, and MBL represents a new resource in the armamentarium for genome-enabled analysis and prediction of complex traits.

The Escherichia coli system of Cairns and Foster employs a lac frameshift mutation that reverts rarely (10⁻⁹/cell/division) during unrestricted growth. However, when 10⁸ cells are plated on lactose medium, the nongrowing lawn produces ~50 Lac+ revertant colonies that accumulate linearly with time over 5 days. Revertants carry very few associated mutations. This behavior has been attributed to an evolved mechanism ("adaptive mutation" or "stress-induced mutagenesis") that responds to starvation by preferentially creating mutations that improve growth. We describe an alternative model, "selective inbreeding," in which natural selection acts during intercellular transfer of the plasmid that carries the mutant lac allele and the dinB gene for an error-prone polymerase. Revertant genome sequences show that the plasmid is more intensely mutagenized than the chromosome. Revertants vary widely in their number of plasmid and chromosomal mutations. Plasmid mutations are distributed evenly, but chromosomal mutations are focused near the replication origin. Rare, heavily mutagenized revertants have acquired a plasmid tra mutation that eliminates conjugation ability. These findings support the new model, in which revertants are initiated by rare pre-existing cells (10⁵) with many copies of the F’lac plasmid. These cells divide under selection, producing daughters that mate. Recombination between donor and recipient plasmids initiates rolling-circle plasmid over-replication, causing a mutagenic elevation of DinB level. A lac+ reversion event starts chromosome replication and mutagenesis by accumulated DinB. After reversion, plasmid transfer moves the revertant lac+ allele into an unmutagenized cell, and away from associated mutations. Thus, natural selection explains why mutagenesis appears stress-induced and directed.

Meier-Gorlin syndrome (MGS) is a rare recessive disorder characterized by a number of distinct tissue-specific developmental defects. Genes encoding members of the origin recognition complex (ORC) and additional proteins essential for DNA replication (CDC6, CDT1, GMNN, CDC45, MCM5, and DONSON) are mutated in individuals diagnosed with MGS. The essential role of ORC is to license origins during the G1 phase of the cell cycle, but ORC has also been implicated in several nonreplicative functions. Because of its essential role in DNA replication, ORC is required for every cell division during development. Thus, it is unclear how the MGS mutations in genes encoding ORC lead to the tissue-specific defects associated with the disease. To begin to address these issues, we used Cas9-mediated genome engineering to generate a Drosophila melanogaster model of individuals carrying a specific MGS mutation in ORC4, along with control strains. Together these strains provide the first metazoan model for an MGS mutation in which the mutation was engineered at the endogenous locus along with precisely defined control strains. Flies homozygous for the engineered MGS allele reach adulthood, but with several tissue-specific defects. Genetic analysis revealed that this Orc4 allele was a hypomorph. Mutant females were sterile, and phenotypic analyses suggested that defects in DNA replication were an underlying cause. By leveraging the well-studied Drosophila system, we provide evidence that a disease-causing mutation in Orc4 disrupts DNA replication, and we propose that in individuals with MGS, defects arise preferentially in tissues with a high replication demand.

The challenges of breeding autotetraploid potato (Solanum tuberosum) have motivated the development of alternative breeding strategies. A common approach is to obtain uniparental dihaploids from a tetraploid of interest through pollination with S. tuberosum Andigenum Group (formerly S. phureja) cultivars. The mechanism underlying haploid formation of these crosses is unclear, and questions regarding the frequency of paternal DNA transmission remain. Previous reports have described aneuploid and euploid progeny that, in some cases, displayed genetic markers from the haploid inducer (HI). Here, we surveyed a population of 167 presumed dihaploids for large-scale structural variation that would underlie chromosomal addition from the HI, and for small-scale introgression of genetic markers. In 19 progeny, we detected 10 of the 12 possible trisomies and, in all cases, demonstrated the noninducer parent origin of the additional chromosome. Deep sequencing indicated that occasional, short-tract signals appearing to be of HI origin were better explained as technical artifacts. Leveraging recurring copy number variation patterns, we documented subchromosomal dosage variation indicating segregation of polymorphic maternal haplotypes. Collectively, 52% of the assayed chromosomal loci were classified as dosage variable. Our findings help elucidate the genomic consequences of potato haploid induction and suggest that most potato dihaploids will be free of residual pollinator DNA.

Endocrine-disrupting chemicals are ubiquitously present in our environment, but the mechanisms by which they adversely affect human reproductive health and strategies to circumvent their effects remain largely unknown. Here, we show in Caenorhabditis elegans that supplementation with the antioxidant Coenzyme Q10 (CoQ10) rescues the reprotoxicity induced by the widely used plasticizer and endocrine disruptor bisphenol A (BPA), in part by neutralizing DNA damage resulting from oxidative stress. CoQ10 significantly reduces BPA-induced elevated levels of germ cell apoptosis, phosphorylated checkpoint kinase 1 (CHK-1), double-strand breaks (DSBs), and chromosome defects in diakinesis oocytes. BPA-induced oxidative stress, mitochondrial dysfunction, and increased gene expression of antioxidant enzymes in the germline are counteracted by CoQ10. Finally, CoQ10 treatment also reduced the levels of aneuploid embryos and BPA-induced defects observed in early embryonic divisions. We propose that CoQ10 may counteract BPA-induced reprotoxicity through the scavenging of reactive oxygen species and free radicals, and that this natural antioxidant could constitute a low-risk and low-cost strategy to attenuate the impact on fertility by BPA.

Amino acid substitutions are commonly found in human transcription factors, yet the functional consequences of much of this variation remain unknown, even in well-characterized DNA-binding domains. Here, we examine how six single-amino acid variants in the DNA-binding domain of Ste12—a yeast transcription factor regulating mating and invasion—alter Ste12 genome binding, motif recognition, and gene expression to yield markedly different phenotypes. Using a combination of the "calling-card" method, RNA sequencing, and HT-SELEX (high throughput systematic evolution of ligands by exponential enrichment), we find that variants with dissimilar binding and expression profiles can converge onto similar cellular behaviors. Mating-defective variants led to decreased expression of distinct subsets of genes necessary for mating. Hyper-invasive variants also decreased expression of subsets of genes involved in mating, but increased the expression of other subsets of genes associated with the cellular response to osmotic stress. While single-amino acid changes in the coding region of this transcription factor result in complex regulatory reconfiguration, the major phenotypic consequences for the cell appear to depend on changes in the expression of a small number of genes with related functions.

The mitochondrial unfolded protein response (UPRmt) is an evolutionarily conserved adaptive response that functions to maintain mitochondrial homeostasis following mitochondrial damage. In Caenorhabditis elegans, the nervous system plays a central role in responding to mitochondrial stress by releasing endocrine signals that act upon distal tissues to activate the UPRmt. The mechanisms by which mitochondrial stress is sensed by neurons and transmitted to distal tissues are not fully understood. Here, we identify a role for the conserved follicle-stimulating hormone G protein-coupled receptor, FSHR-1, in promoting UPRmt activation. Genetic deficiency of fshr-1 severely attenuates UPRmt activation and organism-wide survival in response to mitochondrial stress. FSHR-1 functions in a common genetic pathway with SPHK-1/sphingosine kinase to promote UPRmt activation, and FSHR-1 regulates the mitochondrial association of SPHK-1 in the intestine. Through tissue-specific rescue assays, we show that FSHR-1 functions in neurons to activate the UPRmt, to promote mitochondrial association of SPHK-1 in the intestine, and to promote organism-wide survival in response to mitochondrial stress. We propose that FSHR-1 functions cell nonautonomously in neurons to activate UPRmt upstream of SPHK-1 signaling in the intestine.

ABC transporters couple ATP hydrolysis to the transport of substrates across cellular membranes. This protein superfamily has diverse activities resulting from differences in cargo and subcellular localization. Our work investigates the role of the ABCG family member WHT-2 in the biogenesis of gut granules, a Caenorhabditis elegans lysosome-related organelle. In addition to being required for the accumulation of birefringent material within gut granules, WHT-2 is necessary for the localization of gut granule proteins when trafficking pathways to this organelle are partially disrupted. The role of WHT-2 in gut granule protein targeting is likely linked to its function in Rab GTPase localization. We show that WHT-2 promotes the gut granule association of the Rab32 family member GLO-1 and the endolysosomal RAB-7, identifying a novel function for an ABC transporter. WHT-2 localizes to gut granules, where it could play a direct role in controlling Rab localization. Loss of CCZ-1 and GLO-3, which likely function as a guanine nucleotide exchange factor (GEF) for GLO-1, leads to similar disruption of GLO-1 localization. We show that CCZ-1, like GLO-3, is localized to gut granules. WHT-2 does not direct the gut granule association of the GLO-1 GEF, and our results point to WHT-2 functioning differently than GLO-3 and CCZ-1. Point mutations in WHT-2 that inhibit its transport activity, but not its subcellular localization, lead to the loss of GLO-1 from gut granules, while other WHT-2 activities are not completely disrupted, suggesting that WHT-2 functions in organelle biogenesis through transport-dependent and transport-independent activities.

Evolutionary relationships between prodomains in the TGF-β family have gone unanalyzed due to a perceived lack of conservation. We developed a novel approach, identified these relationships, and suggest hypotheses for new regulatory mechanisms in TGF-β signaling. First, a quantitative analysis placed each family member from flies, mice, and nematodes into the Activin, BMP, or TGF-β subfamily. Second, we defined the prodomain and ligand via the consensus cleavage site. Third, we generated alignments and trees from the prodomain, ligand, and full-length sequences independently for each subfamily. Prodomain alignments revealed that six of 17 structural features are well conserved: three in the straitjacket and three in the arm. Alignments also revealed unexpected cysteine conservation in the "LTBP-Association region" upstream of the straitjacket and in β8 of the bowtie in 14 proteins from all three subfamilies. In prodomain trees, eight clusters spanning all three subfamilies were present that were not seen in the ligand or full-length trees, suggesting prodomain-mediated cross-subfamily heterodimerization. Consistency between cysteine conservation and prodomain clustering provides support for heterodimerization predictions. Overall, our analysis suggests that cross-subfamily interactions are more common than currently appreciated, and our predictions generate numerous testable hypotheses about TGF-β function and evolution.

XY C57BL/6J (B6) mice harboring a Mus musculus domesticus-type Y chromosome (YPOS), known as B6.YPOS mice, commonly undergo gonadal sex reversal and develop as phenotypic females. In a minority of cases, B6.YPOS males are identified and a proportion of these are fertile. This phenotypic variability on a congenic B6 background has puzzled geneticists for decades. Recently, a B6.YPOS colony was shown to carry a non-B6-derived region of chromosome 11 that protected against B6.YPOS sex reversal. Here, we show that a B6.YPOS colony bred and archived at the MRC Harwell Institute lacks the chromosome 11 modifier but instead harbors an ~37 Mb region containing non-B6-derived segments on chromosome 13. This region, which we call Mod13, protects against B6.YPOS sex reversal in a proportion of heterozygous animals through its positive and negative effects on gene expression during primary sex determination. We discuss Mod13’s influence on the testis determination process and its possible origin in light of sequence similarities to that region in other mouse genomes. Our data reveal that the B6.YPOS sex reversal phenomenon is genetically complex and the explanation of observed phenotypic variability is likely dependent on the breeding history of any local colony.

Adaptation in spatially heterogeneous environments results from the balance between local selection, mutation, and migration. We study the interplay among these different evolutionary forces and demography in a classical two-habitat scenario with asexual reproduction. We develop a new theoretical approach that goes beyond the Adaptive Dynamics framework, and allows us to explore the effect of high mutation rates on the stationary phenotypic distribution. We show that this approach improves the classical Gaussian approximation, and captures accurately the shape of this equilibrium phenotypic distribution in one- and two-population scenarios. We examine the evolutionary equilibrium under general conditions where demography and selection may be nonsymmetric between the two habitats. In particular, we show how migration may increase differentiation in a source–sink scenario. We discuss the implications of these analytic results for the adaptation of organisms with large mutation rates, such as RNA viruses.

The past century has seen substantial theoretical and empirical progress on the genetic basis of adaptation. Over this same period, a pressing need to prevent the evolution of drug resistance has uncovered much about the potential genetic basis of persistence in declining populations. However, we have little theory to predict and generalize how persistence—by sufficiently rapid adaptation—might be realized in this explicitly demographic scenario. Here, we use Fisher’s geometric model with absolute fitness to begin a line of theoretical inquiry into the genetic basis of evolutionary rescue, focusing here on asexual populations that adapt through de novo mutations. We show how the dominant genetic path to rescue switches from a single mutation to multiple as mutation rates and the severity of the environmental change increase. In multi-step rescue, intermediate genotypes that themselves go extinct provide a "springboard" to rescue genotypes. Comparing to a scenario where persistence is assured, our approach allows us to quantify how a race between evolution and extinction leads to a genetic basis of adaptation that is composed of fewer loci of larger effect. We hope this work brings awareness to the impact of demography on the genetic basis of adaptation.

Codon usage bias (CUB), where certain codons are used more frequently than expected by chance, is a ubiquitous phenomenon and occurs across the tree of life. The dominant paradigm is that the proportion of preferred codons is set by weak selection. While experimental changes in codon usage have at times shown large phenotypic effects in contrast to this paradigm, genome-wide population genetic estimates have supported the weak selection model. Here we use deep genomic population sequencing of two Drosophila melanogaster populations to measure selection on synonymous sites in a way that allowed us to estimate the prevalence of both weak and strong purifying selection. We find that selection in favor of preferred codons ranges from weak (|Nₑs| ~ 1) to strong (|Nₑs| > 10), with strong selection acting on 10–20% of synonymous sites in preferred codons. While previous studies indicated that selection at synonymous sites could be strong, this is the first study to detect and quantify strong selection specifically at the level of CUB. Further, we find that CUB-associated polymorphism accounts for the majority of strong selection on synonymous sites, with secondary contributions of splicing (selection on alternatively spliced genes, splice junctions, and spliceosome-bound sites) and transcription factor binding. Our findings support a new model of CUB and indicate that the functional importance of CUB, as well as of synonymous sites in general, has been underestimated.
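
To make "used more frequently than expected by chance" concrete, the sketch below computes relative synonymous codon usage (RSCU), a widely used descriptive measure of codon bias: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. This is not the population-genetic method of the study above. The codon table is deliberately limited to two amino acids, and the example sequence is made up.

```python
from collections import Counter

# Synonymous codon families for two amino acids only (standard genetic code).
SYNONYMOUS = {
    "Leu": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
    "Lys": ["AAA", "AAG"],
}


def rscu(coding_sequence):
    """Return RSCU per codon for the families listed in SYNONYMOUS."""
    # Split the in-frame sequence into complete codons.
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence) - 2, 3)]
    counts = Counter(codons)
    result = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(family)  # equal usage of all synonymous codons
        for codon in family:
            result[codon] = counts[codon] / expected
    return result


if __name__ == "__main__":
    # Hypothetical in-frame sequence: Leu x4 (CTG x3, TTA x1), Lys x2 (AAG x2).
    values = rscu("CTGCTGCTGTTAAAGAAG")
    for codon in sorted(values):
        print(codon, round(values[codon], 2))
```

An RSCU above 1 marks a codon used more often than under equal usage, and a value below 1 marks one used less often.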

Plants integrate internal and external signals to finely coordinate growth and defense for maximal fitness within a complex environment. A common model suggests that growth and defense show a trade-off relationship driven by energy costs. However, recent studies suggest that the coordination of growth and defense likely involves more conditional and intricate connections than implied by the trade-off model. To explore how a transcription factor (TF) network may coordinate growth and defense, we used a high-throughput phenotyping approach to measure growth and flowering in a set of single and pairwise mutants previously linked to the aliphatic glucosinolate (GLS) defense pathway. Supporting a link between growth and defense, 17 of the 20 tested defense-associated TFs significantly influenced plant growth and/or flowering time. The TFs’ effects were conditional upon the environment and age of the plant and, more critically, varied across the growth and defense phenotypes for a given genotype. In support of the coordination model of growth and defense, the TF mutants’ effects on short-chain aliphatic GLS and growth did not display a simple correlation. We propose that large TF networks integrate internal and external signals and separately modulate growth and the accumulation of the defensive aliphatic GLS.



Genetic Markers

You know how an interstate map can guide you from one city to another. A genetic map is like that, and it guides researchers toward their target gene. Just as there are landmarks in interstate maps, there also are landmarks in genetic maps known as genetic markers...



A process by which genes undergo a structural change.