Using gene map science to evaluate the genetic map and eliminate disease

Genetic News

The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30x coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.

The goal of the Collaborative Cross (CC) project was to generate and distribute over 1000 independent mouse recombinant inbred strains derived from eight inbred founders. With inbreeding nearly complete, we estimated the extinction rate among CC lines at a remarkable 95%, which is substantially higher than in the derivation of other mouse recombinant inbred populations. Here, we report genome-wide allele frequencies in 347 extinct CC lines. Contrary to expectations, autosomes had equal allelic contributions from the eight founders, but chromosome X had significantly lower allelic contributions from the two inbred founders with underrepresented subspecific origins (PWK/PhJ and CAST/EiJ). By comparing extinct CC lines to living CC strains, we conclude that a complex genetic architecture is driving extinction, and selection pressures are different on the autosomes and chromosome X. Male infertility played a large role in extinction as 47% of extinct lines had males that were infertile. Males from extinct lines had high variability in reproductive organ size, low sperm counts, low sperm motility, and a high rate of vacuolization of seminiferous tubules. We performed QTL mapping and identified nine genomic regions associated with male fertility and reproductive phenotypes. Many of the allelic effects in the QTL were driven by the two founders with underrepresented subspecific origins, including a QTL on chromosome X for infertility that was driven by the PWK/PhJ haplotype. We also performed the first example of cross validation using complementary CC resources to verify the effect of sperm curvilinear velocity from the PWK/PhJ haplotype on chromosome 2 in an independent population across multiple generations. While selection typically constrains the examination of reproductive traits toward the more fertile alleles, the CC extinct lines provided a unique opportunity to study the genetic architecture of fertility in a widely genetically variable population. 
We hypothesize that incompatibilities between alleles with different subspecific origins are a key driver of infertility. These results help clarify the factors that drove strain extinction in the CC, reveal the genetic regions associated with poor fertility in the CC, and serve as a resource to further study mammalian infertility.

Adaptation of domesticated species to diverse agroclimatic regions has led to abundant trait diversity. However, the resulting population structure and genetic heterogeneity confounds association mapping of adaptive traits. To address this challenge in sorghum [Sorghum bicolor (L.) Moench]—a widely adapted cereal crop—we developed a nested association mapping (NAM) population using 10 diverse global lines crossed with an elite reference line RTx430. We characterized the population of 2214 recombinant inbred lines at 90,000 SNPs using genotyping-by-sequencing. The population captures ~70% of known global SNP variation in sorghum, and 57,411 recombination events. Notably, recombination events were four- to fivefold enriched in coding sequences and 5' untranslated regions of genes. To test the power of the NAM population for trait dissection, we conducted joint linkage mapping for two major adaptive traits, flowering time and plant height. We precisely mapped several known genes for these two traits, and identified several additional QTL. Considering all SNPs simultaneously, genetic variation accounted for 65% of flowering time variance and 75% of plant height variance. Further, we directly compared NAM to genome-wide association mapping (using panels of the same size) and found that flowering time and plant height QTL were more consistently identified with the NAM population. Finally, for simulated QTL under strong selection in diversity panels, the power of QTL detection was up to three times greater for NAM vs. association mapping with a diverse panel. These findings validate the NAM resource for trait mapping in sorghum, and demonstrate the value of NAM for dissection of adaptive traits.
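The population figures quoted above can be turned into per-family and per-line averages with simple arithmetic. The following is only a back-of-the-envelope sketch in Python: it assumes that each of the 10 diverse lines founds one family and that averaging over all lines is meaningful, and the variable names are illustrative rather than taken from the study.

```python
# Figures quoted in the abstract above.
n_rils = 2214            # recombinant inbred lines genotyped
n_families = 10          # assumption: one family per diverse founder line crossed to RTx430
n_recombinations = 57411 # recombination events captured by the population

# Mean family size, assuming nothing about how evenly the lines are spread.
rils_per_family = n_rils / n_families

# Mean number of recombination events captured per line.
recombinations_per_ril = n_recombinations / n_rils

print(f"Mean lines per family: {rils_per_family:.1f}")
print(f"Mean recombination events per line: {recombinations_per_ril:.2f}")
```

These are averages only; the abstract does not report how family sizes or recombination counts vary across the population.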

The nutritional environments that organisms experience are inherently variable, requiring tight coordination of how resources are allocated to different functions relative to the total amount of resources available. A growing body of evidence supports the hypothesis that key endocrine pathways play a fundamental role in this coordination. In particular, the insulin/insulin-like growth factor signaling (IIS) and target of rapamycin (TOR) pathways have been implicated in nutrition-dependent changes in metabolism and nutrient allocation. However, little is known about the genetic basis of standing variation in IIS/TOR or how diet-dependent changes in expression in this pathway influence phenotypes related to resource allocation. To characterize natural genetic variation in the IIS/TOR pathway, we used >250 recombinant inbred lines (RILs) derived from a multiparental mapping population, the Drosophila Synthetic Population Resource, to map transcript-level QTL of genes encoding 52 core IIS/TOR components in three different nutritional environments [dietary restriction (DR), control (C), and high sugar (HS)]. Nearly all genes, 87%, were significantly differentially expressed between diets, though not always in ways predicted by loss-of-function mutants. We identified cis (i.e., local) expression QTL (eQTL) for six genes, all of which are significant in multiple nutrient environments. Further, we identified trans (i.e., distant) eQTL for two genes, specific to a single nutrient environment. Our results are consistent with many small changes in the IIS/TOR pathways. A discriminant function analysis for the C and DR treatments identified a pattern of gene expression associated with the diet treatment. Mapping the composite discriminant function scores revealed a significant global eQTL within the DR diet. 
A correlation between the discriminant function scores and the median life span (r = 0.46) provides evidence that gene expression changes in response to diet are associated with longevity in these RILs.

Meiotic recombination is an essential feature of sexual reproduction that ensures faithful segregation of chromosomes and redistributes genetic variants in populations. Multiparent populations such as the Diversity Outbred (DO) mouse stock accumulate large numbers of crossover (CO) events between founder haplotypes, and thus present a unique opportunity to study the role of genetic variation in shaping the recombination landscape. We obtained high-density genotype data from 6886 DO mice, and localized 2.2 million CO events to intervals with a median size of 28 kb. The resulting sex-averaged genetic map of the DO population is highly concordant with large-scale (order 10 Mb) features of previously reported genetic maps for mouse. To examine fine-scale (order 10 kb) patterns of recombination in the DO, we overlaid putative recombination hotspots onto our CO intervals. We found that CO intervals are enriched in hotspots compared to the genomic background. However, as many as 26% of CO intervals do not overlap any putative hotspots, suggesting that our understanding of hotspots is incomplete. We also identified coldspots encompassing 329 Mb, or 12% of observable genome, in which there is little or no recombination. In contrast to hotspots, which are a few kilobases in size, and widely scattered throughout the genome, coldspots have a median size of 2.1 Mb and are spatially clustered. Coldspots are strongly associated with copy-number variant (CNV) regions, especially multi-allelic clusters, identified from whole-genome sequencing of 228 DO mice. Genes in these regions have reduced expression, and epigenetic features of closed chromatin in male germ cells, which suggests that CNVs may repress recombination by altering chromatin structure in meiosis. Our findings demonstrate how multiparent populations, by bridging the gap between large-scale and fine-scale genetic mapping, can reveal new features of the recombination landscape.
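A few derived quantities follow directly from the numbers reported above. The Python sketch below is illustrative arithmetic only: it treats "2.2 million" and the percentages as exact, and the variable names are hypothetical. Because the abstract says "as many as 26%", the last figure is an upper bound.

```python
# Figures quoted in the abstract above (rounded values taken at face value).
n_mice = 6886
n_crossovers = 2.2e6          # CO events localized across all mice
coldspot_mb = 329             # total coldspot span, in Mb
coldspot_fraction = 0.12      # coldspot share of the observable genome
no_hotspot_fraction = 0.26    # upper bound on CO intervals without a putative hotspot

# Mean number of accumulated CO events detected per DO mouse.
crossovers_per_mouse = n_crossovers / n_mice

# Size of the observable genome implied by "329 Mb, or 12%".
observable_genome_mb = coldspot_mb / coldspot_fraction

# Upper bound on the number of CO intervals that overlap no putative hotspot.
max_crossovers_without_hotspot = n_crossovers * no_hotspot_fraction

print(f"CO events per mouse: {crossovers_per_mouse:.0f}")
print(f"Implied observable genome: {observable_genome_mb:.0f} Mb")
print(f"CO intervals without a hotspot (at most): {max_crossovers_without_hotspot:.0f}")
```

The per-mouse figure counts crossovers accumulated between founder haplotypes over the history of the stock, not events from a single meiosis.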

Genetic studies of multidimensional phenotypes can potentially link genetic variation, gene expression, and physiological data to create multi-scale models of complex traits. The challenge of reducing these data to specific hypotheses has become increasingly acute with the advent of genome-scale data resources. Multi-parent populations derived from model organisms provide a resource for developing methods to understand this complexity. In this study, we simultaneously modeled body composition, serum biomarkers, and liver transcript abundances from 474 Diversity Outbred mice. This population contained both sexes and two dietary cohorts. Transcript data were reduced to functional gene modules with weighted gene coexpression network analysis (WGCNA), which were used as summary phenotypes representing enriched biological processes. These module phenotypes were jointly analyzed with body composition and serum biomarkers in a combined analysis of pleiotropy and epistasis (CAPE), which inferred networks of epistatic interactions between quantitative trait loci that affect one or more traits. This network frequently mapped interactions between alleles of different ancestries, providing evidence of both genetic synergy and redundancy between haplotypes. Furthermore, a number of loci interacted with sex and diet to yield sex-specific genetic effects and alleles that potentially protect individuals from the effects of a high-fat diet. Although the epistatic interactions explained small amounts of trait variance, the combination of directional interactions, allelic specificity, and high genomic resolution provided context to generate hypotheses for the roles of specific genes in complex traits. Our approach moves beyond the cataloging of single loci to infer genetic networks that map genetic etiology by simultaneously modeling all phenotypes.

Max Delbrück was trained as a physicist but made his major contribution in biology and ultimately shared a Nobel Prize in Physiology or Medicine. He was the acknowledged leader of the founders of molecular biology, yet he failed to achieve his key scientific goals. His ultimate scientific aim was to find evidence for physical laws unique to biology: so-called "complementarity." He never did. The specific problem he initially wanted to solve was the nature of biological replication but the discovery of the mechanism of replication was made by others, in large part because of his disdain for the details of biochemistry. His later career was spent investigating the effect of light on the fungus Phycomyces, a topic that turned out to be of limited general interest. He was known both for his informality and for his legendary displays of devastating criticism. His life and the lives of some of his closest colleagues were acted out against a background of a world in conflict. This essay describes the man and his career and searches for an explanation of his profound influence.

Systematic genetic studies of a handful of diverse organisms over the past 50 years have transformed our understanding of biology. However, many aspects of primate biology, behavior, and disease are absent or poorly modeled in any of the current genetic model organisms including mice. We surveyed the animal kingdom to find other animals with advantages similar to mice that might better exemplify primate biology, and identified mouse lemurs (Microcebus spp.) as the outstanding candidate. Mouse lemurs are prosimian primates, roughly half the genetic distance between mice and humans. They are the smallest, fastest developing, and among the most prolific and abundant primates in the world, distributed throughout the island of Madagascar, many in separate breeding populations due to habitat destruction. Their physiology, behavior, and phylogeny have been studied for decades in laboratory colonies in Europe and in field studies in Malagasy rainforests, and a high quality reference genome sequence has recently been completed. To initiate a classical genetic approach, we developed a deep phenotyping protocol and have screened hundreds of laboratory and wild mouse lemurs for interesting phenotypes and begun mapping the underlying mutations, in collaboration with leading mouse lemur biologists. We also seek to establish a mouse lemur gene "knockout" library by sequencing the genomes of thousands of mouse lemurs to identify null alleles in most genes from the large pool of natural genetic variants. As part of this effort, we have begun a citizen science project in which students across Madagascar explore the remarkable biology around their schools, including longitudinal studies of the local mouse lemurs. We hope this work spawns a new model organism and cultivates a deep genetic understanding of primate biology and health. 
We also hope it establishes a new and ethical method of genetics that bridges biological, behavioral, medical, and conservation disciplines, while providing an example of how hands-on science education can help transform developing countries.

The purpose of this chapter in FlyBook is to acquaint the reader with the Drosophila genome and the ways in which it can be altered by mutation. Much of what follows will be familiar to the experienced Fly Pusher but hopefully will be useful to those who are just entering the field and are thus unfamiliar with the genome, the history of how it has been and can be altered, and the consequences of those alterations. I will begin with the structure, content, and organization of the genome, followed by the kinds of structural alterations (karyotypic aberrations), how they affect the behavior of chromosomes in meiotic cell division, and how that behavior can be used. Finally, screens for mutations as they have been performed will be discussed. There are several excellent sources of detailed information on Drosophila husbandry and screening that are recommended for those interested in further expanding their familiarity with Drosophila as a research tool and model organism. These are a book by Ralph Greenspan and a review article by John Roote and Andreas Prokop, which should be required reading for any new student entering a fly lab for the first time.

The hermaphroditic nematode Caenorhabditis elegans has been one of the primary model systems in biology since the 1970s, but only within the last two decades has this nematode also become a useful model for experimental evolution. Here, we outline the goals and major foci of experimental evolution with C. elegans and related species, such as C. briggsae and C. remanei, by discussing the principles of experimental design, and highlighting the strengths and limitations of Caenorhabditis as model systems. We then review three exemplars of Caenorhabditis experimental evolution studies, underlining representative evolution experiments that have addressed the: (1) maintenance of genetic variation; (2) role of natural selection during transitions from outcrossing to selfing, as well as the maintenance of mixed breeding modes during evolution; and (3) evolution of phenotypic plasticity and its role in adaptation to variable environments, including host–pathogen coevolution. We conclude by suggesting some future directions for which experimental evolution with Caenorhabditis would be particularly informative.

Considerable progress in our understanding of yeast genomes and their evolution has been made over the last decade with the sequencing, analysis, and comparisons of numerous species, strains, or isolates of diverse origins. The role played by yeasts in natural environments as well as in artificial manufactures, combined with the importance of some species as model experimental systems sustained this effort. At the same time, their enormous evolutionary diversity (there are yeast species in every subphylum of Dikarya) sparked curiosity but necessitated further efforts to obtain appropriate reference genomes. Today, yeast genomes have been very informative about basic mechanisms of evolution, speciation, hybridization, domestication, as well as about the molecular machineries underlying them. They are also irreplaceable to investigate in detail the complex relationship between genotypes and phenotypes with both theoretical and practical implications. This review examines these questions at two distinct levels offered by the broad evolutionary range of yeasts: inside the best-studied Saccharomyces species complex, and across the entire and diversified subphylum of Saccharomycotina. While obviously revealing evolutionary histories at different scales, data converge to a remarkably coherent picture in which one can estimate the relative importance of intrinsic genome dynamics, including gene birth and loss, vs. horizontal genetic accidents in the making of populations. The facility with which novel yeast genomes can now be studied, combined with the already numerous available reference genomes, offer privileged perspectives to further examine these fundamental biological questions using yeasts both as eukaryotic models and as fungi of practical importance.

mRNA expression dynamics promote and maintain the identity of somatic tissues in living organisms; however, their impact in post-transcriptional gene regulation in these processes is not fully understood. Here, we applied the PAT-Seq approach to systematically isolate, sequence, and map tissue-specific mRNA from five highly studied Caenorhabditis elegans somatic tissues: GABAergic and NMDA neurons, arcade and intestinal valve cells, seam cells, and hypodermal tissues, and studied their mRNA expression dynamics. The integration of these datasets with previously profiled transcriptomes of intestine, pharynx, and body muscle tissues precisely assigns tissue-specific expression dynamics for 60% of all annotated C. elegans protein-coding genes, providing an important resource for the scientific community. The mapping of 15,956 unique high-quality tissue-specific polyA sites in all eight somatic tissues reveals extensive tissue-specific 3' untranslated region (3'UTR) isoform switching through alternative polyadenylation (APA). Almost all ubiquitously transcribed genes use APA and harbor miRNA targets in their 3'UTRs, which are commonly lost in a tissue-specific manner, suggesting widespread usage of post-transcriptional gene regulation modulated through APA to fine tune tissue-specific protein expression. Within this pool, rack-1 and tct-1, the C. elegans orthologs of human disease genes, use APA to switch to shorter 3'UTR isoforms in order to evade miRNA regulation in the body muscle tissue, resulting in increased protein expression needed for proper body muscle function. Our results highlight a major positive regulatory role for APA, allowing genes to counteract miRNA regulation on a tissue-specific basis.

Efforts to map neural circuits have been galvanized by the development of genetic technologies that permit the manipulation of targeted sets of neurons in the brains of freely behaving animals. The success of these efforts relies on the experimenter’s ability to target arbitrarily small subsets of neurons for manipulation, but such specificity of targeting cannot routinely be achieved using existing methods. In Drosophila melanogaster, a widely-used technique for refined cell type-specific manipulation is the Split GAL4 system, which augments the targeting specificity of the binary GAL4-UAS (Upstream Activating Sequence) system by making GAL4 transcriptional activity contingent upon two enhancers, rather than one. To permit more refined targeting, we introduce here the "Killer Zipper" (KZip+), a suppressor that makes Split GAL4 targeting contingent upon a third enhancer. KZip+ acts by disrupting both the formation and activity of Split GAL4 heterodimers, and we show how this added layer of control can be used to selectively remove unwanted cells from a Split GAL4 expression pattern or to subtract neurons of interest from a pattern to determine their requirement in generating a given phenotype. To facilitate application of the KZip+ technology, we have developed a versatile set of LexAop-KZip+ fly lines that can be used directly with the large number of LexA driver lines with known expression patterns. KZip+ significantly sharpens the precision of neuronal genetic control available in Drosophila and may be extended to other organisms where Split GAL4-like systems are used.

In the yeast Saccharomyces cerevisiae, the genes encoding the metallothionein protein Cup1 are located in a tandem array on chromosome VIII. Using a diploid strain that is heterozygous for an insertion of a selectable marker (URA3) within this tandem array, and heterozygous for markers flanking the array, we measured interhomolog recombination and intra/sister chromatid exchange in the CUP1 locus. The rate of intra/sister chromatid recombination exceeded the rate of interhomolog recombination by >10-fold. Loss of the Rad51 and Rad52 proteins, required for most interhomolog recombination, led to a relatively small reduction of recombination in the CUP1 array. Although interhomolog mitotic recombination in the CUP1 locus is elevated relative to the average genomic region, we found that interhomolog meiotic recombination in the array is reduced compared to most regions. Lastly, we showed that high levels of copper (previously shown to elevate CUP1 transcription) lead to a substantial elevation in rate of both interhomolog and intra/sister chromatid recombination in the CUP1 array; recombination events that delete the URA3 insertion from the CUP1 array occur at a rate of >10⁻³/division in unselected cells. This rate is almost three orders of magnitude higher than observed for mitotic recombination events involving single-copy genes. In summary, our study shows that some of the basic properties of recombination differ considerably between single-copy and tandemly-repeated genes.

Meiotic homologous recombination, a critical event for ensuring faithful chromosome segregation and creating genetic diversity, is initiated by programmed DNA double-strand breaks (DSBs) formed at recombination hotspots. Meiotic DSB formation is likely to be influenced by other DNA-templated processes including transcription, but how DSB formation and transcription interact with each other has not been understood well. In this study, we used fission yeast to investigate a possible interplay of these two events. A group of hotspots in fission yeast are associated with sequences similar to the cyclic AMP response element and activated by the ATF/CREB family transcription factor dimer Atf1-Pcr1. We first focused on one of those hotspots, ade6-3049, and Atf1. Our results showed that multiple transcripts, shorter than the ade6 full-length messenger RNA, emanate from a region surrounding the ade6-3049 hotspot. Interestingly, we found that the previously known recombination-activation region of Atf1 is also a transactivation domain, whose deletion affected DSB formation and short transcript production at ade6-3049. These results point to a possibility that the two events may be related to each other at ade6-3049. In fact, comparison of published maps of meiotic transcripts and hotspots suggested that hotspots are very often located close to meiotically transcribed regions. These observations therefore propose that meiotic DSB formation in fission yeast may be connected to transcription of surrounding regions.

During cell division, aberrant DNA structures are detected by regulators called checkpoints that slow division to allow error correction. In addition to checkpoint-induced delay, it is widely assumed, though rarely shown, that merely slowing the cell cycle might allow more time for error detection and correction, thus resulting in a more stable genome. Fidelity by a slowed cell cycle might be independent of checkpoints. Here we tested the hypothesis that a slowed cell cycle stabilizes the genome, independent of checkpoints, in the budding yeast Saccharomyces cerevisiae. We were led to this hypothesis when we identified a gene (ERV14, an ER cargo membrane protein) that when mutated, unexpectedly stabilized the genome, as measured by three different chromosome assays. After extensive studies of pathways rendered dysfunctional in erv14 mutant cells, we are led to the inference that no particular pathway is involved in stabilization, but rather the slowed cell cycle induced by erv14 stabilized the genome. We then demonstrated that, in genetic mutations and chemical treatments unrelated to ERV14, a slowed cell cycle indeed correlates with a more stable genome, even in checkpoint-proficient cells. Data suggest a delay in G2/M may commonly stabilize the genome. We conclude that chromosome errors are more rarely made or are more readily corrected when the cell cycle is slowed (even ~15 min longer in an ~100-min cell cycle). And, some chromosome errors may not signal checkpoint-mediated responses, or do not sufficiently signal to allow correction, and their correction benefits from this "time checkpoint."

Lagging strand synthesis is mechanistically far more complicated than leading strand synthesis because it involves multistep processes and requires considerably more enzymes and protein factors. Due to this complexity, multiple fail-safe factors are required to ensure successful replication of the lagging strand DNA. We attempted to identify novel factors that are required in the absence of the helicase activity of Dna2, an essential enzyme in Okazaki-fragment maturation. In this article, we identified Rim11, a GSK-3β-kinase homolog, as a multicopy suppressor of the dna2 helicase-dead mutant (dna2-K1080E). Subsequent epistasis analysis revealed that Ume6 (a DNA binding protein, a downstream substrate of Rim11) also acted as a multicopy suppressor of the dna2 allele. We found that the interaction of Ume6 with the conserved histone deacetylase complex Sin3-Rpd3 and the catalytic activity of Rpd3 were indispensable for the observed suppression of the dna2 mutant. Moreover, multicopy suppression by Rim11/Ume6 requires the presence of sister-chromatid recombination mediated by Rad52/Rad59 proteins, but not vice versa. Interestingly, the overexpression of Rim11 or Ume6 also suppressed the MMS sensitivity of rad59. We also showed that the lethality of the dna2 helicase-dead mutant was attributed to checkpoint activation and that decreased levels of deoxynucleotide triphosphates (dNTPs) by overexpressing Sml1 (an inhibitor of ribonucleotide reductase) rescued the dna2 mutant. We also present evidence indicating that Rim11/Ume6 works independently of, but in parallel with, checkpoint inhibition, dNTP regulation, and sister-chromatid recombination. In conclusion, our results establish Rim11, Ume6, the histone deacetylase complex Sin3-Rpd3 and Sml1 as new factors important in the events of faulty lagging strand synthesis.

The transcription factor SKN-1 (Skinhead family member-1) in Caenorhabditis elegans is a homolog of the mammalian Nrf-2 protein and functions to promote oxidative stress resistance and longevity. SKN-1 mediates protection from reactive oxygen species (ROS) via the transcriptional activation of genes involved in antioxidant defense and phase II detoxification. Although many core regulators of SKN-1 have been identified, much remains unknown about this complex signaling pathway. We carried out an ethyl methanesulfonate (EMS) mutagenesis screen and isolated six independent mutants with attenuated SKN-1-dependent gene activation in response to acrylamide. All six were found to contain mutations in F46F11.6/xrep-4 (xenobiotics response pathways-4), which encodes an uncharacterized F-box protein. Loss of xrep-4 inhibits the skn-1-dependent expression of detoxification genes in response to prooxidants and decreases survival of oxidative stress, but does not shorten life span under standard culture conditions. XREP-4 interacts with the ubiquitin ligase component SKR-1 and the SKN-1 principal repressor WDR-23, and knockdown of xrep-4 increases nuclear localization of a WDR-23::GFP fusion protein. Furthermore, a missense mutation in the conserved XREP-4 F-box domain that reduces interaction with SKR-1 but not WDR-23 strongly attenuates SKN-1-dependent gene activation. These results are consistent with XREP-4 influencing the SKN-1 stress response by functioning as a bridge between WDR-23 and the ubiquitin ligase component SKR-1.

The mechanisms that govern pattern formation within the cell are poorly understood. Ciliates carry on their surface an elaborate pattern of cortical organelles that are arranged along the anteroposterior and circumferential axes by largely unknown mechanisms. Ciliates divide by tandem duplication: the cortex of the predivision cell is remodeled into two similarly sized and complete daughters. In the conditional cdaI-1 mutant of Tetrahymena thermophila, the division plane migrates from its initially correct equatorial position toward the cell’s anterior, resulting in unequal cell division, and defects in nuclear divisions and cytokinesis. We used comparative whole genome sequencing to identify the cause of cdaI-1 as a mutation in a Hippo/Mst kinase. CdaI is a cortical protein with a cell cycle-dependent, highly polarized localization. Early in cell division, CdaI marks the anterior half of the cell, and later concentrates at the posterior end of the emerging anterior daughter. Despite the strong association of CdaI with the new posterior cell end, the cdaI-1 mutation does not affect the patterning of the new posterior cortical organelles. We conclude that, in Tetrahymena, the Hippo pathway maintains an equatorial position of the fission zone, and, by this activity, specifies the relative dimensions of the anterior and posterior daughter cell.

Resident gut bacteria are constantly influencing the immune system, yet the role of the immune system in shaping microbiota composition during an organism’s life span has remained unclear. Experiments in mice have been inconclusive due to differences in husbandry schemes that led to conflicting results. We used Drosophila as a genetically tractable system with a simpler gut bacterial population structure, streamlined genetic backgrounds, and established cross schemes to address this issue. We found that, depending on their genetic background, young flies had microbiota of different diversities that converged with age to the same Acetobacteraceae-dominated pattern in healthy flies. This pattern was accelerated in immune-compromised flies with higher bacterial load and gut cell death. Nevertheless, immune-compromised flies resembled their genetic background, indicating that familial transmission was the main force regulating gut microbiota. In contrast, flies with a constitutively active immune system had microbiota readily distinguishable from their genetic background, with the introduction and establishment of previously undetectable bacterial families. This indicated the influence of immunity over familial transmission. Moreover, hyperactive immunity and increased enterocyte death resulted in the highest bacterial load observed, starting from early adulthood. Cohousing experiments showed that the microenvironment also played an important role in the structure of the microbiota: flies with constitutive immunity defined the gut microbiota of their cohabitants. Our data show that, in Drosophila, constitutively active immunity shapes the structure and density of gut microbiota.

Notch signaling is an evolutionarily conserved pathway that is found to be involved in a number of cellular events throughout development. The deployment of the Notch signaling pathway in numerous cellular contexts is possible due to its regulation at multiple levels. In an effort to identify the novel components integrated into the molecular circuitry affecting Notch signaling, we carried out a protein–protein interaction screen based on the identification of cellular protein complexes using co-immunoprecipitation followed by mass-spectrometry. We identified Hrp48, a heterogeneous nuclear ribonucleoprotein in Drosophila, as a novel interacting partner of Deltex (Dx), a cytoplasmic modulator of Notch signaling. Immunocytochemical analysis revealed that Dx and Hrp48 colocalize in cytoplasmic vesicles. The dx mutant also showed strong genetic interactions with hrp48 mutant alleles. The coexpression of Dx and Hrp48 resulted in the depletion of cytoplasmic Notch in larval wing imaginal discs and downregulation of Notch targets cut and wingless. Previously, it has been shown that Sex-lethal (Sxl), on binding with Notch mRNA, negatively regulates Notch signaling. The overexpression of Hrp48 was found to inhibit Sxl expression and consequently rescued Notch signaling activity. In the present study, we observed that Dx together with Hrp48 can regulate Notch signaling in an Sxl-independent manner. In addition, Dx and Hrp48 displayed a synergistic effect on caspase-mediated cell death. Our results suggest that Dx and Hrp48 together negatively regulate Notch signaling in Drosophila melanogaster.

Age-based inheritance of centrosomes in eukaryotic cells is associated with faithful chromosome distribution in asymmetric cell divisions. During Saccharomyces cerevisiae ascospore formation, such an inheritance mechanism targets the yeast centrosome equivalents, the spindle pole bodies (SPBs), at meiosis II onset. Decreased nutrient availability causes initiation of spore formation at only the younger SPBs and their associated genomes. This mechanism ensures encapsulation of nonsister genomes, which preserves genetic diversity and provides a fitness advantage at the population level. Here, using an enhanced system for sporulation-induced protein depletion, we demonstrate that the core mitotic exit network (MEN) is involved in age-based SPB selection. Moreover, efficient genome inheritance requires Dbf2/20-Mob1 during a late step in spore maturation. We provide evidence that the meiotic functions of the MEN are more complex than previously thought. In contrast to mitosis, completion of the meiotic divisions does not strictly rely on the MEN, whereas its activity is required at different time points during spore development. This is reminiscent of vegetative MEN functions in spindle polarity establishment, mitotic exit, and cytokinesis. In summary, our investigation contributes to the understanding of age-based SPB inheritance during sporulation of S. cerevisiae and provides general insights on network plasticity in the context of a specialized developmental program. Moreover, the improved development-specific system for inducing protein depletion will be useful in other biological contexts.

Oxidative damage contributes to human diseases of aging including diabetes, cancer, and cardiovascular disorders. Reactive oxygen species resulting from xenobiotic and endogenous metabolites are sensed by a poorly understood process, triggering a cascade of regulatory factors and leading to the activation of the transcription factor Nrf2 (Nuclear factor-erythroid-related factor 2, SKN-1 in Caenorhabditis elegans). Nrf2/SKN-1 activation promotes the induction of the phase II detoxification system that serves to limit oxidative stress. We have extended a previous C. elegans genetic approach to explore the mechanisms by which a phase II enzyme is induced by endogenous and exogenous oxidants. The xrep (xenobiotics response pathway) mutants were isolated as defective in their ability to properly regulate the induction of a glutathione S-transferase (GST) reporter. The xrep-1 gene was previously identified as wdr-23, which encodes a C. elegans homolog of the mammalian β-propeller repeat-containing protein WDR-23. Here, we identify and confirm the mutations in xrep-2, xrep-3, and xrep-4. The xrep-2 gene is alh-6, an ortholog of a human gene mutated in familial hyperprolinemia. The xrep-3 mutation is a gain-of-function allele of skn-1. The xrep-4 gene is F46F11.6, which encodes an F-box-containing protein. We demonstrate that xrep-4 alters the stability of WDR-23 (xrep-1), a key regulator of SKN-1 (xrep-3). Epistatic relationships among the xrep mutants and their interacting partners allow us to propose an ordered genetic pathway by which endogenous and exogenous stressors induce the phase II detoxification response.

Nutrients affect adult stem cells through complex mechanisms involving multiple organs. Adipocytes are highly sensitive to diet and have key metabolic roles, and obesity increases the risk for many cancers. How diet-regulated adipocyte metabolic pathways influence normal stem cell lineages, however, remains unclear. Drosophila melanogaster has highly conserved adipocyte metabolism and a well-characterized female germline stem cell (GSC) lineage response to diet. Here, we conducted an isobaric tags for relative and absolute quantification (iTRAQ) proteomic analysis to identify diet-regulated adipocyte metabolic pathways that control the female GSC lineage. On a rich (relative to poor) diet, adipocyte Hexokinase-C and metabolic enzymes involved in pyruvate/acetyl-CoA production are upregulated, promoting a shift of glucose metabolism toward macromolecule biosynthesis. Adipocyte-specific knockdown shows that these enzymes support early GSC progeny survival. Further, enzymes catalyzing fatty acid oxidation and phosphatidylethanolamine synthesis in adipocytes promote GSC maintenance, whereas lipid and iron transport from adipocytes controls vitellogenesis and GSC number, respectively. These results show a functional relationship between specific metabolic pathways in adipocytes and distinct processes in the GSC lineage, suggesting the adipocyte metabolism–stem cell link as an important area of investigation in other stem cell systems.

Elevated levels of human chitinase-like proteins (CLPs) are associated with numerous chronic inflammatory diseases and several cancers, often correlating with poor prognosis. Nevertheless, there is scant knowledge of their function. The CLPs normally mediate immune responses and wound healing and, when upregulated, they can promote disease progression by remodeling tissue, activating signaling cascades, stimulating proliferation and migration, and by regulating adhesion. We identified Imaginal disc growth factors (Idgfs), orthologs of human CLPs CHI3L1, CHI3L2, and OVGP1, in a proteomics analysis designed to discover factors that regulate tube morphogenesis in a Drosophila melanogaster model of tube formation. We implemented a novel approach that uses magnetic beads to isolate a small population of specialized ovarian cells, cells that nonautonomously regulate morphogenesis of epithelial tubes that form and secrete eggshell structures called dorsal appendages (DAs). Differential mass spectrometry analysis of these cells detected elevated levels of four of the six Idgf family members (Idgf1, Idgf2, Idgf4, and Idgf6) in flies mutant for bullwinkle (bwk), which encodes a transcription factor and is a known regulator of DA-tube morphogenesis. We show that, during oogenesis, dysregulation of Idgfs (either gain or loss of function) disrupts the formation of the DA tubes. Previous studies demonstrate roles for Drosophila Idgfs in innate immunity, wound healing, and cell proliferation and motility in cell culture. Here, we identify a novel role for Idgfs in both normal and aberrant tubulogenesis processes.

Drosophila dorsal closure is a morphogenetic movement that involves flanking epidermal cells, assembling actomyosin cables, and migrating dorsally over the underlying amnioserosa to seal at the dorsal midline. Echinoid (Ed)—a cell adhesion molecule of adherens junctions (AJs)—participates in several developmental processes. The disappearance of Ed from the amnioserosa is required to define the epidermal leading edge for actomyosin cable assembly and coordinated cell migration. However, the mechanism by which Ed is cleared from amnioserosa is unknown. Here, we show that Ed is cleared in amnioserosa by both transcriptional and post-translational mechanisms. First, Ed mRNA transcription was repressed in amnioserosa prior to the onset of dorsal closure. Second, the ubiquitin ligase Smurf downregulated pretranslated Ed by binding to the PPXY motif of Ed. During dorsal closure, Smurf colocalized with Ed at AJs, and Smurf overexpression prematurely degraded Ed in the amnioserosa. Conversely, Ed persisted in the amnioserosa of Smurf mutant embryos, which, in turn, affected actomyosin cable formation. Together, our results demonstrate that transcriptional repression of Ed followed by Smurf-mediated downregulation of pretranslated Ed in amnioserosa regulates the establishment of a taut leading edge during dorsal closure.

Many parthenogenetically reproducing animals produce offspring not clonally but through different mechanisms collectively referred to as automixis. Here, meiosis proceeds normally but is followed by a fusion of meiotic products that restores diploidy. This mechanism typically leads to a reduction in heterozygosity among the offspring compared to the mother. Following a derivation of the rate at which heterozygosity is lost at one and two loci, depending on the number of crossovers between loci and centromere, a number of models are developed to gain a better understanding of basic evolutionary processes in automictic populations. Analytical results are obtained for the expected neutral genetic variation, effective population size, mutation–selection balance, selection with overdominance, the spread of beneficial mutations, and selection on crossover rates. These results are complemented by numerical investigations elucidating how associative overdominance (two off-phase deleterious mutations at linked loci behaving like an overdominant locus) can in some cases maintain heterozygosity for prolonged times, and how clonal interference affects adaptation in automictic populations. These results suggest that although automictic populations are expected to suffer from the lack of gene shuffling with other individuals, they are nevertheless, in some respects, superior to both clonal and outbreeding sexual populations in the way they respond to beneficial and deleterious mutations. Implications for related genetic systems such as intratetrad mating, clonal reproduction, selfing, as well as different forms of mixed sexual and automictic reproduction are discussed.
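
As a rough illustration of why automixis erodes heterozygosity, the sketch below tracks the expected heterozygosity at a single locus over generations. It is not the derivation referred to in the abstract. It assumes a neutral locus, no mutation, and a fixed probability `s` of second-division segregation (a hypothetical parameter that grows with the number of crossovers between locus and centromere). Under those assumptions, a heterozygous mother produces a homozygous offspring with probability s/2 under central fusion and 1 - s under terminal fusion. The function names are illustrative only.

```python
def loss_rate(mode: str, s: float) -> float:
    """Probability that a heterozygous mother's offspring is homozygous.

    s is the probability of second-division segregation at the locus
    (assumed parameter; s = 0 means no effective crossover between
    the locus and the centromere).
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    if mode == "central":
        # Fusion of products descending from different first-division nuclei.
        return s / 2.0
    if mode == "terminal":
        # Fusion of the two sister products of one second division.
        return 1.0 - s
    raise ValueError(f"unknown fusion mode: {mode}")


def heterozygosity(h0: float, mode: str, s: float, generations: int) -> list[float]:
    """Expected heterozygosity in generations 0..generations, starting at h0."""
    gamma = loss_rate(mode, s)
    return [h0 * (1.0 - gamma) ** t for t in range(generations + 1)]


if __name__ == "__main__":
    for mode in ("central", "terminal"):
        trajectory = heterozygosity(1.0, mode, s=0.5, generations=5)
        print(mode, [round(h, 3) for h in trajectory])
```

With the same `s`, the two fusion modes can lose heterozygosity at very different rates, which is one reason the number of crossovers between locus and centromere matters in the models described above.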

The advent of next generation sequencing technologies has made whole-genome and whole-population sampling possible, even for eukaryotes with large genomes. With this development, experimental evolution studies can be designed to observe molecular evolution "in action" via evolve-and-resequence (E&R) experiments. Among other applications, E&R studies can be used to locate the genes and variants responsible for genetic adaptation. Most existing literature on time-series data analysis assumes large population size, accurate allele frequency estimates, or wide time spans. These assumptions do not hold in many E&R studies. In this article, we propose a method, composition of likelihoods for evolve-and-resequence experiments (Clear), to identify signatures of selection in small population E&R experiments. Clear takes whole-genome sequences of pools of individuals as input, and properly addresses heterogeneous ascertainment bias resulting from uneven coverage. Clear also provides unbiased estimates of model parameters, including population size, selection strength, and dominance, while being computationally efficient. Extensive simulations show that Clear achieves higher power in detecting and localizing selection over a wide range of parameters, and is robust to variation of coverage. We applied the Clear statistic to multiple E&R experiments, including data from a study of adaptation of Drosophila melanogaster to alternating temperatures and a study of outcrossing yeast populations, and identified multiple regions under selection with genome-wide significance.
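
To make the setting concrete, the sketch below simulates the kind of data an E&R experiment produces at one biallelic locus: a small Wright-Fisher population under selection, observed through pooled reads of limited depth. It is not the Clear method and does not reproduce its likelihood; it only illustrates the two noise sources named above, drift in a small population and imprecise allele frequency estimates from finite coverage. The function name, the parameter names, and the genotype fitness scheme (1 + s, 1 + h·s, 1) are assumptions made for illustration.

```python
import random


def simulate_er(p0, s, h, pop_size, generations, depth, seed=0):
    """Simulate one E&R replicate at a single biallelic locus.

    Returns (true_freqs, observed_freqs), each with one entry per
    generation 0..generations. All parameters are illustrative.
    """
    rng = random.Random(seed)
    n_chrom = 2 * pop_size
    p = p0
    true_freqs, observed_freqs = [], []
    for _ in range(generations + 1):
        true_freqs.append(p)
        # Pooled sequencing: each read carries the allele with probability p.
        reads = sum(rng.random() < p for _ in range(depth))
        observed_freqs.append(reads / depth)
        # Selection with genotype fitnesses 1 + s, 1 + h*s and 1.
        q = 1.0 - p
        mean_w = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
        p_sel = (p * p * (1 + s) + p * q * (1 + h * s)) / mean_w
        # Drift: binomial sampling of 2N chromosomes for the next generation.
        p = sum(rng.random() < p_sel for _ in range(n_chrom)) / n_chrom
    return true_freqs, observed_freqs


if __name__ == "__main__":
    true_f, obs_f = simulate_er(p0=0.1, s=0.1, h=0.5, pop_size=100,
                                generations=20, depth=30, seed=1)
    for t in (0, 10, 20):
        print(t, round(true_f[t], 3), round(obs_f[t], 3))
```

Lowering `pop_size` or `depth` in such a simulation widens the gap between true and observed trajectories, which is the regime the abstract says most time-series methods do not handle.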

Here, we develop and test a method to address whether DNA samples sequenced from a group of fossil hominin bone or tooth fragments originate from the same individual or from closely related individuals. Our method assumes low amounts of retrievable DNA, significant levels of sequencing error, and contamination from one or more present-day humans. We develop and implement a maximum likelihood method that estimates levels of contamination, sequencing error rates, and pairwise relatedness coefficients in a set of individuals. We assume that there is no reference panel for the ancient population to provide allele and haplotype frequencies. Our approach makes use of single nucleotide polymorphisms (SNPs) and does not make assumptions about the underlying demographic model. By artificially mating genomes from the 1000 Genomes Project, we determine the numbers of individuals at a given genomic coverage that are required to detect different levels of genetic relatedness with confidence.

Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference.
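
The abstract does not say which two-locus statistics the framework aggregates. As a minimal sketch of what such a statistic looks like, the code below computes two textbook measures of linkage disequilibrium for one pair of biallelic loci: the coefficient D = p_AB - p_A·p_B and the squared correlation r². The function name and the 0/1 haplotype coding are assumptions for illustration.

```python
def two_locus_stats(haplotypes):
    """Compute (D, r_squared) for one pair of biallelic loci.

    haplotypes: iterable of (a, b) pairs with alleles coded 0 or 1.
    r_squared is None when either locus is monomorphic.
    """
    haps = list(haplotypes)
    n = len(haps)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haps) / n          # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haps) / n          # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haps if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # linkage disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else None
    return d, r_squared


if __name__ == "__main__":
    linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
    independent = [(1, 1), (1, 0), (0, 1), (0, 0)]
    print(two_locus_stats(linked))
    print(two_locus_stats(independent))
```

A single-locus summary such as the AFS sees only `p_a` and `p_b`; the joint term `p_ab` is the extra information that pairs of loci contribute.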

Fisher’s geometric model was originally introduced to argue that complex adaptations must occur in small steps because of pleiotropic constraints. When supplemented with the assumption of additivity of mutational effects on phenotypic traits, it provides a simple mechanism for the emergence of genotypic epistasis from the nonlinear mapping of phenotypes to fitness. Of particular interest is the occurrence of reciprocal sign epistasis, which is a necessary condition for multipeaked genotypic fitness landscapes. Here we compute the probability that a pair of randomly chosen mutations interacts sign epistatically, which is found to decrease with increasing phenotypic dimension n, and varies nonmonotonically with the distance from the phenotypic optimum. We then derive expressions for the mean number of fitness maxima in genotypic landscapes comprised of all combinations of L random mutations. This number increases exponentially with L, and the corresponding growth rate is used as a measure of the complexity of the landscape. The dependence of the complexity on the model parameters is found to be surprisingly rich, and three distinct phases characterized by different landscape structures are identified. Our analysis shows that the phenotypic dimension, which is often referred to as phenotypic complexity, does not generally correlate with the complexity of fitness landscapes and that even organisms with a single phenotypic trait can have complex landscapes. Our results further inform the interpretation of experiments where the parameters of Fisher’s model have been inferred from data, and help to elucidate which features of empirical fitness landscapes can be described by this model.
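
The claim that a single phenotypic trait can already produce a multipeaked landscape can be checked with a small example. The sketch below builds the genotypic landscape for a set of mutations with additive phenotypic effects and tests for reciprocal sign epistasis and local fitness maxima. The Gaussian fitness function, the optimum at the origin, and all function names are assumptions chosen for illustration, not the paper's own formulation.

```python
import math
from itertools import product


def fitness(z):
    """Gaussian fitness around an optimum at the origin (assumed form)."""
    return math.exp(-0.5 * sum(x * x for x in z))


def landscape(wild_type, mutations):
    """Fitness of every genotype; a genotype is a tuple of 0/1 per mutation."""
    table = {}
    for g in product((0, 1), repeat=len(mutations)):
        z = list(wild_type)
        for present, effect in zip(g, mutations):
            if present:
                # Mutational effects add up at the phenotypic level.
                z = [zi + ei for zi, ei in zip(z, effect)]
        table[g] = fitness(z)
    return table


def local_maxima(table):
    """Genotypes that are fitter than all of their single-mutation neighbours."""
    peaks = []
    for g, w in table.items():
        neighbours = [g[:i] + (1 - g[i],) + g[i + 1:] for i in range(len(g))]
        if all(w > table[nb] for nb in neighbours):
            peaks.append(g)
    return peaks


def reciprocal_sign_epistasis(table):
    """For two mutations: does each one's effect change sign with the other?"""
    w00, w10, w01, w11 = (table[(0, 0)], table[(1, 0)],
                          table[(0, 1)], table[(1, 1)])
    first_flips = (w10 - w00) * (w11 - w01) < 0
    second_flips = (w01 - w00) * (w11 - w10) < 0
    return first_flips and second_flips


if __name__ == "__main__":
    # One trait, wild type one unit below the optimum, two mutations of +1.5.
    table = landscape(wild_type=(-1.0,), mutations=[(1.5,), (1.5,)])
    print(reciprocal_sign_epistasis(table))
    print(sorted(local_maxima(table)))
```

In this example each mutation alone moves the phenotype closer to the optimum, but together they overshoot it, so both single mutants are local maxima.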

Digital imagery can help to quantify seasonal changes in desirable crop phenotypes that can be treated as quantitative traits. Because limitations in precise and functional phenotyping restrain genetic improvement in the postgenomic era, imagery-based phenomics could become the next breakthrough to accelerate genetic gains in field crops. Whereas many phenomic studies focus on exploratory analysis of spectral data without obvious interpretative value, we used field images to directly measure soybean canopy development from phenological stage V2 to R5. Over 3 years, we collected imagery using ground and aerial platforms of a large and diverse nested association panel comprising 5555 lines. Genome-wide association analysis of canopy coverage across sampling dates detected a large quantitative trait locus (QTL) on soybean (Glycine max, L. Merr.) chromosome 19. This QTL provided an increase in yield of 47.3 kg ha–1. Variance component analysis indicated that a parameter, described as average canopy coverage, is a highly heritable trait (h2 = 0.77) with a promising genetic correlation with grain yield (0.87), enabling indirect selection of yield via canopy development parameters. Our findings indicate that fast canopy coverage is an early season trait that is inexpensive to measure and has great potential for application in breeding programs focused on yield improvement. We recommend using the average canopy coverage in multiple trait schemes, especially for the early stages of the breeding pipeline (including progeny rows and preliminary yield trials), in which the large number of field plots makes collection of grain yield data challenging.
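
Whether indirect selection on canopy coverage pays off can be gauged with the standard quantitative-genetics ratio of correlated response to direct response, CR_Y / R_Y = (i_X / i_Y) · r_g · sqrt(h²_X / h²_Y). The sketch below evaluates it with the heritability (0.77) and genetic correlation (0.87) reported above. The heritability of grain yield is not given in the abstract, so the value 0.4 used here is purely hypothetical, as are the function and parameter names.

```python
import math


def relative_efficiency(h2_indirect, r_g, h2_target, intensity_ratio=1.0):
    """Correlated response in the target trait relative to direct selection.

    Implements (i_X / i_Y) * r_g * sqrt(h2_X / h2_Y), where X is the
    indirectly selected trait and Y is the target trait.
    """
    if h2_target <= 0 or h2_indirect < 0:
        raise ValueError("heritabilities must be positive")
    return intensity_ratio * r_g * math.sqrt(h2_indirect / h2_target)


if __name__ == "__main__":
    # h2 = 0.77 and r_g = 0.87 come from the abstract; 0.4 is a made-up
    # yield heritability used only to show how the ratio behaves.
    ratio = relative_efficiency(h2_indirect=0.77, r_g=0.87, h2_target=0.4)
    print(round(ratio, 2))
```

A ratio above 1 would mean that selecting on canopy coverage is expected to improve yield faster than selecting on yield itself at equal selection intensity; the actual value depends on the yield heritability in the population at hand.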

How sex is determined in insects is diverse and dynamic, and includes male heterogamety, female heterogamety, and haplodiploidy. In many insect lineages, sex determination is either completely unknown or poorly studied. We studied sex determination in Psocodea—a species-rich order of insects that includes parasitic lice, barklice, and booklice. We focus on a recently discovered species of Liposcelis booklice (Psocodea: Troctomorpha), which are among the closest free-living relatives of parasitic lice. Using genetic, genomic, and immunohistochemical approaches, we show that this group exhibits paternal genome elimination (PGE), an unusual mode of sex determination that involves genomic imprinting. Controlled crosses, following a genetic marker over multiple generations, demonstrated that males only transmit to offspring genes they inherited from their mother. Immunofluorescence microscopy revealed densely packed chromocenters associated with H3K9me3—a conserved marker for heterochromatin—in males, but not in females, suggesting silencing of chromosomes in males. Genome assembly and comparison of read coverage in male and female libraries showed no evidence for differentiated sex chromosomes. We also found that females produce more sons early in life, consistent with facultative sex allocation. It is likely that PGE is widespread in Psocodea, including human lice. This order represents a promising model for studying this enigmatic mode of sex determination.

The genetic architecture of behavioral traits in dogs is of great interest to owners, breeders, and professionals involved in animal welfare, as well as to scientists studying the genetics of animal (including human) behavior. The genetic component of dog behavior is supported by between-breed differences and some evidence of within-breed variation. However, it is a challenge to gather sufficiently large datasets to dissect the genetic basis of complex traits such as behavior, which are both time-consuming and logistically difficult to measure, and known to be influenced by nongenetic factors. In this study, we exploited the knowledge that owners have of their dogs to generate a large dataset of personality traits in Labrador Retrievers. While accounting for key environmental factors, we demonstrate that genetic variance can be detected for dog personality traits assessed using questionnaire data. We identified substantial genetic variance for several traits, including fetching tendency and fear of loud noises, while other traits revealed negligibly small heritabilities. Genetic correlations were also estimated between traits; however, due to fairly large SEs, only a handful of trait pairs yielded statistically significant estimates. Genomic analyses indicated that these traits are mainly polygenic, such that individual genomic regions have small effects, and suggested chromosomal associations for six of the traits. The polygenic nature of these traits is consistent with previous behavioral genetics studies in other species, for example in mouse, and confirms that large datasets are required to quantify the genetic variance and to identify the individual genes that influence behavioral traits.

Genetic association studies in admixed populations are underrepresented in the genomics literature, with a key concern for researchers being the adequate control of spurious associations due to population structure. Linear mixed models (LMMs) are well suited for genome-wide association studies (GWAS) because they account for both population stratification and cryptic relatedness and achieve increased statistical power by jointly modeling all genotyped markers. Additionally, Bayesian LMMs allow for more flexible assumptions about the underlying distribution of genetic effects, and can concurrently estimate the proportion of phenotypic variance explained by genetic markers. Using three recently published Bayesian LMMs, Bayes R, BSLMM, and BOLT-LMM, we investigate an existing data set on eye (n = 625) and skin (n = 684) color from Cape Verde, an island nation off West Africa that is home to individuals with a broad range of phenotypic values for eye and skin color due to the mix of West African and European ancestry. We use simulations to demonstrate the utility of Bayesian LMMs for mapping loci and studying the genetic architecture of quantitative traits in admixed populations. The Bayesian LMMs provide evidence for two new pigmentation loci: one for eye color (AHRR) and one for skin color (DDB1).

Long-term genomic selection (GS) requires strategies that balance genetic gain with population diversity, to sustain progress for traits under selection, and to keep diversity for future breeding. In a simulation model for a recurrent selection scheme, we provide the first head-to-head comparison of two such existing strategies: genomic optimal contributions selection (GOCS), which limits realized genomic relationship among selection candidates, and weighted genomic selection (WGS), which upscales rare allele effects in GS. Compared to GS, both methods provide the same higher long-term genetic gain and a similar lower inbreeding rate, despite some inherent limitations. GOCS does not control the inbreeding rate component linked to trait selection, and, therefore, does not strike the optimal balance between genetic gain and inbreeding. This makes it less effective throughout the breeding scheme, and particularly so at the beginning, where genetic gain and diversity may not be competing. For WGS, truncation selection proved suboptimal to manage rare allele frequencies among the selection candidates. To overcome these limitations, we introduce two new set selection methods that maximize a weighted index balancing genetic gain with controlling expected heterozygosity (IND-HE) or maintaining rare alleles (IND-RA), and show that these outperform GOCS and WGS in a nearly identical way. While requiring further testing, we believe that the inherent benefits of the IND-HE and IND-RA methods will transfer from our simulation framework to many practical breeding settings, and are therefore a major step forward toward efficient long-term genomic selection.

Crescentic glomerulonephritis (Crgn) is a complex disorder where macrophage activity and infiltration are significant effector causes. In previous linkage studies using the uniquely susceptible Wistar Kyoto (WKY) rat strain, we have identified multiple crescentic glomerulonephritis QTL (Crgn) and positionally cloned genes underlying Crgn1 and Crgn2, which accounted for 40% of total variance in glomerular inflammation. Here, we have generated a backcross (BC) population (n = 166) where Crgn1 and Crgn2 were genetically fixed and found significant linkage to glomerular crescents on chromosome 2 (Crgn8, LOD = 3.8). Fine mapping analysis by integration with genome-wide expression QTLs (eQTLs) from the same BC population identified ceruloplasmin (Cp) as a positional eQTL in macrophages but not in serum. Liquid chromatography-tandem mass spectrometry confirmed Cp as a protein QTL in rat macrophages. WKY macrophages overexpress Cp and its downregulation by RNA interference decreases markers of glomerular proinflammatory macrophage activation. Similarly, short incubation with Cp results in a strain-dependent macrophage polarization in the rat. These results suggest that genetically determined Cp levels can alter susceptibility to Crgn through macrophage function and propose a new role for Cp in early macrophage activation.

Yeast flocculation is a community-building cell aggregation trait that is an important mechanism of stress resistance and a useful phenotype for brewers; however, it is also a nuisance in many industrial processes, in clinical settings, and in the laboratory. Chemostat-based evolution experiments are impaired by inadvertent selection for aggregation, which we observe in 35% of populations. These populations provide a testing ground for understanding the breadth of genetic mechanisms Saccharomyces cerevisiae uses to flocculate, and which of those mechanisms provide the biggest adaptive advantages. In this study, we employed experimental evolution as a tool to ask whether one or many routes to flocculation are favored, and to engineer a strain with reduced flocculation potential. Using a combination of whole genome sequencing and bulk segregant analysis, we identified causal mutations in 23 independent clones that had evolved cell aggregation during hundreds of generations of chemostat growth. In 12 of those clones, we identified a transposable element insertion in the promoter region of known flocculation gene FLO1, and, in an additional five clones, we recovered loss-of-function mutations in transcriptional repressor TUP1, which regulates FLO1 and other related genes. Other causal mutations were found in genes that have not been previously connected to flocculation. Evolving a flo1 deletion strain revealed that this single deletion reduces flocculation occurrences to 3%, and demonstrated the efficacy of using experimental evolution as a tool to identify and eliminate the primary adaptive routes for undesirable traits.



Genetic Diseases

When medical researchers want to investigate serious genetic diseases, they have to find ways to locate the corresponding risk genes. There are relatively few of these risk genes among the roughly 20,000 protein-coding genes in the human genome, so it obviously is not an easy task...



Gene: a hereditary unit that occupies a specific position (locus) on a chromosome, has one or more specific effects on the phenotype, and can mutate to various allelic forms.