Using gene map science to evaluate the genetic map and eliminate disease

Genetic News

The Thomas Hunt Morgan Medal recognizes lifetime contributions to the field of genetics. The 2020 recipient is David Botstein of Calico Labs and Princeton University, recognizing his multiple contributions to genetics, including the collaborative development of methods for defining genetic pathways, mapping genomes, and analyzing gene expression.

The Elizabeth W. Jones Award for Excellence in Education recognizes an individual who has had a significant impact on genetics education at any education level. Seth R. Bordenstein, Ph.D., Centennial Professor of Biological Sciences at Vanderbilt University and Founding Director of the Vanderbilt Microbiome Initiative, is the 2020 recipient in recognition of his cofounding, developing, and expanding Discover the Microbes Within! The Wolbachia Project.

The control of body and organ growth is essential for the development of adults with proper size and proportions, which is important for survival and reproduction. In animals, adult body size is determined by the rate and duration of juvenile growth, which are influenced by the environment. In nutrient-scarce environments in which more time is needed for growth, the juvenile growth period can be extended by delaying maturation, whereas juvenile development is rapidly completed in nutrient-rich conditions. This flexibility requires the integration of environmental cues with developmental signals that govern internal checkpoints to ensure that maturation does not begin until sufficient tissue growth has occurred to reach a proper adult size. The Target of Rapamycin (TOR) pathway is the primary cell-autonomous nutrient sensor, while circulating hormones such as steroids and insulin-like growth factors are the main systemic regulators of growth and maturation in animals. We discuss recent findings in Drosophila melanogaster showing that cell-autonomous environment and growth-sensing mechanisms, involving TOR and other growth-regulatory pathways, that converge on insulin and steroid relay centers are responsible for adjusting systemic growth, and development, in response to external and internal conditions. In addition to this, proper organ growth is also monitored and coordinated with whole-body growth and the timing of maturation through modulation of steroid signaling. This coordination involves interorgan communication mediated by Drosophila insulin-like peptide 8 in response to tissue growth status. Together, these multiple nutritional and developmental cues feed into neuroendocrine hubs controlling insulin and steroid signaling, serving as checkpoints at which developmental progression toward maturation can be delayed. 
This review focuses on these mechanisms by which external and internal conditions can modulate developmental growth and ensure proper adult body size, and highlights the conserved architecture of this system, which has made Drosophila a prime model for understanding the coordination of growth and maturation in animals.

Caenorhabditis elegans’ behavioral states, like those of other animals, are shaped by its immediate environment, its past experiences, and by internal factors. We here review the literature on C. elegans behavioral states and their regulation. We discuss dwelling and roaming, local and global search, mate finding, sleep, and the interaction between internal metabolic states and behavior.

Recent work by Kentaro Ohkuni and colleagues exemplifies how a series of molecular mechanisms contribute to a cellular outcome—equal distribution of chromosomes. Failure to maintain structural and numerical integrity of chromosomes is one contributing factor in genetic diseases such as cancer. Specifically, the authors investigated molecular events surrounding centromeric histone H3 variant Cse4 deposition—a process important for chromosome segregation, using Saccharomyces cerevisiae as a model organism. This study illustrates an example of a post-translational modification—sumoylation—regulating a cellular process and the concept of genetic interactions (e.g., synthetic dosage lethality). Furthermore, the study highlights the importance of using diverse experimental approaches in answering a few key research questions. The authors used molecular biology techniques (e.g., qPCR), biochemical experiments (e.g., Ni-NTA/8His protein purification), as well as genetic approaches to understand the regulation of Cse4. At a big-picture level, the study reveals how genetic changes can lead to subsequent molecular and cellular changes.

With the widespread use of single nucleotide variants generated through mutagenesis screens and genome editing technologies, there is pressing need for an efficient and low-cost strategy to genotype single nucleotide substitutions. We have developed a rapid and inexpensive method for detection of point mutants through optimization of SuperSelective (SS) primers for end-point PCR in Caenorhabditis elegans. Each SS primer consists of a 5' "anchor" that hybridizes to the template, followed by a noncomplementary "bridge," and a "foot" corresponding to the target allele. The foot sequence is short, such that a single mismatch at the terminal 3' nucleotide destabilizes primer binding and prevents extension, enabling discrimination of different alleles. We explored how length and sequence composition of each SS primer segment affected selectivity and efficiency in various genetic contexts in order to develop simple rules for primer design that allow for differentiation between alleles over a broad range of annealing temperatures. Manipulating bridge length affected amplification efficiency, while modifying the foot sequence altered discriminatory power. Changing the anchor position enabled SS primers to be used for genotyping in regions with sequences that are challenging for standard primer design. After defining primer design parameters, we demonstrated the utility of SS primers for genotyping crude C. elegans lysates, suggesting that this approach could also be used for SNP mapping and screening of CRISPR mutants. Further, since SS primers reliably detect point mutations, this method has potential for broad application in all genetic systems.
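The anchor, bridge, and foot layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `design_ss_primer`, the default segment lengths, and the rule for building the bridge (swapping each intervening base for its complement, with the bridge the same length as the region it skips) are hypothetical choices and are not taken from the study.

```python
# Hypothetical helper: names, default lengths, and the bridge rule are illustrative assumptions.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def design_ss_primer(sense, snp_index, allele,
                     anchor_len=20, bridge_len=10, foot_len=6):
    """Return (anchor, bridge, foot) for a forward SS primer, each written 5'->3'.

    sense:     sense-strand sequence around the variant (5'->3')
    snp_index: 0-based position of the variant base in `sense`
    allele:    base the primer should select; it becomes the 3' terminal nucleotide
    """
    foot_start = snp_index - foot_len + 1
    bridge_start = foot_start - bridge_len
    anchor_start = bridge_start - anchor_len
    if anchor_start < 0:
        raise ValueError("not enough upstream sequence for the requested lengths")

    # Anchor: copied from the template so that it hybridizes stably.
    anchor = sense[anchor_start:bridge_start]
    # Bridge: each intervening base is swapped for its complement, so no
    # position of the bridge matches the template underneath it.
    bridge = "".join(COMPLEMENT[b] for b in sense[bridge_start:foot_start])
    # Foot: short template match ending in the interrogated allele.
    foot = sense[foot_start:snp_index] + allele
    return anchor, bridge, foot


if __name__ == "__main__":
    template = "ACGT" * 10          # toy sequence; the variant is the last base
    anchor, bridge, foot = design_ss_primer(template, 39, "A")
    print(anchor + bridge + foot)
```

Designing one primer per allele (for example, calling the function once with the wild-type base and once with the mutant base) gives a pair whose feet differ only at the 3' terminal nucleotide, which is the position the abstract identifies as the source of allele discrimination.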

Sequence analysis frequently requires intuitive understanding and convenient representation of motifs. Typically, motifs are represented as position weight matrices (PWMs) and visualized using sequence logos. However, in many scenarios, in order to interpret the motif information or search for motif matches, it is both compact and sufficient to represent motifs by wildcard-style consensus sequences (such as [GC][AT]GATAAG[GAC]). Based on mutual information theory and Jensen-Shannon divergence, we propose a mathematical framework to minimize the information loss in converting PWMs to consensus sequences. We name this representation sequence Motto and have implemented an efficient algorithm with flexible options for converting motif PWMs into Motto from nucleotides, amino acids, and customized characters. We show that this representation provides a simple and efficient way to identify the binding sites of 1156 common transcription factors (TFs) in the human genome. The effectiveness of the method was benchmarked by comparing sequence matches found by Motto with PWM scanning results found by FIMO. On average, our method achieves a 0.81 area under the precision-recall curve, significantly (P-value < 0.01) outperforming all existing methods, including maximal positional weight, Cavener’s method, and minimal mean square error. We believe this representation provides a distilled summary of a motif, as well as the statistical justification.
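To make the idea of converting a PWM into a wildcard-style consensus concrete, here is a minimal Python sketch. It is not the Motto algorithm itself: it assumes one plausible reading of the abstract, in which each PWM column is replaced by the set of nucleotides whose uniform distribution has the smallest Jensen-Shannon divergence from that column. The function names and the toy PWM are hypothetical.

```python
from itertools import combinations
from math import log2

ALPHABET = "ACGT"


def js_divergence(p, q):
    """Jensen-Shannon divergence (in bits) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a * log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def best_subset(column):
    """Pick the nucleotide set whose uniform distribution is closest to the column."""
    best, best_score = None, float("inf")
    for size in range(1, len(ALPHABET) + 1):
        for idx in combinations(range(len(ALPHABET)), size):
            q = [1 / size if i in idx else 0.0 for i in range(len(ALPHABET))]
            score = js_divergence(column, q)
            if score < best_score:
                best, best_score = idx, score
    return "".join(ALPHABET[i] for i in best)


def pwm_to_consensus(pwm):
    """Convert a PWM (rows = positions, columns = A, C, G, T) to a wildcard string."""
    parts = []
    for column in pwm:
        letters = best_subset(column)
        parts.append(letters if len(letters) == 1 else f"[{letters}]")
    return "".join(parts)


if __name__ == "__main__":
    toy_pwm = [
        [0.9, 0.1, 0.0, 0.0],       # mostly A
        [0.5, 0.0, 0.0, 0.5],       # A or T
        [0.25, 0.25, 0.25, 0.25],   # no preference
    ]
    print(pwm_to_consensus(toy_pwm))  # A[AT][ACGT]
```

Exhaustive search over the 15 non-empty nucleotide subsets is cheap for DNA; for amino acids or custom alphabets a smarter search would be needed, which this sketch does not attempt.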

Meiosis is regulated in a sex-specific manner to produce two distinct gametes, sperm and oocytes, for sexual reproduction. To determine how meiotic recombination is regulated in spermatogenesis, we analyzed the meiotic phenotypes of mutants in the tumor suppressor E3 ubiquitin ligase BRC-1-BRD-1 complex in Caenorhabditis elegans male meiosis. Unlike in mammals, this complex is not required for meiotic sex chromosome inactivation, the process whereby hemizygous sex chromosomes are transcriptionally silenced. Interestingly, brc-1 and brd-1 mutants show meiotic recombination phenotypes that are largely opposing to those previously reported for female meiosis. Fewer meiotic recombination intermediates marked by the recombinase RAD-51 were observed in brc-1 and brd-1 mutants, and the reduction in RAD-51 foci could be suppressed by mutation of nonhomologous-end-joining proteins. Analysis of GFP::RPA-1 revealed fewer foci in the brc-1 brd-1 mutant and concentration of BRC-1-BRD-1 to sites of meiotic recombination was dependent on DNA end resection, suggesting that the complex regulates the processing of meiotic double-strand breaks to promote repair by homologous recombination. Further, BRC-1-BRD-1 is important to promote progeny viability when male meiosis is perturbed by mutations that block the pairing and synapsis of different chromosome pairs, although the complex is not required to stabilize the RAD-51 filament as in female meiosis under the same conditions. Analyses of crossover designation and formation revealed that BRC-1-BRD-1 inhibits supernumerary COs when meiosis is perturbed. Together, our findings suggest that BRC-1-BRD-1 regulates different aspects of meiotic recombination in male and female meiosis.

RecA is essential for double-strand-break repair (DSBR) and the SOS response in Escherichia coli K-12. RecN is an SOS protein and a member of the Structural Maintenance of Chromosomes family of proteins thought to play a role in sister chromatid cohesion/interactions during DSBR. Previous studies have shown that a plasmid-encoded recA4190 (Q300R) mutant had a phenotype similar to recN (mitomycin C sensitive and UV resistant). It was hypothesized that RecN and RecA physically interact, and that recA4190 specifically eliminated this interaction. To test this model, an epistasis analysis between recA4190 and recN was performed in wild-type and recBC sbcBC cells. To do this, recA4190 was first transferred to the chromosome. As single mutants, recA4190 and recN were Rec+ as measured by transductional recombination, but were 3-fold and 10-fold decreased in their ability to do I-SceI-induced DSBR, respectively. In both cases, the double mutant had an additive phenotype relative to either single mutant. In the recBC sbcBC background, recA4190 and recN cells were very UV sensitive (UVS), Rec-, had high basal levels of SOS expression and an altered distribution of RecA-GFP structures. In all cases, the double mutant had additive phenotypes. These data suggest that recA4190 (Q300R) and recN remove functions in genetically distinct pathways important for DNA repair, and that RecA Q300 was not important for an interaction between RecN and RecA in vivo. recA4190 (Q300R) revealed modest phenotypes in a wild-type background and dramatic phenotypes in a recBC sbcBC strain, reflecting greater stringency of RecA’s role in that background.

In meiosis, crossover (CO) formation between homologous chromosomes is essential for faithful segregation. However, misplaced meiotic recombination can have catastrophic consequences on genome stability. Within pericentromeres, COs are associated with meiotic chromosome missegregation. In organisms ranging from yeast to humans, pericentromeric COs are repressed. We previously identified a role for the kinetochore-associated Ctf19 complex (Ctf19c) in pericentromeric CO suppression. Here, we develop a dCas9/CRISPR-based system that allows ectopic targeting of Ctf19c-subunits. Using this approach, we query sufficiency in meiotic CO suppression, and identify Ctf19 as a mediator of kinetochore-associated CO control. The effect of Ctf19 is encoded in its NH2-terminal tail, and depends on residues important for the recruitment of the Scc2-Scc4 cohesin regulator. This work provides insight into kinetochore-derived control of meiotic recombination. We establish an experimental platform to investigate and manipulate meiotic CO control. This platform can easily be adapted in order to investigate other aspects of chromosome biology.

An unusual feature of the opportunistic pathogen Candida albicans is its ability to switch stochastically between two distinct, heritable cell types called white and opaque. Here, we show that only opaque cells, in response to environmental signals, massively upregulate a specific group of secreted proteases and peptide transporters, allowing exceptionally efficient use of proteins as sources of nitrogen. We identify the specific proteases [members of the secreted aspartyl protease (SAP) family] needed for opaque cells to proliferate under these conditions, and we identify four transcriptional regulators of this specialized proteolysis and uptake program. We also show that, in mixed cultures, opaque cells enable white cells to also proliferate efficiently when proteins are the sole nitrogen source. Based on these observations, we suggest that one role of white-opaque switching is to create mixed populations where the different phenotypes derived from a single genome are shared between two distinct cell types.

Active transport of organelles within axons is critical for neuronal health. Retrograde axonal transport, in particular, relays neurotrophic signals received by axon terminals to the nucleus and circulates new material among en passant synapses. A single motor protein complex, cytoplasmic dynein, is responsible for nearly all retrograde transport within axons: its linkage to and transport of diverse cargos is achieved by cargo-specific regulators. Here, we identify Vezatin as a conserved regulator of retrograde axonal transport. Vertebrate Vezatin (Vezt) is required for the maturation and maintenance of cell-cell junctions and has not previously been implicated in axonal transport. However, a related fungal protein, VezA, has been shown to regulate retrograde transport of endosomes in hyphae. In a forward genetic screen, we identified a loss-of-function mutation in the Drosophila vezatin-like (vezl) gene. We here show that vezl loss prevents a subset of endosomes, including signaling endosomes containing activated BMP receptors, from initiating transport out of motor neuron terminal boutons. vezl loss also decreases the transport of endosomes and dense core vesicles, but not mitochondria, within axon shafts. We disrupted vezt in zebrafish and found that vezt loss specifically impairs the retrograde axonal transport of late endosomes, causing their accumulation in axon terminals. Our work establishes a conserved, cargo-specific role for Vezatin proteins in retrograde axonal transport.

Temporally spaced genetic data allow for more accurate inference of population genetic parameters and hypothesis testing on the recent action of natural selection. In this work, we develop a novel likelihood-based method for jointly estimating selection coefficient and allele age from time series data of allele frequencies. Our approach is based on a hidden Markov model where the underlying process is a Wright-Fisher diffusion conditioned to survive until the time of the most recent sample. This formulation circumvents the assumption required in existing methods that the allele is created by mutation at a certain low frequency. We calculate the likelihood by numerically solving the resulting Kolmogorov backward equation backward in time while reweighting the solution with the emission probabilities of the observation at each sampling time point. This procedure reduces the two-dimensional numerical search for the maximum of the likelihood surface, for both the selection coefficient and the allele age, to a one-dimensional search over the selection coefficient only. We illustrate through extensive simulations that our method can produce accurate estimates of the selection coefficient and the allele age under both constant and nonconstant demographic histories. We apply our approach to reanalyze ancient DNA data associated with horse base coat colors. We find that ignoring demographic histories or grouping raw samples can significantly bias the inference results.
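The hidden Markov model structure described above can be illustrated with a much simpler stand-in. The sketch below swaps the conditioned Wright-Fisher diffusion and the Kolmogorov backward equation for a discrete haploid Wright-Fisher Markov chain and a plain forward algorithm, and it ignores allele age entirely by assuming a uniform prior on the allele count at the first sampling time. The function names, the population size, and the toy samples are all hypothetical; the code shows the general shape of the computation, not the authors' method.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def transition_matrix(pop_size, s):
    """One generation of haploid Wright-Fisher reproduction with selection s."""
    rows = []
    for i in range(pop_size + 1):
        p = i / pop_size
        p_sel = p * (1 + s) / (1 + s * p)   # frequency after selection
        rows.append([binom_pmf(j, pop_size, p_sel) for j in range(pop_size + 1)])
    return rows


def likelihood(samples, pop_size, s):
    """Forward-algorithm likelihood of time-series samples for a given s.

    samples: list of (generation, mutant_count, sample_size), sorted by generation.
    Assumption: uniform prior over the allele count at the first sampling time.
    """
    trans = transition_matrix(pop_size, s)
    states = range(pop_size + 1)
    alpha = [1.0 / (pop_size + 1)] * (pop_size + 1)
    previous_generation = samples[0][0]
    for generation, count, size in samples:
        # Propagate the hidden allele count forward to this sampling time.
        for _ in range(generation - previous_generation):
            alpha = [sum(alpha[i] * trans[i][j] for i in states) for j in states]
        # Reweight by the binomial emission probability of the observation.
        alpha = [a * binom_pmf(count, size, x / pop_size)
                 for a, x in zip(alpha, states)]
        previous_generation = generation
    return sum(alpha)


if __name__ == "__main__":
    toy_samples = [(0, 2, 20), (10, 10, 20), (20, 18, 20)]
    for s in (-0.2, 0.0, 0.2):
        print(s, likelihood(toy_samples, pop_size=50, s=s))
```

A one-dimensional search over the selection coefficient then amounts to evaluating `likelihood` on a grid of `s` values and keeping the largest, which echoes the reduction to a single search dimension that the abstract describes.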

Aspergillus fumigatus is a major human pathogen. In contrast, Aspergillus fischeri and the recently described Aspergillus oerlinghausenensis, the two species most closely related to A. fumigatus, are not known to be pathogenic. Some of the genetic determinants of virulence (or "cards of virulence") that A. fumigatus possesses are secondary metabolites that impair the host immune system, protect from host immune cell attacks, or acquire key nutrients. To examine whether secondary metabolism-associated cards of virulence vary between these species, we conducted extensive genomic and secondary metabolite profiling analyses of multiple A. fumigatus, one A. oerlinghausenensis, and multiple A. fischeri strains. We identified two cards of virulence (gliotoxin and fumitremorgin) shared by all three species and three cards of virulence (trypacidin, pseurotin, and fumagillin) that are variable. For example, we found that all species and strains examined biosynthesized gliotoxin, which is known to contribute to virulence, consistent with the conservation of the gliotoxin biosynthetic gene cluster (BGC) across genomes. For other secondary metabolites, such as fumitremorgin, a modulator of host biology, we found that all species produced the metabolite but that there was strain heterogeneity in its production within species. Finally, species differed in their biosynthesis of fumagillin and pseurotin, both contributors to host tissue damage during invasive aspergillosis. A. fumigatus biosynthesized fumagillin and pseurotin, while A. oerlinghausenensis biosynthesized fumagillin and A. fischeri biosynthesized neither. These biochemical differences were reflected in sequence divergence of the intertwined fumagillin/pseurotin BGCs across genomes. 
These results delineate the similarities and differences in secondary metabolism-associated cards of virulence between a major fungal pathogen and its nonpathogenic closest relatives, shedding light onto the genetic and phenotypic changes associated with the evolution of fungal pathogenicity.

It is increasingly evident that natural selection plays a prominent role in shaping patterns of diversity across the genome. The most commonly studied modes of natural selection are positive selection and negative selection, which refer to directional selection for and against derived mutations, respectively. Positive selection can result in hitchhiking events, in which a beneficial allele rapidly replaces all others in the population, creating a valley of diversity around the selected site along with characteristic skews in allele frequencies and linkage disequilibrium among linked neutral polymorphisms. Similarly, negative selection reduces variation not only at selected sites but also at linked sites, a phenomenon called background selection (BGS). Thus, discriminating between these two forces may be difficult, and one might expect efforts to detect hitchhiking to produce an excess of false positives in regions affected by BGS. Here, we examine the similarity between BGS and hitchhiking models via simulation. First, we show that BGS may somewhat resemble hitchhiking in simplistic scenarios in which a region constrained by negative selection is flanked by large stretches of unconstrained sites, echoing previous results. However, this scenario does not mirror the actual spatial arrangement of selected sites across the genome. By performing forward simulations under more realistic scenarios of BGS, modeling the locations of protein-coding and conserved noncoding DNA in real genomes, we show that the spatial patterns of variation produced by BGS rarely mimic those of hitchhiking events. Indeed, BGS is not substantially more likely than neutrality to produce false signatures of hitchhiking. This holds for simulations modeled after both humans and Drosophila, and for several different demographic histories. These results demonstrate that appropriately designed scans for hitchhiking need not consider BGS’s impact on false-positive rates. 
However, we do find evidence that BGS increases the false-negative rate for hitchhiking, an observation that demands further investigation.

Recent advances in DNA sequencing techniques have made it possible to monitor genomes in great detail over time. This improvement provides an opportunity for us to study natural selection based on time serial samples of genomes while accounting for genetic recombination effect and local linkage information. Such time series genomic data allow for more accurate estimation of population genetic parameters and hypothesis testing on the recent action of natural selection. In this work, we develop a novel Bayesian statistical framework for inferring natural selection at a pair of linked loci by capitalising on the temporal aspect of DNA data with the additional flexibility of modeling the sampled chromosomes that contain unknown alleles. Our approach is built on a hidden Markov model where the underlying process is a two-locus Wright-Fisher diffusion with selection, which enables us to explicitly model genetic recombination and local linkage. The posterior probability distribution for selection coefficients is computed by applying the particle marginal Metropolis-Hastings algorithm, which allows us to efficiently calculate the likelihood. We evaluate the performance of our Bayesian inference procedure through extensive simulations, showing that our approach can deliver accurate estimates of selection coefficients, and the addition of genetic recombination and local linkage brings about significant improvement in the inference of natural selection. We also illustrate the utility of our method on real data with an application to ancient DNA data associated with white spotting patterns in horses.

Tracing evolutionary processes that lead to fixation of genomic variation in wild bacterial populations is a prime challenge in molecular evolution. In particular, the relative contribution of horizontal gene transfer (HGT) vs. de novo mutations during adaptation to a new environment is poorly understood. To gain a better understanding of the dynamics of HGT and its effect on adaptation, we subjected several populations of competent Bacillus subtilis to a serial dilution evolution on a high-salt-containing medium, either with or without foreign DNA from diverse pre-adapted or naturally salt tolerant species. Following 504 generations of evolution, all populations improved growth yield on the medium. Sequencing of evolved populations revealed extensive acquisition of foreign DNA from close Bacillus donors but not from more remote donors. HGT occurred in bursts, whereby a single bacterial cell appears to have acquired dozens of fragments at once. In the largest burst, close to 2% of the genome has been replaced by HGT. Acquired segments tend to be clustered in integration hotspots. Other than HGT, genomes also acquired spontaneous mutations. Many of these mutations occurred within, and seem to alter, the sequence of flagellar proteins. Finally, we show that, while some HGT fragments could be neutral, others are adaptive and accelerate evolution.

Genetic drift is an important evolutionary force of strength inversely proportional to Ne, the effective population size. The impact of drift on genome diversity and evolution is known to vary among species, but quantifying this effect is a difficult task. Here we assess the magnitude of variation in drift power among species of animals via its effect on the mutation load – which implies also inferring the distribution of fitness effects of deleterious mutations. To this aim, we analyze the nonsynonymous (amino-acid changing) and synonymous (amino-acid conservative) allele frequency spectra in a large sample of metazoan species, with a focus on the primates vs. fruit flies contrast. We show that a Gamma model of the distribution of fitness effects is not suitable due to strong differences in estimated shape parameters among taxa, while adding a class of lethal mutations essentially solves the problem. Using the Gamma + lethal model and assuming that the mean deleterious effects of nonsynonymous mutations is shared among species, we estimate that the power of drift varies by a factor of at least 500 between large-Ne and small-Ne species of animals, i.e., an order of magnitude more than the among-species variation in genetic diversity. Our results are relevant to Lewontin’s paradox while further questioning the meaning of the Ne parameter in population genomics.

We investigate the evolutionary rescue of a microbial population in a gradually deteriorating environment, through a combination of analytical calculations and stochastic simulations. We consider a population destined for extinction in the absence of mutants, which can survive only if mutants sufficiently adapted to the new environment arise and fix. We show that mutants that appear later during the environment deterioration have a higher probability to fix. The rescue probability of the population increases with a sigmoidal shape when the product of the carrying capacity and of the mutation probability increases. Furthermore, we find that rescue becomes more likely for smaller population sizes and/or mutation probabilities if the environment degradation is slower, which illustrates the key impact of the rapidity of environment degradation on the fate of a population. We also show that our main conclusions are robust across various types of adaptive mutants, including specialist and generalist ones, as well as mutants modeling antimicrobial resistance evolution. We further express the average time of appearance of the mutants that do rescue the population and the average extinction time of those that do not. Our methods can be applied to other situations with continuously variable fitnesses and population sizes, and our analytical predictions are valid in the weak-to-moderate mutation regime.

Hybrid male sterility (HMS) contributes to reproductive isolation commonly observed among house mouse (Mus musculus) subspecies, both in the wild and in laboratory crosses. Incompatibilities involving specific Prdm9 alleles and certain Chromosome (Chr) X genotypes are known determinants of fertility and HMS, and previous work in the field has demonstrated that genetic background modifies these two major loci. We constructed hybrids that have identical genotypes at Prdm9 and identical X chromosomes, but differ widely across the rest of the genome. In each case, we crossed female PWK/PhJ mice representative of the M. m. musculus subspecies to males from a classical inbred strain representative of M. m. domesticus: 129S1/SvImJ, A/J, C57BL/6J, or DBA/2J. We detected three distinct trajectories of fertility among the hybrids using breeding experiments. The PWK129S1 males were always infertile. PWKDBA2 males were fertile, despite their genotypes at the major HMS loci. We also observed age-dependent changes in fertility parameters across multiple genetic backgrounds. The PWKB6 and PWKAJ males were always infertile before 12 weeks and after 35 weeks. However, some PWKB6 and PWKAJ males were transiently fertile between 12 and 35 weeks. This observation could resolve previous contradictory reports about the fertility of PWKB6. Taken together, these results point to multiple segregating HMS modifier alleles, some of which have age-related modes of action. The ultimate identification of these alleles and their age-related mechanisms will advance understanding both of the genetic architecture of HMS and of how reproductive barriers are maintained between house mouse subspecies.

Bread wheat (Triticum aestivum) is a major food crop and an important plant system for agricultural genetics research. However, due to the complexity and size of its allohexaploid genome, genomic resources are limited compared to other major crops. The IWGSC recently published a reference genome and associated annotation (IWGSC CS v1.0, Chinese Spring) that has been widely adopted and utilized by the wheat community. Although this reference assembly represents all three wheat subgenomes at chromosome-scale, it was derived from short reads, and thus is missing a substantial portion of the expected 16 Gbp of genomic sequence. We earlier published an independent wheat assembly (Triticum_aestivum_3.1, Chinese Spring) that came much closer in length to the expected genome size, although it was only a contig-level assembly lacking gene annotations. Here, we describe a reference-guided effort to scaffold those contigs into chromosome-length pseudomolecules, add in any missing sequence that was unique to the IWGSC CS v1.0 assembly, and annotate the resulting pseudomolecules with genes. Our updated assembly, Triticum_aestivum_4.0, contains 15.07 Gbp of nongap sequence anchored to chromosomes, which is 1.2 Gbp more than the previous reference assembly. It includes 108,639 genes unambiguously localized to chromosomes, including over 2000 genes that were previously unplaced. We also discovered >5700 additional gene copies, facilitating the accurate annotation of functional gene duplications including at the Ppd-B1 photoperiod response locus.



Genetic Benefits

The techniques developed for genetic mapping have had great impact on the life sciences, and particularly in medicine. But genetic mapping technologies also have useful applications in other fields...



Genetic map companies like 23andMe and DNA Consultants now offer complete readings of your DNA. Purchasing your own genetic map has become a popular way to find out what diseases you are prone to, what skills you're best suited for, and even your ethnic ancestry.