Using gene map science to evaluate the genetic map and eliminate disease

## Genetic News

Polycomb group (PcG) and Trithorax group (TrxG) genes encode important regulators of development and differentiation in metazoans. These two groups of genes were discovered in Drosophila by their opposing effects on homeotic gene (Hox) expression. PcG genes collectively behave as genetic repressors of Hox genes, while the TrxG genes are necessary for Hox gene expression or function. Biochemical studies showed that many PcG proteins are present in two protein complexes, Polycomb repressive complexes 1 and 2, which repress transcription via chromatin modifications. TrxG proteins activate transcription via a variety of mechanisms. Here we summarize the large body of genetic and biochemical experiments in Drosophila on these two important groups of genes.

Interactions between Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) RNAs and CRISPR-associated (Cas) proteins form an RNA-guided adaptive immune system in prokaryotes. The adaptive immune system utilizes segments of the genetic material of invasive foreign elements in the CRISPR locus. The loci are transcribed and processed to produce small CRISPR RNAs (crRNAs), with degradation of invading genetic material directed by a combination of complementarity between RNA and DNA and in some cases recognition of adjacent motifs called PAMs (Protospacer Adjacent Motifs). Here we describe a general, high-throughput procedure to test the efficacy of thousands of targets, applying this to the Escherichia coli type I-E Cascade (CRISPR-associated complex for antiviral defense) system. These studies were followed with reciprocal experiments in which the consequence of CRISPR activity was survival in the presence of a lytic phage. From the combined analysis of the Cascade system, we found that (i) type I-E Cascade PAM recognition is more expansive than previously reported, with at least 22 distinct PAMs, with many of the noncanonical PAMs having CRISPR-interference abilities similar to the canonical PAMs; (ii) PAM positioning appears precise, with no evidence for tolerance to PAM slippage in interference; and (iii) while increased guanine-cytosine (GC) content in the spacer is associated with higher CRISPR-interference efficiency, high GC content (>62.5%) decreases CRISPR-interference efficiency. Our findings provide a comprehensive functional profile of Cascade type I-E interference requirements and a method to assay spacer efficacy that can be applied to other CRISPR-Cas systems.
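The GC-content threshold quoted above is a simple fraction, and a minimal sketch of how it could be computed for a candidate spacer follows. The function names and the example sequence are hypothetical and are not taken from the study; the 32-nt length is assumed only for illustration, in which case 62.5% corresponds to 20 of 32 positions.

```python
HIGH_GC_THRESHOLD = 0.625  # the ">62.5%" cutoff quoted in the abstract


def gc_fraction(spacer: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = spacer.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


def is_high_gc(spacer: str) -> bool:
    """True when GC content is strictly above the quoted threshold."""
    return gc_fraction(spacer) > HIGH_GC_THRESHOLD


# Hypothetical 32-nt spacer, used only to show the call.
example_spacer = "ACGTGGCCATGCGCTAGCGGATCCGCGTATGC"
print(f"GC fraction: {gc_fraction(example_spacer):.3f}",
      "high" if is_high_gc(example_spacer) else "not high")
```

This only flags spacers by composition; it does not model PAM recognition or interference efficiency itself.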

We examined seizure-susceptibility in a Drosophila model of human epilepsy using optogenetic stimulation of ReaChR (red-activatable channelrhodopsin). Photostimulation of the seizure-sensitive mutant parabss1 causes behavioral paralysis that, in many respects, resembles paralysis caused by mechanical stimulation. Electrophysiology shows that photostimulation evokes abnormal seizure-like neuronal firing in parabss1 followed by a quiescent period resembling synaptic failure and apparently responsible for paralysis. The pattern of neuronal activity concludes with seizure-like activity just prior to recovery. We tentatively identify the mushroom body as one apparent locus of optogenetic seizure initiation. The α/β lobes may be primarily responsible for mushroom body seizure induction.

Mapping-by-sequencing has become a standard method to map and identify phenotype-causing mutations in model species. Here, we show that a fragmented draft assembly is sufficient to perform mapping-by-sequencing in nonmodel species. We generated a draft assembly and annotation of the genome of the free-living nematode Oscheius tipulae, a distant relative of the model Caenorhabditis elegans. We used this draft to identify the likely causative mutations at the O. tipulae cov-3 locus, which affect vulval development. The cov-3 locus encodes the O. tipulae ortholog of C. elegans mig-13, and we further show that Cel-mig-13 mutants also have an unsuspected vulval-development phenotype. In a virtuous circle, we were able to use the linkage information collected during mutant mapping to improve the genome assembly. These results showcase the promise of genome-enabled forward genetics in nonmodel species.

Site-specific recombinases are potent tools to regulate gene expression. In particular, the Cre (cyclization recombination) and FLP (flipase) enzymes are widely used to either activate or inactivate genes in a precise spatiotemporal manner. Both recombinases work efficiently in the popular model organism Caenorhabditis elegans, but their use in this nematode is still only sporadic. To increase the utility of the FLP system in C. elegans, we have generated a series of single-copy transgenic strains that stably express an optimized version of FLP in specific tissues or by heat induction. We show that recombination efficiencies reach 100% in several cell types, such as muscles, intestine, and serotonin-producing neurons. Moreover, we demonstrate that most promoters drive recombination exclusively in the expected tissues. As examples of the potentials of the FLP lines, we describe novel tools for induced cell ablation by expression of the PEEL-1 toxin and a versatile FLP-out cassette for generation of GFP-tagged conditional knockout alleles. Together with other recombinase-based reagents created by the C. elegans community, this toolkit increases the possibilities for detailed analyses of specific biological processes at developmental stages inside intact animals.

Many genetic association studies collect a wide range of complex traits. As these traits may be correlated and share a common genetic mechanism, joint analysis can be statistically more powerful and biologically more meaningful. However, most existing tests for multiple traits cannot be used for high-dimensional and possibly structured traits, such as network-structured transcriptomic pathway expressions. To overcome potential limitations, in this article we propose the dual kernel-based association test (DKAT) for testing the association between multiple traits and multiple genetic variants, both common and rare. In DKAT, two individual kernels are used to describe the phenotypic and genotypic similarity, respectively, between pairwise subjects. Using kernels allows for capturing structure while accommodating dimensionality. Then, the association between traits and genetic variants is summarized by a coefficient which measures the association between two kernel matrices. Finally, DKAT evaluates the hypothesis of nonassociation with an analytical P-value calculation without any computationally expensive resampling procedures. By collapsing information in both traits and genetic variants using kernels, the proposed DKAT is shown to have a correct type-I error rate and higher power than other existing methods in both simulation studies and application to a study of genetic regulation of pathway gene expressions.
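To make the idea of "a coefficient which measures the association between two kernel matrices" concrete, here is a minimal sketch of one such coefficient: the normalized trace inner product of two double-centered kernel matrices (often called the RV coefficient). Whether DKAT uses exactly this form is an assumption, since the abstract does not give the formula, and the analytical P-value step is not reproduced. All function names and the toy data are hypothetical.

```python
from math import sqrt


def linear_kernel(rows):
    """Pairwise inner products between subjects (one row per subject)."""
    return [[sum(a * b for a, b in zip(x, y)) for y in rows] for x in rows]


def double_center(K):
    """Subtract row and column means and add back the grand mean."""
    n = len(K)
    row_means = [sum(row) / n for row in K]
    col_means = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand = sum(row_means) / n
    return [[K[i][j] - row_means[i] - col_means[j] + grand
             for j in range(n)] for i in range(n)]


def frobenius_inner(A, B):
    """Sum of elementwise products; equals trace(A B) for symmetric matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def kernel_association(K, L):
    """Normalized association between two symmetric kernel matrices."""
    Kc, Lc = double_center(K), double_center(L)
    denom = sqrt(frobenius_inner(Kc, Kc) * frobenius_inner(Lc, Lc))
    if denom == 0:
        raise ValueError("a kernel has no variation after centering")
    return frobenius_inner(Kc, Lc) / denom


# Toy data for four subjects: one variant (allele counts) and one trait.
genotype_kernel = linear_kernel([[0], [1], [2], [1]])
trait_kernel = linear_kernel([[0.1], [1.2], [1.9], [0.8]])
print(round(kernel_association(genotype_kernel, trait_kernel), 3))
```

For positive semidefinite kernels the coefficient lies between 0 and 1. Any kernel that encodes structure among traits or variants could replace the linear kernel here, which is the flexibility the abstract attributes to the kernel formulation.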

A currently popular strategy (EMMAX) for genome-wide association (GWA) analysis infers association for the specific marker of interest by treating its effect as fixed while treating all other marker effects as classical Gaussian random effects. It may be more statistically coherent to specify all markers as sharing the same prior distribution, whether that distribution is Gaussian, heavy-tailed (BayesA), or has variable selection specifications based on a mixture of, say, two Gaussian distributions [stochastic search and variable selection (SSVS)]. Furthermore, all such GWA inference should be formally based on posterior probabilities or test statistics as we present here, rather than merely being based on point estimates. We compared these three broad categories of priors within a simulation study to investigate the effects of different degrees of skewness for quantitative trait loci (QTL) effects and numbers of QTL using 43,266 SNP marker genotypes from 922 Duroc–Pietrain F2-cross pigs. Genomic regions were based either on single SNP associations, on nonoverlapping windows of various fixed sizes (0.5–3 Mb), or on adaptively determined windows that cluster the genome into blocks based on linkage disequilibrium. We found that SSVS and BayesA lead to the best receiver operating curve properties in almost all cases. We also evaluated approximate maximum a posteriori (MAP) approaches to BayesA and SSVS as potential computationally feasible alternatives; however, MAP inferences were not promising, particularly due to their sensitivity to starting values. We determined that it is advantageous to use variable selection specifications based on adaptively constructed genomic window lengths for GWA studies.
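As an illustration of the simplest of the three region definitions mentioned above, nonoverlapping windows of fixed size, the following sketch groups SNP positions by chromosome and window index. It is not the authors' pipeline: the function name, the input layout, and the example coordinates are assumptions, and the adaptive, linkage-disequilibrium-based windows are not covered.

```python
from collections import defaultdict


def assign_fixed_windows(snps, window_bp):
    """Group (chromosome, position) pairs into nonoverlapping windows.

    A window is identified by (chromosome, index), where index is the
    base-pair position integer-divided by the window size.
    """
    if window_bp <= 0:
        raise ValueError("window size must be positive")
    windows = defaultdict(list)
    for chrom, pos in snps:
        windows[(chrom, pos // window_bp)].append(pos)
    return dict(windows)


# Hypothetical SNP coordinates, grouped into 0.5-Mb windows.
snps = [("1", 100_000), ("1", 499_999), ("1", 500_000), ("2", 100_000)]
for key, positions in sorted(assign_fixed_windows(snps, 500_000).items()):
    print(key, positions)
```

Inference for a window would then combine evidence over all SNPs assigned to it rather than testing each SNP alone.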

Protein modification by the small ubiquitin-like modifier (SUMO) plays important roles in genome maintenance. In Saccharomyces cerevisiae, proper regulation of sumoylation is known to be essential for viability in certain DNA repair mutants. Here, we find the opposite result; proper regulation of sumoylation is lethal in certain DNA repair mutants. Yeast cells lacking the repair factors TDP1 and WSS1 are synthetically lethal due to their redundant roles in removing Top1-DNA covalent complexes (Top1ccs). A screen for suppressors of tdp1 wss1 synthetic lethality isolated mutations in genes known to control global sumoylation levels including ULP1, ULP2, SIZ2, and SLX5. The results suggest that alternative pathways of repair become available when sumoylation levels are altered. Curiously, both suppressor mutations that were isolated in the Slx5 subunit of the SUMO-targeted Ub ligase created new lysine residues. These "slx5-K" mutations localize to a 398 amino acid domain that is completely free of lysine, and they result in the auto-ubiquitination and partial proteolysis of Slx5. The decrease in Slx5-K protein leads to the accumulation of high molecular weight SUMO conjugates, and the residual Ub ligase activity is needed to suppress inviability presumably by targeting polysumoylated Top1ccs. This "lysine desert" is found in the subset of large fungal Slx5 proteins, but not its smaller orthologs such as RNF4. The lysine desert solves a problem that Ub ligases encounter when evolving novel functional domains.
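A "lysine desert" in the sense used above is a long stretch of protein sequence with no lysine (K) residues. A minimal sketch of how such a stretch could be located in a one-letter amino acid sequence is given here; the function name and the toy sequence are hypothetical, and this is not the analysis used in the study.

```python
import re


def longest_lysine_free_stretch(protein: str):
    """Return (start, length) of the longest run without 'K'.

    start is 0-based; (0, 0) is returned when no such run exists.
    """
    best_start, best_len = 0, 0
    for match in re.finditer(r"[^K]+", protein.upper()):
        length = match.end() - match.start()
        if length > best_len:
            best_start, best_len = match.start(), length
    return best_start, best_len


# Toy sequence, for illustration only.
start, length = longest_lysine_free_stretch("MKAAAKGGGGGK")
print(f"longest lysine-free stretch starts at {start}, length {length}")
```

A mutation that creates a new lysine inside the desert would split the longest stretch, which is one simple way to see how the slx5-K alleles differ from the wild-type sequence.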

Ovarian function is directly correlated with survival of the primordial follicle reserve. Women diagnosed with cancer have a primary imperative of treating the cancer, but since the resting oocytes are hypersensitive to the DNA-damaging modalities of certain chemo- and radiotherapeutic regimens, such patients face the collateral outcome of premature loss of fertility and ovarian endocrine function. Current options for fertility preservation primarily include the collection and cryopreservation of oocytes or in vitro-fertilized oocytes, but this necessitates a delay in cancer treatment and additional assisted reproductive technology procedures. Here, we evaluated the potential of pharmacological preservation of ovarian function by inhibiting a key element of the oocyte DNA damage checkpoint response, checkpoint kinase 2 (CHK2; CHEK2). Whereas nonlethal doses of ionizing radiation (IR) eradicate immature oocytes in wild-type mice, irradiated Chk2–/– mice retain their oocytes and, thus, fertility. Using an ovarian culture system, we show that transient administration of the CHK2 inhibitor 2-(4-(4-chlorophenoxy)phenyl)-1H-benzimidazole-5-carboxamide-hydrate ("CHK2iII") blocked activation of the CHK2 targets TRP53 and TRP63 in response to sterilizing doses of IR, and preserved oocyte viability. After transfer into sterilized host females, these ovaries proved functional and readily yielded normal offspring. These results provide experimental evidence that chemical inhibition of CHK2 is a potentially effective treatment for preserving the fertility and ovarian endocrine function of women exposed to DNA-damaging cancer therapies such as IR.

The incorporation of the paternal genome into the zygote during fertilization requires chromatin remodeling. The maternal haploid (mh) mutation in Drosophila affects this process and leads to the formation of haploid embryos without the paternal genome. mh encodes the Drosophila homolog of SPRTN, a conserved protease essential for resolving DNA–protein cross-linked products. Here we characterize the role of MH in genome maintenance. It is not understood how MH protects the paternal genome during fertilization, particularly in light of our finding that MH is present in both parental pronuclei during zygote formation. We showed that maternal chromosomes in mh mutant embryos experience instabilities in the absence of the paternal genome, which suggests that MH is generally required for chromosome stability during embryogenesis. This is consistent with our finding that MH is abundantly present on chromatin throughout the cell cycle. Remarkably, MH is prominently enriched at the 359-bp satellite repeats during interphase, which become unstable without MH. This dynamic localization and specific enrichment of MH at the 359 repeats resemble those of Topoisomerase 2 (Top2), suggesting that MH regulates Top2, possibly as a protease for the resolution of Top2-DNA intermediates. We propose that maternal MH removes proteins specifically enriched on sperm chromatin. In the absence of that function, paternal chromosomes are precipitously lost. This mode of paternal chromatin remodeling is likely conserved and the unique phenotype of the Drosophila mh mutants represents a rare opportunity to gain insights into the process that has been difficult to study.

Recombination rate is a heritable quantitative trait that evolves despite the fundamentally conserved role that recombination plays in meiosis. Differences in recombination rate can alter the landscape of the genome and the genetic diversity of populations. Yet our understanding of the genetic basis of recombination rate evolution in nature remains limited. We used wild house mice (Mus musculus domesticus) from Gough Island (GI), which diverged recently from their mainland counterparts, to characterize the genetics of recombination rate evolution. We quantified genome-wide autosomal recombination rates by immunofluorescence cytology in spermatocytes from 240 F2 males generated from intercrosses between GI-derived mice and the wild-derived inbred strain WSB/EiJ. We identified four quantitative trait loci (QTL) responsible for inter-F2 variation in this trait, the strongest of which had effects that opposed the direction of the parental trait differences. Candidate genes and mutations for these QTL were identified by overlapping the detected intervals with whole-genome sequencing data and publicly available transcriptomic profiles from spermatocytes. Combined with existing studies, our findings suggest that genome-wide recombination rate divergence is not directional and its evolution within and between subspecies proceeds from distinct genetic loci.

Ionizing radiation (IR) is commonly used in cancer therapy and is a main source of DNA double-strand breaks (DSBs), one of the most toxic forms of DNA damage. We have used Caenorhabditis elegans as an invertebrate model to identify novel factors required for repair of DNA damage inflicted by IR. We have performed an unbiased genetic screen, finding that smg-1 mutations confer strong hyper-sensitivity to IR. SMG-1 is a phosphoinositide-3 kinase (PI3K) involved in mediating nonsense-mediated mRNA decay (NMD) of transcripts containing premature stop codons and related to the ATM and ATR kinases which are at the apex of DNA damage signaling pathways. Hyper-sensitivity to IR also occurs when other genes mediating NMD are mutated. The hyper-sensitivity to bleomycin, a drug known to induce DSBs, further supports that NMD pathway mutants are defective in DSB repair. Hyper-sensitivity was not observed upon treatment with alkylating agents or UV irradiation. We show that SMG-1 mainly acts in mitotically dividing germ cells, and during late embryonic and larval development. Based on epistasis experiments, SMG-1 does not appear to act in any of the three major pathways known to mend DNA DSBs, namely homologous recombination (HR), nonhomologous end-joining (NHEJ), and microhomology-mediated end-joining (MMEJ). We speculate that SMG-1 kinase activity could be activated following DNA damage to phosphorylate specific DNA repair proteins and/or that NMD inactivation may lead to aberrant mRNAs leading to synthesis of malfunctioning DNA repair proteins.

The genetic code converts information from nucleic acid into protein. The genetic code was thought to be immutable, yet many examples in nature indicate that variations to the code provide a selective advantage. We used a sensitive selection system involving suppression of a deleterious allele (tti2-L187P) in Saccharomyces cerevisiae to detect mistranslation and identify mechanisms that allow genetic code evolution. Though tRNASer containing a proline anticodon (UGG) is toxic, using our selection system we identified four tRNASer UGG variants, each with a single mutation, that mistranslate at a tolerable level. Mistranslating tRNALeu UGG variants were also obtained, demonstrating the generality of the approach. We characterized two of the tRNASer UGG variants. One contained a G26A mutation, which reduced cell growth to 70% of the wild-type rate, induced a heat shock response, and was lost in the absence of selection. The reduced toxicity of tRNASer UGG-G26A is likely through increased turnover of the tRNA, as lack of methylation at G26 leads to degradation via the rapid tRNA decay pathway. The second tRNASer UGG variant, with a G9A mutation, had minimal effect on cell growth, was relatively stable in cells, and gave rise to less of a heat shock response. In vitro, the G9A mutation decreases aminoacylation and affects folding of the tRNA. Notably, the G26A and G9A mutations were phenotypically neutral in the context of an otherwise wild-type tRNASer. These experiments reveal a model for genetic code evolution in which tRNA anticodon mutations and mistranslation evolve through phenotypically ambivalent intermediates that reduce tRNA function.
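The reason a serine tRNA carrying a UGG anticodon mistranslates can be seen from base pairing: the anticodon, read 5' to 3', pairs with the reverse complement codon. The sketch below applies strict Watson-Crick pairing only and ignores wobble pairing and base modifications, which is a simplifying assumption; the function names are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
PROLINE_CODONS = {"CCU", "CCC", "CCA", "CCG"}


def cognate_codon(anticodon: str) -> str:
    """Return the codon (5'->3') that pairs with an anticodon (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))


def reads_proline_codon(anticodon: str) -> bool:
    """True when the anticodon's Watson-Crick partner is a proline codon."""
    return cognate_codon(anticodon) in PROLINE_CODONS


print(cognate_codon("UGG"))  # CCA, a proline codon
print(cognate_codon("UGA"))  # UCA, a serine codon
```

A tRNA charged with serine but carrying the UGG anticodon would therefore insert serine at CCA proline codons, which is the mistranslation the selection system detects.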

Nonsense-mediated RNA decay (NMD) is a crucial post-transcriptional regulatory mechanism that recognizes and eliminates aberrantly processed transcripts, and mediates the expression of normal gene transcripts. In this study, we report that in the filamentous fungus Neurospora crassa, the NMD factors play a conserved role in regulating the surveillance of NMD targets including premature termination codon (PTC)-containing transcripts and normal transcripts. The circadian rhythms in all of the knockout strains of upf1-3 genes, which encode the Up-frameshift proteins, were aberrant. The upf1 knockout strain displays a shortened circadian period, which can be restored by constantly expressing exogenous Up-frameshift protein 1 (UPF1). UPF1 regulates the circadian clock by modulating the splicing of the core clock gene frequency (frq) through spliceosome and spliceosome-related arginine/serine-rich splicing factors, which partly account for the short periods in the upf1 knockout strain. We also demonstrated that the clock genes including White Collar (WC)-1, WC-2, and FRQ are involved in controlling the diurnal growth rhythm, and UPF1 may affect the growth rhythms by mediating the FRQ protein levels in the daytime. These findings suggest that the NMD factors play important roles in regulating the circadian clock and diurnal growth rhythms in Neurospora.

Previously expressed inducible genes can remain poised for faster reactivation for multiple cell divisions, a conserved phenomenon called epigenetic transcriptional memory. The GAL genes in Saccharomyces cerevisiae show faster reactivation for up to seven generations after being repressed. During memory, previously produced Gal1 protein enhances the rate of reactivation of GAL1, GAL10, GAL2, and GAL7. These genes also interact with the nuclear pore complex (NPC) and localize to the nuclear periphery both when active and during memory. Peripheral localization of GAL1 during memory requires the Gal1 protein, a memory-specific cis-acting element in the promoter, and the NPC protein Nup100. However, unlike other examples of transcriptional memory, the interaction with NPC is not required for faster GAL gene reactivation. Rather, downstream of Gal1, the Tup1 transcription factor and growth in glucose promote GAL transcriptional memory. Cells only show signs of memory and only benefit from memory when growing in glucose. Tup1 promotes memory-specific chromatin changes at the GAL1 promoter: incorporation of histone variant H2A.Z and dimethylation of histone H3, lysine 4. Tup1 and H2A.Z function downstream of Gal1 to promote binding of a preinitiation form of RNA Polymerase II at the GAL1 promoter, poising the gene for faster reactivation. This mechanism allows cells to integrate a previous experience (growth in galactose, reflected by Gal1 levels) with current conditions (growth in glucose, potentially through Tup1 function) to overcome repression and to poise critical GAL genes for future reactivation.

A key aspect of germ cell development is to establish germline sexual identity and initiate a sex-specific developmental program to promote spermatogenesis or oogenesis. Previously, we have identified the histone reader Plant Homeodomain Finger 7 (PHF7) as an important regulator of male germline identity. To understand how PHF7 directs sexual differentiation of the male germline, we investigated the downstream targets of PHF7 by combining transcriptome analyses, which reveal genes regulated by Phf7, with genomic profiling of histone H3K4me2, the chromatin mark that is bound by PHF7. Through these genomic experiments, we identify a novel spermatocyte factor Receptor Accessory Protein Like 1 (REEPL1) that can promote spermatogenesis and whose expression is kept off by PHF7 in the spermatogonial stage. Loss of Reepl1 significantly rescues the spermatogenesis defects in Phf7 mutants, indicating that regulation of Reepl1 is an essential aspect of PHF7 function. Further, increasing REEPL1 expression facilitates spermatogenic differentiation. These results indicate that PHF7 controls spermatogenesis by regulating the expression patterns of important male germline genes.

Heparan sulfates (HS) are linear polysaccharides with complex modification patterns, which are covalently bound via conserved attachment sites to core proteins to form heparan sulfate proteoglycans (HSPGs). HSPGs regulate many aspects of the development and function of the nervous system, including cell migration, morphology, and network connectivity. HSPGs function as cofactors for multiple signaling pathways, including the Wnt-signaling molecules and their Frizzled receptors. To investigate the functional interactions among the HSPG and Wnt networks, we conducted genetic analyses of each, and also between these networks using five cellular migrations in the nematode Caenorhabditis elegans. We find that HSPG core proteins act genetically in a combinatorial fashion dependent on the cellular contexts. Double mutant analyses reveal distinct redundancies among HSPGs for different migration events, and different cellular migrations require distinct heparan sulfate modification patterns. Our studies reveal that the transmembrane HSPG SDN-1/Syndecan functions within the migrating cell to promote cellular migrations, while the GPI-linked LON-2/Glypican functions cell nonautonomously to establish the final cellular position. Genetic analyses with the Wnt-signaling system show that (1) a given HSPG can act with different Wnts and Frizzled receptors, and that (2) a given Wnt/Frizzled pair acts with different HSPGs in a context-dependent manner. Lastly, we find that distinct HSPG and Wnt/Frizzled combinations serve separate functions to promote cellular migration and establish position of specific neurons. Our studies suggest that HSPGs use structurally diverse glycans in coordination with Wnt-signaling pathways to control multiple cellular behaviors, including cellular and axonal migrations, and cellular positioning.

Human psychiatric disorders such as schizophrenia, bipolar disorder, and attention-deficit/hyperactivity disorder often include adverse behaviors including increased aggressiveness. Individuals with psychiatric disorders often exhibit social withdrawal, which can further increase the probability of conducting a violent act. Here, we used the inbred, sequenced lines of the Drosophila Genetic Reference Panel (DGRP) to investigate the genetic basis of variation in male aggressive behavior for flies reared in a socialized and socially isolated environment. We identified genetic variation for aggressive behavior, as well as significant genotype-by-social environmental interaction (GSEI); i.e., variation among DGRP genotypes in the degree to which social isolation affected aggression. We performed genome-wide association (GWA) analyses to identify genetic variants associated with aggression within each environment. We used genomic prediction to partition genetic variants into gene ontology (GO) terms and constituent genes, and identified GO terms and genes with high prediction accuracies in both social environments and for GSEI. The top predictive GO terms significantly increased the proportion of variance explained, compared to prediction models based on all segregating variants. We performed genomic prediction across environments, and identified genes in common between the social environments that turned out to be enriched for genome-wide associated variants. A large proportion of the associated genes have previously been associated with aggressive behavior in Drosophila and mice. Further, many of these genes have human orthologs that have been associated with neurological disorders, indicating partially shared genetic mechanisms underlying aggression in animal models and human psychiatric disorders.

Acute onset of organ failure in heatstroke is triggered by rhabdomyolysis of skeletal muscle. Here, we showed that elevated temperature increases free cytosolic Ca2+ ([Ca2+]f) from RYR (ryanodine receptor)/UNC-68 in vivo in the muscles of an experimental model animal, the nematode Caenorhabditis elegans. This subsequently leads to mitochondrial fragmentation and dysfunction, and breakdown of myofilaments similar to rhabdomyolysis. In addition, treatment with an inhibitor of RYR (dantrolene) or activation of FoxO (Forkhead box O)/DAF-16 is effective against heat-induced muscle damage. To gain insight into heat-induced muscle breakdown, we investigated alterations of Ca2+ homeostasis and mitochondrial morphology in vivo in body-wall muscles of C. elegans exposed to elevated temperature. Heat stress for 3 hr at 35° increased the concentration of [Ca2+]f, and led to mitochondrial fragmentation and subsequent dysfunction in the muscle cells. A similar mitochondrial fragmentation phenotype is induced in the absence of heat stress by treatment with a calcium ionophore, ionomycin. Mutation of the unc-68 gene, which encodes the ryanodine receptor that is linked to Ca2+ release from the sarcoplasmic reticulum, could suppress the mitochondrial dysfunction, muscle degeneration, and reduced mobility and life span induced by heat stress. In addition, in a daf-2 mutant, in which the DAF-16/FoxO transcription factor is activated, resistance to calcium overload, mitochondrial fragmentation, and dysfunction was observed. These findings reveal that heat-induced Ca2+ accumulation causes mitochondrial damage and consequently induces muscle breakdown.

One of the most fascinating scientific problems, and a subject of intense debate, is that of the mechanisms of biological evolution. In this context, Waddington elaborated the concepts of "canalization and assimilation" to explain how an apparently somatic variant induced by stress could become heritable through the germline in Drosophila. He resolved this seemingly Lamarckian phenomenon by positing the existence of cryptic mutations that can be expressed and selected under stress. To investigate the relevance of such mechanisms, we performed experiments following the Waddington procedure, then isolated and fixed three phenotypic variants along with another induced mutation that was not preceded by any phenocopy. All the fixed mutations we looked at were actually generated de novo by DNA deletions or transposon insertions, highlighting a novel mechanism for the assimilation process. Our study shows that heat-shock stress produces both phenotypic variants and germline mutations, and suggests an alternative explanation to that of Waddington for the apparent assimilation of an acquired character. The selection of the variants, under stress, for a number of generations allows for the coselection of newly induced corresponding germline mutations, making the phenotypic variants appear heritable.

An extended meiotic prophase is a hallmark of oogenesis. Hormonal signaling activates the CDK1/cyclin B kinase to promote oocyte meiotic maturation, which involves nuclear and cytoplasmic events. Nuclear maturation encompasses nuclear envelope breakdown, meiotic spindle assembly, and chromosome segregation. Cytoplasmic maturation involves major changes in oocyte protein translation and cytoplasmic organelles and is poorly understood. In the nematode Caenorhabditis elegans, sperm release the major sperm protein (MSP) hormone to promote oocyte growth and meiotic maturation. Large translational regulatory ribonucleoprotein (RNP) complexes containing the RNA-binding proteins OMA-1, OMA-2, and LIN-41 regulate meiotic maturation downstream of MSP signaling. To understand the control of translation during meiotic maturation, we purified LIN-41-containing RNPs and characterized their protein and RNA components. Protein constituents of LIN-41 RNPs include essential RNA-binding proteins, the GLD-2 cytoplasmic poly(A) polymerase, the CCR4-NOT deadenylase complex, and translation initiation factors. RNA sequencing defined messenger RNAs (mRNAs) associated with both LIN-41 and OMA-1, as well as sets of mRNAs associated with either LIN-41 or OMA-1. Genetic and genomic evidence suggests that GLD-2, which is a component of LIN-41 RNPs, stimulates the efficient translation of many LIN-41-associated transcripts. We analyzed the translational regulation of two transcripts specifically associated with LIN-41 which encode the RNA regulators SPN-4 and MEG-1. We found that LIN-41 represses translation of spn-4 and meg-1, whereas OMA-1 and OMA-2 promote their expression. Upon their synthesis, SPN-4 and MEG-1 assemble into LIN-41 RNPs prior to their functions in the embryo. This study defines a translational repression-to-activation switch as a key element of cytoplasmic maturation.

The micronutrient boron is essential in maintaining the structure of plant cell walls and is critical for high yields in crop species. Boron can move into plants by diffusion or by active and facilitated transport mechanisms. We recently showed that mutations in the maize boron efflux transporter ROTTEN EAR (RTE) cause severe developmental defects and sterility. RTE is part of a small gene family containing five additional members (RTE2–RTE6) that show tissue-specific expression. The close paralogous gene RTE2 encodes a protein with 95% amino acid identity with RTE and is similarly expressed in shoot and root cells surrounding the vasculature. Despite sharing a similar function with RTE, mutations in the RTE2 gene do not cause growth defects in the shoot, even in boron-deficient conditions. However, rte2 mutants strongly enhance the rte phenotype in soils with low boron content, producing shorter plants that fail to form all reproductive structures. The joint action of RTE and RTE2 is also required in root development. These defects can be fully complemented by supplying boric acid, suggesting that diffusion or additional transport mechanisms overcome active boron transport deficiencies in the presence of an excess of boron. Overall, these results suggest that RTE2 and RTE function are essential for maize shoot and root growth in boron-deficient conditions.

Hedgehog (Hh) regulates the Cubitus interruptus (Ci) transcription factor in Drosophila melanogaster by activating full-length Ci-155 and blocking processing to the Ci-75 repressor. However, the interplay between the regulation of Ci-155 levels and activity, as well as processing-independent mechanisms that affect Ci-155 levels, have not been explored extensively. Here, we identified Mago Nashi (Mago) and Y14 core Exon Junction Complex (EJC) proteins, as well as the Srp54 splicing factor, as modifiers of Hh pathway activity under sensitized conditions. Mago inhibition reduced Hh pathway activity by altering the splicing pattern of ci to reduce Ci-155 levels. Srp54 inhibition also affected pathway activity by reducing ci RNA levels but additionally altered Ci-155 levels and activity independently of ci splicing. Further tests using ci transgenes and ci mutations confirmed evidence from studying the effects of Mago and Srp54 that relatively small changes in the level of Ci-155 primary translation product alter Hh pathway activity under a variety of sensitized conditions. We additionally used ci transgenes lacking intron sequences or the presumed translation initiation codon for an alternatively spliced ci RNA to provide further evidence that Mago acts principally by modulating the levels of the major ci RNA encoding Ci-155, and to show that ci introns are necessary to support the production of sufficient Ci-155 for robust Hh signaling and may also be important mediators of regulatory inputs.

Snail-like transcription factors affect stem cell function through mechanisms that are incompletely understood. In the Caenorhabditis elegans neurosecretory motor neuron (NSM) neuroblast lineage, CES-1 Snail coordinates cell cycle progression and cell polarity to ensure the asymmetric division of the NSM neuroblast and the generation of two daughter cells of different sizes and fates. We have previously shown that CES-1 Snail controls cell cycle progression by repressing the expression of cdc-25.2 CDC25. However, the mechanism through which CES-1 Snail affects cell polarity has been elusive. Here, we systematically searched for direct targets of CES-1 Snail by genome-wide profiling of CES-1 Snail binding sites and identified >3000 potential CES-1 Snail target genes, including pig-1, the ortholog of the oncogene maternal embryonic leucine zipper kinase (MELK). Furthermore, we show that CES-1 Snail represses pig-1 MELK transcription in the NSM neuroblast lineage and that pig-1 MELK acts downstream of ces-1 Snail to cause the NSM neuroblast to divide asymmetrically by size and along the correct cell division axis. Based on our results we propose that by regulating the expression of the MELK gene, Snail-like transcription factors affect the ability of stem cells to divide asymmetrically and, hence, to self-renew. Furthermore, we speculate that the deregulation of MELK contributes to tumorigenesis by causing cells that normally divide asymmetrically to divide symmetrically instead.

Many population genetic activities, ranging from evolutionary studies to association mapping, to forensic identification, rely on appropriate estimates of population structure or relatedness. All applications require recognition that quantities with an underlying meaning of allelic dependence are not defined in an absolute sense, but instead are made "relative to" some set of alleles other than the target set. The 1984 Weir and Cockerham $F_{\mathrm{ST}}$ estimate made explicit that the reference set of alleles was across populations, whereas standard kinship estimates do not make the reference explicit. Weir and Cockerham stated that their $F_{\mathrm{ST}}$ estimates were for independent populations, and standard kinship estimates have an implicit assumption that pairs of individuals in a study sample, other than the target pair, are unrelated or are not inbred. However, populations lose independence when there is migration between them, and dependencies between pairs of individuals in a population exist for more than one target pair. We have therefore recast our treatments of population structure, relatedness, and inbreeding to make explicit that the parameters of interest involve the differences in degrees of allelic dependence between the target and the reference sets of alleles, and so can be negative. We take the reference set to be the population from which study individuals have been sampled. We provide simple moment estimates of these parameters, phrased in terms of allelic matching within and between individuals for relatedness and inbreeding, or within and between populations for population structure. A multi-level hierarchy of alleles within individuals, alleles between individuals within populations, and alleles between populations, allows a unified treatment of relatedness and population structure. We expect our new measures to have a wide range of applications, but we note that their estimates are sensitive to rare or private variants: some population-characterization applications suggest exploiting those sensitivities, whereas estimation of relatedness may best use all genetic markers without filtering on minor allele frequency.
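The idea of measuring allelic dependence "relative to" a reference set can be illustrated with a small sketch. The abstract gives no formula, so the estimator below is an assumption: it contrasts allelic matching within each population with the average matching between pairs of populations, and it ignores finite-sample corrections. The function names `matching` and `beta_estimates` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def matching(p, q):
    """Probability that an allele drawn from frequencies p matches one drawn from q."""
    return sum(a * b for a, b in zip(p, q))


def beta_estimates(freqs):
    """Population-specific structure estimates at one locus (illustrative only).

    freqs: list of per-population allele-frequency lists (each sums to 1).
    Assumed form: (within-population matching - mean between-population
    matching) / (1 - mean between-population matching).
    Undefined (division by zero) if every population is fixed for the same allele.
    """
    within = [matching(p, p) for p in freqs]
    pairs = list(combinations(freqs, 2))
    between = sum(matching(p, q) for p, q in pairs) / len(pairs)
    return [(m - between) / (1.0 - between) for m in within]


# Hypothetical frequencies: one diverse population and two similar, less diverse ones.
print(beta_estimates([[0.5, 0.5], [0.9, 0.1], [0.9, 0.1]]))
```

In this toy input the first population has less within-population matching than the average between-population matching, so its estimate is negative, which is the behavior the abstract says such relative measures can show.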

Mutations are crucial to evolution, providing the ultimate source of variation on which natural selection acts. Due to their key role, the distribution of mutational effects on quantitative traits is a key component to any inference regarding historical selection on phenotypic traits. In this paper, we expand on a previously developed test for selection that could be conducted assuming a Gaussian mutation effect distribution by developing approaches to also incorporate any of a family of heavy-tailed Laplace distributions of mutational effects. We apply the test to detect directional natural selection on five traits along the divergence of Columbia and Landsberg lineages of Arabidopsis thaliana, constituting the first test for natural selection in any organism using quantitative trait locus and mutation accumulation data to quantify the intensity of directional selection on a phenotypic trait. We demonstrate that the results of the test for selection can depend on the mutation effect distribution specified. Using the distributions exhibiting the best fit to mutation accumulation data, we infer that natural directional selection caused divergence in the rosette diameter and trichome density traits of the Columbia and Landsberg lineages.
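To see why the choice of mutation effect distribution can matter, it helps to compare how much probability a Gaussian and a Laplace distribution with the same variance place on large effects. The sketch below uses the standard (symmetric, zero-mean) Laplace distribution as an assumption; the paper's family of Laplace distributions may be more general, and the function names are hypothetical.

```python
import math


def gaussian_tail(k):
    """P(|X| > k * sigma) for a zero-mean Gaussian with standard deviation sigma."""
    return math.erfc(k / math.sqrt(2.0))


def laplace_tail(k):
    """P(|X| > k * sigma) for a zero-mean Laplace with the same variance.

    A Laplace with scale b has variance 2 * b**2, so b = sigma / sqrt(2)
    and P(|X| > t) = exp(-t / b) = exp(-k * sqrt(2)) for t = k * sigma.
    """
    return math.exp(-k * math.sqrt(2.0))


for k in (1, 3, 5):
    print(k, gaussian_tail(k), laplace_tail(k))
```

Near the center the Gaussian actually has more mass beyond one standard deviation, but at three or more standard deviations the Laplace tail is several times heavier, so rare large-effect mutations are far more plausible under the Laplace assumption.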

Males and females exhibit highly dimorphic phenotypes, particularly in their gonads, which is believed to be driven largely by differential gene expression. Typically, the protein sequences of genes upregulated in males, or male-biased genes, evolve rapidly as compared to female-biased and unbiased genes. To date, the specific study of gonad-biased genes remains uncommon in metazoans. Here, we identified and studied a total of 2927, 2013, and 4449 coding sequences (CDS) with ovary-biased, testis-biased, and unbiased expression, respectively, in the yellow fever mosquito Aedes aegypti. The results showed that ovary-biased and unbiased CDS had higher nonsynonymous to synonymous substitution rates (dN/dS) and lower optimal codon usage (those codons that promote efficient translation) than testis-biased genes. Further, we observed higher dN/dS in ovary-biased genes than in testis-biased genes, even for genes coexpressed in nonsexual (embryo) tissues. Ovary-specific genes evolved exceptionally fast, as compared to testis- or embryo-specific genes, and exhibited higher frequency of positive selection. Genes with ovary expression were preferentially involved in olfactory binding and reception. We hypothesize that at least two potential mechanisms could explain rapid evolution of ovary-biased genes in this mosquito: (1) the evolutionary rate of ovary-biased genes may be accelerated by sexual selection (including female–female competition or male–mate choice) affecting olfactory genes during female swarming by males, and/or by adaptive evolution of olfactory signaling within the female reproductive system (e.g., sperm-ovary signaling); and/or (2) testis-biased genes may exhibit decelerated evolutionary rates due to the formation of mating plugs in the female after copulation, which limits male–male sperm competition.

Cis- and trans-regulatory mutations are important contributors to transcriptome evolution. Quantifying their relative contributions to intraspecific variation in gene expression is essential for understanding the population genetic processes that underlie evolutionary changes in gene expression. Here, we have examined this issue by quantifying genome-wide, allele-specific expression (ASE) variation using a crossing scheme that produces F1 hybrids between 18 different Drosophila melanogaster strains sampled from the Drosophila Genetic Reference Panel and a reference strain from another population. Head and body samples from F1 adult females were subjected to RNA sequencing and the subsequent ASE quantification. Cis- and trans-regulatory effects on expression variation were estimated from these data. A higher proportion of genes showed significant cis-regulatory variation (~28%) than those that showed significant trans-regulatory variation (~9%). The sizes of cis-regulatory effects on expression variation were 1.98 and 1.88 times larger than trans-regulatory effects in heads and bodies, respectively. A generalized linear model analysis revealed that both cis- and trans-regulated expression variation was strongly associated with nonsynonymous nucleotide diversity and tissue specificity. Interestingly, trans-regulated variation showed a negative correlation with local recombination rate. Also, our analysis on proximal transposable element (TE) insertions suggested that they affect transcription levels of ovary-expressed genes more pronouncedly than genes not expressed in the ovary, possibly due to defense mechanisms against TE mobility in the germline. Collectively, our detailed quantification of ASE variations from a natural population has revealed a number of new relationships between genomic factors and the effects of cis- and trans-regulatory factors on expression variation.

Mobile genetic elements can be found in almost all genomes. Possibly the most common nonautonomous mobile genetic elements in bacteria are repetitive extragenic palindromic doublets forming hairpins (REPINs) that can occur hundreds of times within a genome. The sum of all REPINs in a genome can be viewed as an evolving population because REPINs replicate and mutate. In contrast to most other biological populations, we know the exact composition of the REPIN population and the sequence of each member of the population. Here, we model the evolution of REPINs as quasispecies. We fit our quasispecies model to 10 different REPIN populations from 10 different bacterial strains and estimate effective duplication rates. Our estimated duplication rates range from ~5 × 10⁻⁹ to 15 × 10⁻⁹ duplications per bacterial generation per REPIN. The small range and the low level of the REPIN duplication rates suggest a universal trade-off between the survival of the REPIN population and the reduction of the mutational load for the host genome. The REPIN populations we investigated also possess features typical of other natural populations. One population shows hallmarks of a population that is going extinct, another population seems to be growing in size, and we also see an example of competition between two REPIN populations.

It is common to find that major-effect genes are an important cause of variation in susceptibility to infection. Here we have characterized natural variation in a gene called pastrel that explains over half of the genetic variance in susceptibility to the Drosophila C virus (DCV) in populations of Drosophila melanogaster. We found extensive allelic heterogeneity, with a sample of seven alleles of pastrel from around the world conferring four phenotypically distinct levels of resistance. By modifying candidate SNPs in transgenic flies, we show that the largest effect is caused by an amino acid polymorphism that arose when an ancestral threonine was mutated to alanine, greatly increasing resistance to DCV. Overexpression of the ancestral, susceptible allele provides strong protection against DCV, indicating that this mutation acted to improve an existing restriction factor. The pastrel locus also contains complex structural variation and cis-regulatory polymorphisms altering gene expression. We find that higher expression of pastrel is associated with increased survival after DCV infection. To understand why this variation is maintained in populations, we investigated genetic variation surrounding the amino acid variant that is causing flies to be resistant. We found no evidence of natural selection causing either recent changes in allele frequency or geographical variation in frequency, suggesting that this is an old polymorphism that has been maintained at a stable frequency. Overall, our data demonstrate how complex genetic variation at a single locus can control susceptibility to a virulent natural pathogen.

Organisms engage in extensive cross-species molecular dialog, yet the underlying molecular actors are known for only a few interactions. Many techniques have been designed to uncover genes involved in signaling between organisms. Typically, these focus on only one of the partners. We developed an expression quantitative trait locus (eQTL) mapping-based approach to identify cause-and-effect relationships between genes from two partners engaged in an interspecific interaction. We demonstrated the approach by assaying expression of 98 isogenic plants (Medicago truncatula), each inoculated with a genetically distinct line of the diploid parasitic nematode Meloidogyne hapla. With this design, systematic differences in gene expression across host plants could be mapped to genetic polymorphisms of their infecting parasites. The effects of parasite genotypes on plant gene expression were often substantial, with up to 90-fold (P = 3.2 × 10⁻⁵²) changes in expression levels caused by individual parasite loci. Mapped loci included a number of pleiotropic sites, including one 87-kb parasite locus that modulated expression of >60 host genes. The 213 host genes identified were substantially enriched for transcription factors. We distilled higher-order connections between polymorphisms and genes from both species via network inference. To replicate our results and test whether effects were conserved across a broader host range, we performed a confirmatory experiment using M. hapla-infected tomato. This revealed that homologous genes were similarly affected. Finally, to validate the broader utility of cross-species eQTL mapping, we applied the strategy to data from a Salmonella infection study, successfully identifying polymorphisms in the human genome affecting bacterial expression.
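The core of the cross-species design, testing whether a host gene's expression differs according to the genotype of the infecting parasite at one marker, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: genotypes are coded 0/1 per parasite line, expression is on a log2 scale, the test statistic is a simple Welch t, and the data and the function name `marker_effect` are hypothetical.

```python
import math
from statistics import mean, variance


def marker_effect(genotypes, log2_expr):
    """Effect of one parasite marker on one host gene (illustrative only).

    genotypes: 0/1 allele code of the parasite line infecting each plant.
    log2_expr: log2 expression of the host gene in the same plants.
    Returns (difference in mean log2 expression, Welch t statistic).
    Needs at least two plants per genotype class and nonzero variance.
    """
    g0 = [e for g, e in zip(genotypes, log2_expr) if g == 0]
    g1 = [e for g, e in zip(genotypes, log2_expr) if g == 1]
    diff = mean(g1) - mean(g0)
    se = math.sqrt(variance(g0) / len(g0) + variance(g1) / len(g1))
    return diff, diff / se


# Hypothetical data: six plants, three infected by each parasite genotype.
genotypes = [0, 0, 0, 1, 1, 1]
log2_expr = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
diff, t = marker_effect(genotypes, log2_expr)
print(f"log2 difference = {diff:.2f}, fold change = {2 ** diff:.1f}, t = {t:.1f}")
```

A full analysis would repeat this test for every marker and gene pair and correct for multiple testing; the point here is only that the genotype comes from one species and the expression phenotype from the other.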

The genetic basis of stochastic variation within a defined environment, and the consequences of such micro-environmental variance for fitness are poorly understood. Using a multigenerational breeding design in Drosophila serrata, we demonstrated that the micro-environmental variance in a set of morphological wing traits in a randomly mating population had significant additive genetic variance in most single wing traits. Although heritability was generally low (<1%), coefficients of additive genetic variance were of a magnitude typical of other morphological traits, indicating that the micro-environmental variance is an evolvable trait. Multivariate analyses demonstrated that the micro-environmental variance in wings was genetically correlated among single traits, indicating that common mechanisms of environmental buffering exist for this functionally related set of traits. In addition, through the dominance genetic covariance between the major axes of micro-environmental variance and fitness, we demonstrated that micro-environmental variance shares a genetic basis with fitness, and that the pattern of selection is suggestive of variance-reducing selection acting on micro-environmental variance.

An ongoing challenge in biology is to predict the phenotypes of individuals from their genotypes. Genetic variants that cause disease often change an individual’s total metabolite profile, or metabolome. In light of our extensive knowledge of metabolic pathways, genetic variants that alter the metabolome may help predict novel phenotypes. To link genetic variants to changes in the metabolome, we studied natural variation in the yeast Saccharomyces cerevisiae. We used an untargeted mass spectrometry method to identify dozens of metabolite Quantitative Trait Loci (mQTL), genomic regions containing genetic variation that control differences in metabolite levels between individuals. We mapped differences in urea cycle metabolites to genetic variation in specific genes known to regulate amino acid biosynthesis. Our functional assays reveal that genetic variation in two genes, AUA1 and ARG81, causes the differences in the abundance of several urea cycle metabolites. Based on knowledge of the urea cycle, we predicted and then validated a new phenotype: sensitivity to a particular class of amino acid isomers. Our results are a proof-of-concept that untargeted mass spectrometry can reveal links between natural genetic variants and metabolome diversity. The interpretability of our results demonstrates the promise of using genetic variants underlying natural differences in the metabolome to predict novel phenotypes from genotype.

How essential, regulatory genes originate and evolve is intriguing because mutations of these genes not only lead to lethality in organisms, but also have pleiotropic effects since they control the expression of multiple downstream genes. Therefore, the evolution of essential, regulatory genes is not only determined by genetic variations of their own sequences, but also by the biological function of downstream genes and molecular mechanisms of regulation. To understand the origin of essential, regulatory genes, experimental dissection of the complete regulatory cascade is needed. Here, we provide genetic evidence that PhoP-PhoQ is an essential two-component signal transduction system in the gram-negative bacterium Xanthomonas campestris, but that its orthologs in other bacteria belonging to Proteobacteria are nonessential. Mutational, biochemical, and chromatin immunoprecipitation together with high-throughput sequencing analyses revealed that phoP and phoQ of X. campestris and its close relative Pseudomonas aeruginosa are replaceable, and that the consensus binding motifs of the transcription factor PhoP are also highly conserved. PhoPXcc in X. campestris regulates the transcription of a number of essential, structural genes by directly binding to cis-regulatory elements (CREs); however, these CREs are lacking in the orthologous essential, structural genes in P. aeruginosa, and thus the regulatory relationships between PhoPPae and these downstream essential genes are disassociated. Our findings suggested that the recruitment of regulatory proteins by critical structural genes via transcription factor-CRE rewiring is a driving force in the origin and functional divergence of essential, regulatory genes.

### Genetic Benefits

The techniques developed for genetic mapping have had great impact on the life sciences, and particularly in medicine. But genetic mapping technologies also have useful applications in other fields...