**Evolutionary insights into disassortative mating and its association to an ecologically relevant supergene**

**Charles Mullon**based on reviews by Tom Van Dooren and 2 anonymous reviewers### Evolution and genetic architecture of disassortative mating at a locus under heterozygote advantage

**Abstract**

*Submitted: 29 October 2019*,

*Recommended: 26 September 2020*

#### Recommendation

*Heliconius* butterflies are famous for their colorful wing patterns acting as a warning of their chemical defenses [1]. Most species are involved in Müllerian mimicry assemblies, as predators learn to associate common wing patterns with unpalatability and preferentially target rare variants. Such positive-frequency dependent selection homogenizes wing patterns at different localities, and in several species, all individuals within a community belong to the same morph [2]. In this respect, *H. numata* stands out. This species shows stable local polymorphism across multiple localities, with local populations home to up to seven distinct morphs [2]. Although a balance between migration and local positive-frequency dependent selection can allow some degree of local polymorphism, theory suggests that this occurs only when migration is within a narrow window [3].

One factor that potentially enhances local polymorphism in *H. numata* is disassortative mating. Mate choice assays have in fact revealed that females of this species tend to reject males with the same wing pattern [4]. However the evolution of such mating behavior and its effect on polymorphism remain unclear when selection is locally positive-frequency dependent. Using a mathematical model, Maisonneuve *et al.* [5] clarify the conditions that favor the evolution of disassortative mating in the complicated system of *H. numata*. In particular, they investigate whether the genetic basis of wing colour can favor the emergence of disassortative mating. Variation in wing pattern in *H. numata* is controlled by the supergene P, which is a single genomic region harboring multiple protein coding genes that have ceased to recombine due to chromosomal inversions [6]. If such remarkable genetic configuration allows for the co-adaptation of multiple loci participating to a complex phenotype such as wing color pattern, the absence of recombination can also result in the accumulation of deleterious mutations [7]. In fact, alleles at the P locus have been associated with a recessive genetic load, leading to a fitness advantage for heterozygotes at this locus [8]. Can this fitness advantage to heterozygotes lead to the evolution of disassortative mating? And if so, can such evolution lead to the maintenance of local polymorphism in spite of strong positive frequency-dependent selection?

To investigate these questions, Maisonneuve *et al.* [5] model evolution at two loci, one is the P locus for wing pattern, and the other influences mating behavior. The population is divided among two connected patches that differ in their butterfly communities, so that different alleles at the P locus are favored by positive frequency-dependent selection in different patches. The different alleles at the P locus are ordered in dominance relationships such that the most dominant over wing color pattern are also those with the highest load. By tracking the dynamics of haplotype frequencies in the population, the authors first show that disassortative mating readily evolves via the invasion of an allele causing females carrying it to reject males that resemble them phenotypically. Such “self-referencing” mechanism of mate choice, however, has never been reported and has been argued to be rare due to its complicated nature [9].

Maisonneuve *et al.* [5] then compare the evolution of disassortative mating via two alternative mechanisms: attraction and rejection. In these cases, alleles at the mating locus determine attraction to or rejection of specific phenotypes (e.g., under attraction rule, allele “B” encodes attraction to males with phenotype B). With the P and mating loci fully linked, disassortative mating can evolve under all three mechanisms (self-referencing, attraction and rejection), but tends to be less prevalent at equilibrium under attraction rule. This in turn results in the maintenance of less genetic variation under attraction compared to the other mating mechanisms. The loss of variation that occurs under attraction rules is due to a combination of dominance relationships between alleles at the P locus and the searching cost to females in finding rare types of males. When a particular wing pattern, say B, is only expressed in homozygotic form, B males are relatively rare. Females that carry the allele at the mating locus causing them to be attracted to such males then suffer a fitness cost due to lost mating opportunities. This mating allele is therefore purged, and in turn so is the recessive allele for B phenotype at the P locus. Under self-referencing and rejection rules, however, choosy females only reject males of a specific phenotype. They can therefore potentially mate with larger pool of males than females attracted to a single type. As a result, self-referencing and rejection rules are less sensitive to demographic effects and so are more conducive to disassortative mating evolution.

In their final analysis, Maisonneuve *et al.* [5] investigate the influence of recombination among the P and mating loci. They show that recombination has different effects on disassortative mating evolution depending on the mechanism of mate choice. Under the self-referencing rule, loose linkage leads to higher levels of disassortative mating and polymorphism than when linkage is tight. Under attraction or rejection rule, however, even very limited recombination completely inhibits the evolution of disassortative mating. This is because, with alleles at the mating locus coding for attraction/rejection to specific males, recombination breaks the association between the P and mating loci necessary for disassortative mating. By contrast, disassortative mating via a self-referencing rule does not depend on the linkage among the P and mating loci: females choose males that are different to themselves independently from the alleles they carry at the P locus.

Taken together, Maisonneuve *et al.*’s analyses [5] show that disassortative mating can readily evolve in a system like *H. numata*, but that this evolution depends on the genetic architecture of mating behavior. The architectures that are more conducive to the evolution of disassortative mating are: (1) epistatic interactions among the P and mating loci such that females are able to recognize their own phenotype and base their mating decision upon this information (self-referencing rule); and (2) full linkage among the P supergene and a mating locus that triggers rejection of a specific color pattern. While the mechanisms behind disassortative mating remain to be elucidated, assortative mating seems to rely on alleles triggering attraction to specific cues with variation in attraction and cues linked together [10]. These observations support the notion that disassortative mating is due to alleles causing rejection, in tight linkage to the P locus. If so, mating loci would in fact be part of the P supergene, thus controlling not only intricate wing color pattern but also mating behavior.

Beyond the specific system of *H. numata*, Maisonneuve *et al.*’s study [5] helps understand the evolution of disassortative mating and its association with the genetic architecture of correlated traits. In particular, Maisonneuve *et al.* [5] expands the role of supergenes for ecologically relevant traits to mating behavior, further bolstering the relevance of these remarkable genetic elements in the maintenance of variation in complex and elaborate phenotypes.

**References**

[1] Merrill, R M, K K Dasmahapatra, J W Davey, D D Dell'Aglio, J J Hanly, B Huber, C D Jiggins, et al. (2015). The Diversification of Heliconius butterflies: What Have We Learned in 150 Years? Journal of Evolutionary Biology 28 (8), 1417–38. https://doi.org/10.1111/jeb.12672.

[2] Joron M, IR Wynne, G Lamas, and J Mallet (1999). Variable selection and the coexistence of multiple mimetic forms of the butterfly Heliconius numata. Evolutionary Ecology 13, 721– 754. https://doi.org/10.1023/A:1010875213123

[3] Joron M and Y Iwasa (2005). The evolution of a Müllerian mimic in a spatially distributed community. Journal of Theoretical Biology 237, 87–103. https://doi.org/10.1016/j.jtbi.2005.04.005

[4] Chouteau M, V Llaurens, F Piron-Prunier, and M Joron (2017). Polymorphism at a mimicry su- pergene maintained by opposing frequency-dependent selection pressures. Proceedings of the National Academy of Sciences 114, 8325–8329. https://doi.org/10.1073/pnas.1702482114

[5] Maisonneuve, L, Chouteau, M, Joron, M and Llaurens, V. (2020). Evolution and genetic architecture of disassortative mating at a locus under heterozygote advantage. bioRxiv, 616409, ver. 9 peer-reviewed and recommended by PCI Evolutionary Biology. https://doi.org/10.1101/616409

[6] Joron M, L Frezal, RT Jones, NL Chamberlain, SF Lee, CR Haag, A Whibley, M Becuwe, SW Baxter, L Ferguson, et al. (2011). Chromosomal rearrangements maintain a polymorphic super- gene controlling butterfly mimicry. Nature 477, 203. https://doi.org/10.1038/nature10341

[7] Schwander T, R Libbrecht, and L Keller (2014). Supergenes and Complex Phenotypes.” Current Biology. 24 (7), 288–94. https://doi.org/10.1016/j.cub.2014.01.056.

[8] Jay P, M Chouteau, A Whibley, H Bastide, V Llaurens, H Parrinello, and M Joron (2019). Mutation accumulation in chromosomal inversions maintains wing pattern polymorphism in a butterfly. bioRxiv. https://doi.org/ 10.1101/736504.

[9] Kopp M, MR Servedio, TC Mendelson, RJ Safran, RL Rodrıguez, ME Hauber, EC Scordato, LB Symes, CN Balakrishnan, DM Zonana, et al. (2018). Mechanisms of assortative mating in speciation with gene flow: connecting theory and empirical research. The American Naturalist 191, 1–20. https://doi.org/10.1086/694889

[10] Merrill RM, P Rastas, SH Martin, MC Melo, S Barker, J Davey, WO McMillan, and CD Jiggins (2019). Genetic dissection of assortative mating behavior. PLoS biology 17, e2005902. https://doi.org/10.1371/journal.pbio.2005902

**Cite this recommendation as:**

Charles Mullon (2020) Evolutionary insights into disassortative mating and its association to an ecologically relevant supergene.

*Peer Community in Evolutionary Biology, 100109.*

**10.24072/pci.evolbiol.100109**

*Evaluation round ***#4**

**#4**

*22 Jul 2020*DOI or URL of the preprint: **doi: https://doi.org/10.1101/616409**

Version of the preprint: October 29, 2019.

#### Author's Reply

#### Decision by **Charles Mullon**

Dear authors,

Thank you for your revision and clarifications. However, there are still spelling mistakes and grammatical problems with your most recent changes to the manuscript (listed below). This is somewhat frustrating as these could have easily been avoided. I therefore have to ask for yet another revision. This should really be the last so please ensure that your next manuscript is of the best possible quality.

All the best,

Charles.

l. 244: remove the repeated symbol F*{i,n} at the end of the sentence
l. 244: “this frequencies” - > “these frequencies”
l. 285: the symbol “element of” is still missing in the denominator
l. 301: the subscript n is missing in f*{i,n}

Eq. 14: the symbol “element of” is still missing in the denominator

l. 405: a verb is missing in the phrase: “therefore selection acting against disassortative preferences be reduced”

l. 448: delete the word “that” in the phrase: “and that the genetic load associated”

l. 560: “whereby manipulating of the phenotype”, either “manipulation of” or “manipulating the phenotype”

l. 560: “the chooser change its preference” - > “the chooser changeS its preference”

**Additional requirements of the managing board**:

As indicated in the 'How does it work?’ section and in the code of conduct, please make sure that:

-Data are available to readers, either in the text or through an open data repository such as Zenodo (free), Dryad or some other institutional repository. Data must be reusable, thus metadata or accompanying text must carefully describe the data.

-Details on quantitative analyses (e.g., data treatment and statistical scripts in R, bioinformatic pipeline scripts, etc.) and details concerning simulations (scripts, codes) are available to readers in the text, as appendices, or through an open data repository, such as Zenodo, Dryad or some other institutional repository. The scripts or codes must be carefully described so that they can be reused.

-Details on experimental procedures are available to readers in the text or as appendices.

-Authors have no financial conflict of interest relating to the article. The article must contain a "Conflict of interest disclosure" paragraph before the reference section containing this sentence: "The authors of this preprint declare that they have no financial conflict of interest with the content of this article." If appropriate, this disclosure may be completed by a sentence indicating that some of the authors are PCI recommenders: “XXX is one of the PCI XXX recommenders.”

*Evaluation round ***#3**

**#3**

*27 Jun 2020*DOI or URL of the preprint: **doi: https://doi.org/10.1101/616409**

Version of the preprint: October 29, 2019.

#### Author's Reply

#### Decision by **Charles Mullon**

Dear Ludovic Maisonneuve and coauthors,

Thank you for submitting your revised version. The manuscript has greatly improved, and I would be happy to recommend it pending some corrections and clarifications that I detail below.

Looking forward to reading your final version,

Charles.

My comments are in the order in which they appear in the manuscript:

l. 17 “one allele at trait locus” -> one allele at the trait locus

l. 25 “framwork” -> framework

l. 42 “disassortative mating based on odors is also known ”: the word also is confusing as this is the first time you refer to disassortative mating based on odor cues.

l. 129 separate the variables n and N*{i,n} by a comma
l. 175 “speices” -> species
l. 178 something weird with “i.e.”. Also, italise it as you do so throughout the rest of the manuscript.
l. 207 (third line of first paragraph of section) “Predator” has a capital letter in the middle of a sentence.
Equation (4): Shouldn’t the expression on the RHS of this equation be multiplied by N*{i,n} ?

Equation (7): It would be clearer if you defined “F

*{i,n}” immediately after the equation, as in “equation, where F*{i,n} is …”

l. 250 “see Fig S1” Don’t you mean S5?

l. 280 inline equation for P

*{J,n}: the “element of” symbol is missing under the sums.*

l. 285 “when this cost is low”: It would be clearer to say absent as you mention this refers to a situation where cr = 0.

l. 285 “when this cost is low”: It would be clearer to say absent as you mention this refers to a situation where c

l. 292 I am still confused by the definition of M

*{i,n} as “fertility” and its ratio to mean M, M*{i,n}/M, as the “relative probability that a female has mated”. In your response to reviewers, you specified M

*{i,n}/M is a conditional mating probability. However, isn’t M*{i,n}/M rather an unconditional mating probability, which is 1 in the absence of cost to being choosy (cr = 0), and which decreases according to mate availability and competition among females for access to male mates (when cr > 0)? Please clarify this and I would advise in any case to finding a term different to “fertility” for M

*{i,n} as it only relates to mating. Equation (13): In connection to my previous point, I think that while M*{j,n}/M is simply the probability of mating (rather than “relative”), the last term is a conditional probability, i.e., it is “the probability that, given that female j has mated, its mate is a male with genotype k”.

l. 308 “the sum of all F

*{i,n}” -> the sum of F*{i,n} over all i

l. 316 are parameters chosen to maintain polymorphism in the absence of disassortative mating? If so, please specify this.

Equation (14): the “element of” symbol is missing under the sums.

l. 351 There is an extra close brackets symbol at the end of the sentence before the equation.

l. 401, last sentence of paragraph. Clarify that this is an expectation from the above results, rather than a result established in this paragraph.

l. 445 The statement “disassortative mating is favoured when the genetic load is associated with the dominant alleles only” is misleading as Figure 3 shows that the frequency of the allele for disassortative mating is greatest when the genetic load associated with the recessive allele is intermediate (delta

*c ~ 0.35).*

l. 463 “the non-mimetic allele c, not associated with any genetic load, is preferentially linked with the random mating allele r, probably because the cost of choosiness limits the association with the preference allele in absence of heterozygote advantage.” But isn’t there some heterosis for allele c as well given that allele c when present in homozygotic form suffers increased predation in both patches?

l. 468 “allele b being lost” Please explain why allele b is lost.

l. 494 “one allele a and another allele” since it is always allele c in this case, why not just say one allele a and one allele c ?

l. 495 Similarly, isn’t “another preference allele” always ma?

l. 463 “the non-mimetic allele c, not associated with any genetic load, is preferentially linked with the random mating allele r, probably because the cost of choosiness limits the association with the preference allele in absence of heterozygote advantage.” But isn’t there some heterosis for allele c as well given that allele c when present in homozygotic form suffers increased predation in both patches?

l. 468 “allele b being lost” Please explain why allele b is lost.

l. 494 “one allele a and another allele” since it is always allele c in this case, why not just say one allele a and one allele c ?

l. 495 Similarly, isn’t “another preference allele” always m

l. 502 “hyp 1.a” isn’t is “hyp 1”?

l. 508 “the strength of the genetic load does impact the speed of evolution of disassortative mating”. In what way? Please be more specific here. Also, Fig S10 only shows proportions after 100 generation and it is hard to see anything significantly different. I would have expected a graph of the proportion of self-avoidance against time, showing that it stabilizes at different rates under different scenarios. Would it be possible to generate such a graph?

l. 516 “we observed […] that disassortative mating is favored by self-referencing (hyp. 1) or by rejection (hyp. 2.b) rules” I would specify that disassortative mating becomes more prevalent under hyp. 1 and hyp 2b than hyp 2a (as self-avoidance behaviour still evolves under hyp. 2.a, albeit reaching lower proportions at equilibrium).

l. 539 I found the statement “Our results suggests that single-locus architecture or tight linkage between the cue and the preferences are likely to promote the evolution of disassortative mating” confusing as this is only true under hyp. 2. Please also clarify the next sentence about why you think that even though you modelled phenotype and self-referencing as a two loci, it can be viewed as a single locus architecture.

l. 553 “However, these haplotypes do not necessarily imply a complete self-avoidance behavior in females carrying them.” Is this due to your assumption of co-dominance? If so, please specify it.

l. 568. “Contrastingly, our model suggests that the genetic architecture of disassortative mating might differ from those documented in species showing assortative mating behavior.” I found this sentence a bit vague and difficult to tie with the rest of the paragraph. Is this because your results suggest that under some circumstances, recombination between trait and mating loci can lead to more self-avoidance behaviours in the population, whereas assortative mating is most often associated with linkage between trait and mating loci? Please be more specific.

l. 581. I have found this paragraph discussing the effect of dominance difficult to follow and wasn’t sure what I am expected to retain about it. Perhaps the structure of this pargapragh (where you go back and forth between the hypotheses of recognition/trait rules and self-referencing, rather than treating each hypothesis sequentially) makes it difficult for the reader to keep up with your argument.

l. 599 “As mentioned below” -> above maybe?

l. 613 “Similarly, the model of Karlin and Feldman (1968) suggests that disassortative mating slows down the purge of deleterious alleles.” Could you be more specific about the results of this study? Also, is it only “suggested”?

l. 639 I found the distinction between “balancing selection promoting local polymorphism and heterozygote advantage” a bit confusing because isn’t it the case that in your model, local polymorphism (i.e., within each subpopulation) is driven by heterozygote advantage due to recessive deleterious mutations?

*Evaluation round ***#2**

**#2**

*14 Apr 2020*DOI or URL of the preprint: **doi: https://doi.org/10.1101/616409**

Version of the preprint: October 29, 2019.

#### Author's Reply

#### Decision by **Charles Mullon**

Dear authors,

Thank you for resubmitting your preprint entitled “Evolution and genetic architecture of disassortative mating at a locus under heterozygote advantage” to PCI Evol Biol for recommendation. Your revised preprint was sent to the three reviewers who had commented the first version of your manuscript. As you will see from their comments, two are on the whole satisfied with your changes (although both point out that the manuscript still reads rough in many places). Another reviewer is more critical and in particular has highlighted many sections in your methods that still lack clarity.

After reading your manuscript, I agree with the assessment that your model remains opaque and difficult to understand, in spite of this being the main criticism made about your first version. Your model and methods are still clouded by a lack of explanations and rigorous definitions, combined with unconventional mathematical notations. Reviewer 3’s thorough analysis points out many of these problems. I have also added several comments myself below. All three reviewers have also noticed many typos and other writing issues.

The policy of PCI Evol Biol is very clear: recommended preprints must be of excellent quality, that is to say, to the standard of articles published in the best journals in our field. From the above considerations, I am sorry that I am unable to recommend your preprint as is. If you decide to resubmit to PCI Evol Biol, please provide a detailed point-by-point response to all comments and revise your manuscript accordingly. You may also find it helpful to have your article professionally proofread.

Sincerely yours,

Charles Mullon.

Comments

- The general structure of your model is still not clear. From eq. 1, one would assume that your model consists of just 8 differential equations (4 genotypes * 2 patches). Appendix S1 however suggests that you in fact track haplotypes rather than genotypes. So please state clearly at the beginning the structure of your model, clearly defining all variables involved. In short, readers need to know all the dynamic variables that you use in your numerical simulations in order to be able to replicate them.
- L.117, the ODE approach your model takes suggests that you track the “density” rather than the “number of individuals”. This is for example what is mentioned in the model by Joron and Iwasa, 2007, JTB, on which the current analysis is based. In general, I have found this previous model much more rigorously and clearly explained than your current one. You may therefore find it useful to use the same terms and structure of explanation as Joron and Iwasa, 2007, JTB.
- Like reviewer 3 I am baffled by the use of “d” on the right hand side of eq. 1. In addition, this is defined later as a parameter for predation (eq. 3). In fact, quite often you use the same or very similar notation to refer to different things (e.g., P_i is the contribution of predation to the rate of change in density in eq. 1 but also as “as the proportion of the phenotype” in eq. 7; the use of a dash ‘ next to a variable is not consistent, with pop’ meaning population other than pop, and f’ frequency in offspring rather than f in parents). Simultaneously, you also use different notations to refer to similar things (e.g., matrix notation is sometimes with subscript with or without commas, and further with or without square brackets). This is not an exhaustive list, so please go over your model carefully and make sure that notations are consistent throughout. I also strongly recommend providing a table with all variables and parameters of the model.
- Eq. 3 is overly complicated. First the denominator should simply be lambda*N
*i,pop. Secondly, I don’t understand why in the numerator you introduce the parameter sigma when it is later fixed in all your analyses as 0.5. Why not simply have D*i,pop like in Joron and Iwasa, 2007, JTB? - Eq. 4, perhaps write out the intermediate step ( = - mig
*N_i,pop + mig*N_i,pop’ ). - The main difficulty in understanding the model concerns the sections “mate preference” and “Reproduction” which require extensive revisions. Reviewer 3 points out many of the problems with these two sections (including equations 5-6 that appear out of nowhere without defining variables, eq. 8 that comes before the sentence explain eq. 7 is finished). My suggestion to aid readers is to start with eq. 10 (which is kind of clear but see point 7 below), then explain how the frequency of offspring of different genotypes is calculated, to finally introduce the notions of preferences, mating probabilities and recombination (also please choose carefully what you mean by “fertility”).
- The way I read eq. 10, is that due to density dependent competition, each offspring has a probability of survival given by r and K. In this case, I expect the term r
*(1-N_tot,pop/K) to be multiplied to “the abundance of offspring with genotype i”. However, in eq. 10 it is multiplied by N_tot,pop*f’*i,pop , where N*tot,pop is the total abundance of adults, while f’_i,pop is the frequency of genotype i in offspring. What is the meaning of the multiplication of these two terms in the context of eq. 10? - More generally, and as pointed out by reviewer three, make sure that the distinction between frequency and density (or abundance) is clear and that quantities defined as frequencies are in fact frequencies (e.g., does the sum of eq. 9 over I really to sum to one?)
- Eq. 9. What is “relative” probability?
- Eq. 9. Shouldn’t the denominator for the probability that male k is the father be F_k,pop?
- Methods section. Please clarify your methods of numerical analysis as advised by reviewer 3. E.g. Where can one find the whole dynamical system you are solving (or at least find all the relevant building blocks)? How many time steps? What initial conditions?
- I did not review your results in detail as I don’t understand the model. I suggest that you aim for the same level of clarity in your results and the one I am advising for your model. So please, revise your results as extensively as required by reviewers’ comments.

#### Reviewed by **Tom Van Dooren**, 19 Mar 2020

This is my review of the revision "" by Maisonneuve et al. My previous remarks have been dealt with adequately. I am pleased by the changes made by the authors in this revision. However, I have many remaining small issues with the writing. Please also use English or American spelling consistently. I didn't make that choice in my suggested corrections.

Specific comments

L17: behaviors -> behavior

L20: Please remove "that are starting to emerge"

L23: traits -> trait

L25: trait -> traits

L36: fungy -> fungi

L36: change sentences to: "Disassortative mating can be based on different traits. In Amphidromus inversus snails,…" Write species names such as Amphidromus inversus italic.

L41: "controlling for" refers to correcting for nuisance variables in statistics.

"controlling for variations in" -> "affecting the"

Also check the legend with figure one, three, five. L411.

L44: promotes -> promote

L46: heterozygotes -> heterozygote

L49: differ -> differs

L56: for which -> where; controls for -> determines

L58: contains -> contain

L70: when individual used -> when an individual uses

L76: toward -> towards

L80 locus -> loci

L84: population -> populations

L97: promotes -> promote

L98: such -> a

L110: based on recognition -> based on a recognition

L111: impact -> impacts

One line below L114: on earlier -> on an earlier

L116: controlling variations in wing colour pattern -> controlling wing colour patterns

L117: the number -> the number of individuals

L118: populations -> population

L120: described -> describe

L121: populations -> population

L133: pattern -> patterns

L139: "The environment … " -> "Local communities of species involved in mimicry (i.e. mimicry rings) vary."

L140: "we": start a new sentence here.

L145: remove ")"

L146: Strength -> strength

L151: higher -> larger

L158: was -> is

L172: each -> a

L174: favors -> favored, slection -> selection

L186: remove ")"

L197: choosen partner -> the chosen partner

L289: step -> steps, check -> checked

L290: value -> magnitude, provide -> provided, insuring -> ensuring

L302: space between rho and =

L303: drop "evolution of"

L308: add "a" before "higher"

L309: remove comma

L310: remove "only"

L319: polymorphism -> polymorphisms

L336: self-referencing mutant -> a self-referencing mutation

L353: architecture -> architectures

Legend figure three: represented -> represent, last use of "proportion" in the legend -> "proportions"

L368: induces -> induce a

L378: produced -> produce

L386: higher -> a larger

L392: to -> for

L411: individual -> individuals

L425: explained -> explain

L426: based -> with (L431)

L445: End sentence after "prediction"

L450: behaviors -> behavior

L480: based on rejection for trait -> with colour pattern recognition and repulsion.

L484: variations in expression controls for -> expression differences determine

L502: The sentence starting on this line is grammatically incorrect. It is unclear what you mean.

L513: influences -> influence

L550: at -> in the

L552: remove one "controlling"

L571: highlights -> highlight, under rejection toward -> when loci code for rejection of

L572: variations -> variation

L578: van -> Van

#### Reviewed by anonymous reviewer, 25 Mar 2020

I enjoyed this manuscript. It describes an interesting model with results that have interesting implications for understanding disassortative mating. The authors have greatly improved the manuscript and have addressed my previous comments. I have just a few additional comments:

1) As far as I understand, the only difference between the two "patches" in the model comes from differences in phenotype specific predation rates. d is the base line predation rate and sigma adjusts that rate. But in some of the figures (e.g. fig 1) the caption says that d = 0 for the simulation. Doesn't this make the patches identical? What is going on here? Perhaps I still don't understand how the patches differ.

2) I believe that Figure S7 presents results from the same simulations as Figure 2, but run for a longer time. Just looking at Fig 2, it looks like there is a threshold (shown be the purple line) that divides parameter space into a region where invasion occurs and a region where invasion does not occur. But I think this is a bit misleading because Fig S7 indicates that invasion happens everywhere if you wait long enough. So my question is, does the purple line on Fig 2 tell us anything useful in the end?

3) There are still a few typos scattered throughout the manuscript. I suggest another thorough read through.

#### Reviewed by anonymous reviewer, 08 Mar 2020

I reviewed this manuscript previously. After the revision, I can follow the set-up of the model much better, although this is still not an easy task. Overall, the manuscript still reads very rough in many places and somewhat unfinished, as I will detail below.

MODEL DESCRIPTION

Eq. (2) is unclear. What is the letter d in the four terms on the right-hand side supposed to mean. At a later point d is introduced as predation strength. But I do not think that this is what the authors mean here. Instead, I suspect that the authors want to refer here to a differential. But why? In this revised version, the model is formulated as an ODE and each term on the right-hand side describes the rate of change in abundance of a genotype in one of the populations due to a different mechanism. I cannot see why a differential d should appear here.

Formulating the model as an ODE requires some implicit assumptions, namely, that generations are overlapping and that each newborn individual instantaneously behaves as an adult individual that can migrate and reproduce. There is no time delay due to an ontogenetic development. I understand that the model is meant as an approximation of real dynamics but I would like to see these conceptual problems being pointed out.

In eq. (3) I do not understand the denominator. Given that Res*[i],[j]=1 for i=j and zero otherwise I understand that the sum only contains a single non-zero term, namely, Res*[I],[i]N_i,pop. So, why the sum? Is this not more complicated than necessary?

In my opinion eq. (4) is not well motivated. It says that individuals of genotype i migrate proportionally to the difference in abundance of this genotype in the two patches. This seems to suggest that individuals could asses the abundance of their genotype in both patches, an idea that I would consider unrealistic. A much simpler and, in my opinion, more realistic idea would be that a fixed proportion of individuals present in each patch migrate. More realistic might, however, be that migration is not random but takes into account some form of habitat matching. I realise that I made the same comment in my previous review. The response by the authors (“We assume a constant migration rate mig i.e. a proportion mig of the population migrates in the other patch. Equation (3) is a result of individuals entering and exiting the patch pop.”) does not address my point.

page 12

In eq.(5) and (6) f*i and Pref*i[i] appear. Neither of these have been introduced in the text so far. Thus, at this point I cannot understand these equations.

In ll. 245 Pref*i,[j] is defined as a preference of one genotype for another genotype. In eq. (7) Pref*i,A is used, where A is a phenotype due to alleles at the P-locus. This seems to be an abuse of notation and does not facilitate reading the manuscript.

The Authors also might want in some place in their manuscript explain clearly their notation. What is the rational for either putting subscripts into hard brackets or not? Also, sometimes two subscripts are separated by a comma and sometimes not.

l. 249 this sentence belongs directly after eq. (7) because that is where P*I,pop appears for the first time.
l. 249 refer to as -> denotes
What is meant by proportion here? Frequency or abundance? For abundance N*i,pop has been used in eq. (2), for frequency f

*i,pop is used in eq. (9). Furthermore, in eq.(2) P*i,pop denotes reproduction. Quite confusing.

More generally, I do not understand the rationale behind eq. (7). What exactly is meant by fertility? What is the logic behind the right-hand side of eq. (7)? I do not see a proper explanation. Is it the proportion of the male population a given female can mate with? What is assumed about female mating behaviour? Can they search for mating partners until they are mated or can they choose only one mating partner, which then maybe suitable or not, per time unit?

In line 239 it is said that f’

*i,pop is a frequency. However, it is not obvious that eq. (9) truly defines a frequency in the sense that the f’*i,pop’s sum up to one. I wonder whether for this to be true a normalisation by the total f’_i,pop would be necessary?

l. 261 Here, the authors refer to the appendix for an explanation of coef(i,j,k,rho). Without further explanation I find it very difficult to understand the formula for coef_haplotype that is given in the appendix. First, no explanation is given for the set-theoretic notation that is used here. Second, no explanation is given for the different terms on the right-hand side. As a reader of a manuscript I do not want to be a detective.

NUMERICAL METHODS

I find the numerical methods explained in insufficient detail.

First, how is it assured that the dynamics reach a fixed point equilibrium. Is it always so that simulations run for 10000 time steps do show no further change? In several of the results I wonder whether the numbers presented in the figures truly represent frequencies at a fixed point equilibrium or whether these frequencies are still transient.

Second, how is it assured that, if a fixed point equilibrium exists, it is unique? In other words, how is it assured that the dynamics does not depend on the initial conditions?

Third, in the legend of fig. 3 it is mentioned that the dynamics are first run for 10000 time units with random mating before the other mating alleles are introduced. Why this procedure? If the dynamics would have a unique equilibrium, then it should not matter whether one starts the simulation with all alleles present simultaneously at both loci.

Fourth, I wonder whether 10000 time units is a “long time” or a “short time”. To how many generations does this correspond? It is clear that this is not easy to answer this but methods to estimate generation time for overlapping generations do exist. To the extend that the results are transient an interpretation based purely based on the number of time units is difficult.

Fig. S5 shows the temporal dynamics of a the mutant allele dis. I am not sure about the meaning of the scale on the y-axis but what this graph suggests to me that 100 time units is a very short time span since the P_mut seems to grow exponentially without any signs of slowing down.
Fifth, this is very parameter rich model but only a relatively small range of the parameter space seems to have been explored. Specifically, the parameters d, sigma, lambda r, and K have not been varied. I understand that this are chosen such that these result in disruptive selection at the p-locus. Is there another justification for not varying them?

RESULTS

Fig. 1

For this figure d=0 and delta=0 for all deltas. In other words, in these simulations mortality is absent, individuals can reproduce and migrate but never die. This means that once the carrying capacity its reached in both patches no more population wide changes in genotype frequencies can occur since reproduction comes to a halt. Investigating such a case seems a very odd choice. I would expect that the final gene frequencies are highly sensitive to initial conditions and do not reflect a unique equilibrium fixed point. This analysis seems close to irrelevant to me.

p. 18-19 and Figs. S7 and S8

The legend for figure S7 seems to be wrong. Also, Fig. S7 does not seem to show what is mentioned in the main text, at least, I cannot read from Fig. S7 that widespread fixation of the allele dis occurs.

Figure S8 shows that for higher costs (cost=0.25) in most of the parameter space the frequency of the allele dis drops below the initial frequency 0.01. Thus, I would disagree with the authors who state “Simulations assuming different costs associated with choosiness (cost) show a similar effect of associated genetic loads, although increasing this cost slows down the invasion of the choosy disassortative mating mutant dis (see Sup. fig. S8).” (ll. 345-347)

In the legend of figures S7 and S8 the value of the parameter delta is not given. Is figure S8b supposed to be identical to figure 2? It is not clear to me whether the two alleles r and dis at the mating locus can coexist long term for some parameter combinations.

Why are in fig. 2 the frequencies computed for only 100 time units? As mentioned above, according to fig. S5, this seems rather short.
For fig. 2 I miss information about the frequencies of the different color pattern alleles. As I explained above, the results from fig. 1 are uninformative for this case.

p. 20-22

In figure 3b the results seem to be identical for all delta*i>0. This seems rather surprising to me. If it is really true, it needs an explanation. Similarly, in fig. 3a the results seem to be identical for delta*i>0.3. Given that in figure 3a the results for delta*i=0.1 - 0.3 are clearly different from those for delta*i=0.4 - 1 I find it very surprising that the results in figure 4a, which in my understanding are derived from those in figure 3a, hardly differ for different deltas_i>0. This does not seem to be discussed but in my opinion, if true, requires an explanation.

p. 23, ll. 408-414 and Fig. S10

These results are quite incomprehensible. Fig. S10 shows results that in my understanding are based on simulations shown in figure 2 (and S7, although I am unsure whether S7 is the correct figure). In ll. 342-343 it is mentioned that over longer evolutionary time the dis allele is fixed (something that is in fact not visible from figure S7). If this were to be the case, I wonder whether it is really possible that half of the population acts according to self-acceptance and half according to self-avoidance? Furthermore, fig. 2 (and fig. S7) show a clear gradient for increasing delta*i. In contrast, fig. S10 shows that results do not change with increasing delta*i for delta_i>0. Are these results compatible? I recommend to add a figure in the style of fig. 3 accompanying fig. S10.

p. 24-26 and figs. S10, S11 and S12

Several results in this section are difficult to understand.

First, in l. 434 the authors write that recombination further promotes the invasion of the allele dis. I wonder what that means given that the authors write in l. 343 that over long time span the allele dis reaches fixation (without recombination). Do they mean that the speed of invasion is increased? If this is the case, then I come back to my earlier question to what extend the presented results show transient dynamics and to what extend they show equilibria?

A comparison between the column corresponding to delta_i=0.5 in fig. S10 and fig. S11 shows that any positive recombination rate completely changes the result compared to the case with no recombination. Again, it is not clear to me whether this concerns the outcome at a stable point equilibrium or transient dynamics as observed after 10000 time units? Naively, I would expect that also in the simulations shown in fig. S10 the allele dis becomes fixed and that therefore in the long run the results with and without recombination should be equal to each other. If this is indeed true, it should be clearly pointed out, and if it not true, then the reason should be made clear.

In fig. 5, as in many place before, it is not clear to me whether we are here looking at transient dynamics or frequencies as they are obtained when the system has reached a point equilibrium. Is it really such that in fig. 5b a different point equilibrium is reached for rho=0.1 and 0.2 than for rho>0.2. Naively, I would expect that rho affects the speed of evolution but not so much the equilibrium frequencies. This needs to be clarified.

Fig. S12 shows figures analogous to fig. 3. While in fig. 3 each column consists of two sub-columns, one for each habitat, this is not the case in figure S12. Why? I think something went wrong with the labelling along the x-axis in this figure.

MINOR COMMENTS

Generally, the writing still contain many small mistakes. The below is such a small selection.

l. 3 acting on -> with respect to

ll. 7-8 on mimetic color pattern -> either: on a mimetic color pattern, or: on mimetic color patterns

l. 97 promotes -> promote

l. 116 track down -> trackv

l. 151 decreased by 1-sigma or decreased by sigma?

l. 186 delete ) at the end

l. 190 Why two alleles. The following text specifies three alleles: r, sim, dis.

l. 195 I do not understand this explanation

l. 197 of choosen partner -> of the chosen partner

l. 267 I wonder whether Mortality would not be a more appropriate header for this section. After all, eq. (12) seems to describe a reduction in population size due to mortality.

No explanation is given for the functional form on the right-hand side of eq. (12). It seems here that baseline survival and survival due to genetic load act multiplicatively. Can this be motivated on biological grounds? I would have found it at least as intuitive if survival would have been given by 1-delta-delta_i, thus that the two mortality factors act additively.

l. 277 genetic densities -> genotype densities

l. 278 (see first Equation) -> (see equation (2))

ll.277-279 Why does tracking genotype densities at a population level makes it an obvious choice that the four demographic processes occur at the same time? What do not see the logic behind the construction of this sentence.

l. 285 range values -> maybe better: parameter intervals

l. 288 drawn from -> taken from

l. 289 and use discrete time step -> and by using discrete time steps

l. provide -> provides

l. 341 slighter -> weaker or smaller

l. 444 I do not follow why what is said in the previous sentence implies (therefore) what is said in this sentence.

l. 447 under attraction rule -> under the attraction rule

l. 480 for trait -> for the color trait (?)

*Evaluation round ***#1**

**#1**

*30 Dec 2019*DOI or URL of the preprint: ** https://doi.org/10.1101/616409**

Version of the preprint:

#### Author's Reply

#### Decision by **Charles Mullon**

Dear authors,

I now have received three expert reviews on your preprint. As you will see below, all reviewers agree that your manuscript tackles an interesting question. However, they also all raise fundamental issues with your mathematical model, making it difficult to interpret your results. Having read your manuscript several times myself, I concur with the reviewers’ assessments. Like them, I had troubles understanding the biological basis of your equations and could not clearly connect your dynamical model with the life history of individuals, from migration to reproduction to death.

I therefore ask that you revise the manuscript to account for the reviewers' comments Please also provide a detailed point-by-point response to all comments. I will then be able to further consider the preprint for recommendation.

In addition to the reviewers' specific comments, I have one of my own. In equation (2), I find it difficult to reconcile the two ways you model the effect of resemblance to local community (via the parameter sigma) and to conspecifics (via lambda) on predation. Both of these effects depend on the same ability of predators to associate particular characteristics to chemical defenses. So how can you model the effects of resemblance to community and conspecifics on predation independently of one another?

I look forward to receiving your revised manuscript.

Best regards,

Charles Mullon.

#### Reviewed by **Tom Van Dooren**, 07 Dec 2019

This is my review of "Evolution and genetic architecture of disassortative mating at a locus under heterozygote advantage" by Maisonneuve et al. The manuscript fits in a series of papers investigating frequency-dependent selection and local adaptation. Novel here is the focus on effects and maintenance of assortative or disassortative mate choice. The manuscript provides arguments to investigate mate choice and genetic architecture in Heliconius numata. I recommend revision. The equations are incomplete, making this manuscript difficult to read as a stand-alone paper. The English needs to be improved as well. I list a number of changes to make below, but there are many more similar errors.

Line 146 here a first matrix appears, I wonder if it is possible to write out an equation for the population dynamics and maybe also for the dynamics of mutants as sums and products of matrices. That would result in a more concise notation, which is remaining readable for several parallel processes even. Then you might also be able to mobilize tools from linear algebra to derive partial analytical results.

Figures 3 and four: the frequencies of haplotypes at equilibrium seem little dependent on the genetic load, as long as it is non-zero. Please work this out further if you can, this relative independence hints at an analytical result.

Specific comments:

L7: I find the juxtaposition of "positive frequency-dependent selection and positive selection confusing. L46: This statement on P. microlepis is outdated. Please consult Lee et al. 2010. https://doi.org/10.1111/j.1095-8649.2010.02620.x L89: is linkage disequilibrium required for the evolution of assortative mating or not? Maybe the statement to make here is stronger than “favored”. L118: local directional selection. Please explain. Please read the manuscript again to correct the numerous gallicisms: L126: two population model L127: spatial variation L161: do your assumptions on sigma imply that there are certain symmetries imposed between the two linked populations? This seems confirmed by L171. L173: why not just use d as the other mortality coefficient? L185, equation (2): it is difficult to see which assumptions are made regarding unpalatability. Unpalability figures in the denominator, so when it becomes larger, mortality seems smaller. Is there a further reason for this way of modelling, next to convenience? L187: Res_{[i],[pop]} needs to be explicitly written out. L191: shouldn’t it be denominator? L192: is toxicity the same here as unpalatability? Please correct or clarify. L221: write “a” italic, to recognize it as an allele. L223: emerges L231, 232: are these population statistics which will be calculated? How will that be done? L234: ”inferred behavior can be compared with estimated mate preferences”: please explain how that calculation would work? L241: between both loci L249: initially, do you mean this matrix will change over the course of a simulation? L256: can cost take on any value between zero and one? L260: can you please make explicit which function coef() is? L263: the normalization is unclear, why the "for All i" when there is no index i in the expression? The frequency f (L243) is a direct ratio of frequency changes (L244)? How does that work? L267: limit L276: are you now mimicking an ODE or is this a true discrete time system? What is the supposed length of the time interval t relative to generation time? L284: if all events occur simultaneously, then individuals can both reproduce, be predated and migrate at the same time? Please prevent this. L319: was then set L423: decreases the proportion of individuals performing self avoidance at equilibrium: I find the change very modest in fig. 5a, as soon as genetic loads are non zero. Discussion L442: more likely to emerge with self -referencing. Please work this out more. In the results text the arguments for this seem more fragmented. In the discussion on L468, you sketch a more nuanced picture already. L471: this sentence is strange. “role” “on mate choice” ? What do you mean? L481: “our theoretical” words are missing L481: has L484: variation L530: predicted to involve haplotypes triggering rejection: this statement is too strong and stronger than what you wrote on L468. Please mollify

#### Reviewed by anonymous reviewer, 06 Dec 2019

In this manuscript the authors aim to model the evolution of disassortative mating. Their model is heavily inspired by the biology of Heliconius butterflies. While this might in principle be an interesting task the present model contains many conceptual oddities as I will detail in the following. As a result, I am very skeptical about the value of this analysis.

The authors aim to set up a discrete time two-locus population genetics model, i.e., a recursion that projects the number of individuals for the different genotypes from one time stop to the next. Unfortunately, the length of the projection interval and how it corresponds to the life history of Heliconius butterflies is not clear to me. Equation (10) describes how the authors envisage that the population size changes over one time step. The right-hand side of this equation consists of the sum of four terms. First, a term delta P that describes the change in adults (at least, that is what I think since eq. (2) detailing this term speaks about predator avoidance). Second, a term delta R describing reproduction (detailed in eq. 7). Third, a term delta M describing migration between two populations (detailed in eq. 3). Fourth, a term delta S describing larval survival (detailed in eq. 9), which is supposed to depend on the genotype due to accumulated deleterious mutations at the locus determining the phenotype. This equation does not make sense to me at a very fundamental level. Why does population change depend on the sum of these four terms? For the authors the explanation is: “We use discrete time simulations where all events (reproduction, predation and migration) occur simultaneously, therefore relevantly stimulating a natural population with overlapping generations.” (p. 14, ll. 283-285). However, modeling a structured populations with overlapping generations requires to follow the densities in the different stages over time and therefore a matrix approach a la Caswell (Matrix Population Models, 2001) in which the abundance of individuals in each stage (larvae, adults before survival and before reproduction, adults after survival at reproduction) is modelled by a separate recursion equation. Currently, each of the four terms depends on the right-hand side of eq. (10) depends on N^t_i,pop, the number of individuals of genotype i in population pop at time t without specifying in what life stage these individuals are. But this matters because the number of larvae and adults cannot be lumped together because the fate of individuals depend on their current state in the life cycle. Said differently, for an individual to complete its life cycle it has to first survive as a larvae, then as an adult and then it can reproduce. Thus, the contribution of an individual to population growth and allele frequency change depends on the product of larval survival, adult survival and fecundity and this is not reflected in the current model formulation.

There are further problems in the formulation of the equations detailing the contributions on the right-hand side of eq. (10). First, eq. (2) is meant to describe the change in population size due to predation (which, I suppose, is acting on adult butterflies). In my view this equation contains several oddities. First, this equation contains the ”mortality coefficients” (l. 171) d(1-sigma) and d(1+sigma). But what are these factors really? Since the model is formulated in discrete time I would expect that they describe the probability to die over one time step. However, since d(1+sigma) can be a number larger than one they cannot be probabilities. As a results, I am uncertain about the meaning of these coefficients. I am also confused by the meaning of the factor Res[i],[res]. In line 187 it is defined as the resemblance of the phenotype expressed by genotype i to the local community. However, on the previous page it is explained that the local community itself is polymorphic as determined by the parameter sigma. Thus, it appears to me that Res[i],[res] is not well defined. Furthermore, the authors explain that the predation risk is also determined by the frequency of the different phenotypes of the focal species in a patch and this is somehow accounted for by the denominator on the right-hand side of equation (2). I find it very unintuitive that the effect of the “background population” and that of the focal population on predation risk are modelled in two different manners. From the perspective of a focal individual I would expect that all that matters is how many other individuals of the same phenotype exist in a local population where it does not matter whether these individuals belong to the same or a different species. Finally, I note that the whole expression on the right-hand side of equation (2) without the final factor N^t_i,pop can be less than -1. This is the case when Res[i],[pop]=0, d is close to one and lambda very small. In this case, this factor equals approximately -d(1+sigma)<-1. This implies that more individuals can disappear from the population due to mortality than are currently present. This should not be possible in a well formulated model.

Second, equation (3) describes migration and simply states that, for each genotype, migration occurs from the larger population to the smaller. This assumption requires that individuals have perfect knowledge about the abundance of their own genotype in both populations, something that I consider biologically unrealistic. I also notice that for their simulations the authors assume sigma=0.5 (see Table 1). In my understanding of the model this means that the patches are actually symmetric. As a result, I would expect that the allele frequencies will be equal across patches, resulting in the absence of migration. I am not sure whether this is the author’s intention?

#### Reviewed by anonymous reviewer, 18 Dec 2019

This manuscript reports on the development and exploration of a model testing the evolution of disassortative mating. The model results in a number of interesting outcomes depending on the genetic architecture and specific parameter values used and provides potentially testable hypothesis for mechanisms maintaining disassortative mating observed in some natural systems. On the whole, I think this is a well-constructed and interesting study, however, there are a few key things that I’d suggest need a bit more clarity. I provide my detailed comments below:

Lines 159-163: Here, the authors introduce the parameter theta (representing spatial heterogeneity), however I’m not sure I understand how it is defined. The manuscript describes it as “the relative proportion of phenotype [A] and [B] in mimicry rings of patch 1 and 2 respectively”. I understand this to mean the proportion of [A] in patch 1 and the proportion of [B] in patch 2. The authors set this parameter to 0.5 (Table 1), which appears to mean that both patches have 50 % [A] and 50 % [B] in their local mimicry rings. Perhaps I misunderstand, but does this mean that both patch 1 and 2 are the same?

Line 243: There is a reference to “equation 1.4”. What is this?

Equation 7: Is the last part of this equation correct? -> N^t*i,pop f^f+1*i,pop . I would have thought it should actually be: N^t*tot,pop f^f+1*i,pop

Lines 286-292: This paragraph is a repeat of the info Table 1. It seems unnecessary.

Lines 298-300: This sentence indicates that polymorphism is maintained with migration at an intermediate rate. But figure 1a seems to indicate that polymorphism is maintained at all tested migration rates greater than 0.

Figure 1: All the panels are incorrectly titled “random mating”.

Line 334: In the paragraph beginning here, the authors discuss invasion from rare tests. In these tests, was the population allowed to reach some kind of equilibrium before adding the mutant at low frequency and watching its fate.

Figure 2: The legend box axis is missing its label.

Figure 3: The results shown in fig 3b are quite confusing. At a high genetic load, two allele types are shown to coexist in patch 1, the two types are c-Ma and b-Mb alleles. B-Mb alleles make sense as they are disassortative, but the existence of c-Ma doesn’t make sense to me since there are no [A] phenotypes in that patch to begin with. Can the authors speculate on what is going on here?

Lines 493-498: The authors outline a couple of examples of polymorphism driven by disassortative mating and the presence of a strong genetic load. Do these studies discussed have measures of polymorphism? I.e. is it intermediate? Highly skewed? I’d like to gauge whether the level of polymorphism seen in this paper is similar to what’s seen in natural populations.

Line 501: I’m not sure I understand this idea here. Do the authors mean “recessive homozygotes” as opposed to “dominant homozygotes” here?