The Complexity of Interactively Learning a Stable Matching by Trial and Error

2020·Arxiv

Abstract

Abstract

In a stable matching setting, we consider a query model that allows for an interactive learning algorithm to make precisely one type of query: proposing a matching, the response to which is either that the proposed matching is stable, or a blocking pair (chosen adversarially) indicating that this matching is unstable. For one-to-one matching markets, our main result is an essentially tight upper bound of ) on the deterministic query complexity of interactively learning a stable matching in this coarse query model, along with an efficient randomized algorithm that achieves this query complexity with high probability. For many-to-many matching markets in which participants have responsive preferences, we first give an interactive learning algorithm whose query complexity and running time are polynomial in the size of the market if the maximum quota of each agent is bounded; our main result for many-to-many markets is that the deterministic query complexity can be made polynomial (more specifically, in the size of the market even for arbitrary (e.g., linear in the market size) quotas.

1 Introduction

Problem and Background In the classic Stable Matching Problem (Gale and Shapley, 1962), there are n women and n men; each woman has a subset of men she deems acceptable and a strict preference order over them; similarly, each man has a subset of women he finds acceptable and a strict preference order over them. The goal is to find a stable matching: a one-to-one mapping between mutually acceptable women and men that contains no blocking pair, i.e., no pair of a woman and a man who mutually prefer each other over their current partner in the matching (or over being unmatched, if one or both are unmatched). In their seminal paper, Gale and Shapley (1962) constructively proved that such a stable matching always exists.

The study of stable matchings within computer science was originated in Knuth’s famous series-of-lectures-turned-book (Knuth, 1976), in which Knuth focused on the stable matching problem as — according to his foreword to the book — an ideal introductory problem for the analysis of algorithms. In this book, Knuth’s first order of business was that the most natural “first attempt” at an algorithm for finding a stable matching fails: if one repeatedly tries to resolve blocking pairs by matching the two participants of the pair with each other and matching their previous partners with each other, this may lead to a cycle.

Some decade and a half following Knuth’s book, Roth and Vande Vate (1990) showed a positive result for a close variant of the algorithm, in which a blocking pair is resolved by matching the blocking pair with each other and leaving their previous partners unmatched; we refer to this algorithm as the myopic algorithm.Roth and Vande Vate showed that if the blocking pair to be resolved is chosen uniformly at random among all blocking pairs (rather than adversarially), then with probability one, the algorithm terminates in finite time, and thus yields a stable matching. Roth and Vande Vate’s result was later generalized by Kojima and ¨Unver (2006) to the canonical “many to many” extension of the stable matching problem, in which each participant on each side of the market (the two sides of the market in this setting are customarily called firms and workers) can be matched to more than one participant on the other side, (e.g., up to a given

Another decade and a half later, Ackermann et al. (2011) focused on the number of iterations it takes the myopic algorithm to reach a stable matching, proving the following negative result: for some preference structure, while convergence with probability one is assured, it may take exponentially many iterations to reach a stable matching. Their result applies already in the one-to-one setting, and thus of course extends to the many-to-many case as well.

Through the lens of learning theory, we can view this type of interaction as corresponding naturally to Angluin’s Equivalence Query model (Angluin, 1988) for learning a binary classifier, and its recent generalization due to Emamjomeh-Zadeh and Kempe (2017). In this interaction model, the learner proposes a candidate structure (here: a matching), and — unless the proposal was correct (here: stable) — learns of one local mistake (here: a blocking pair). Because the queries are very coarse-grained compared to standard preference queries — after all, a query comprises an entire matching and cannot target a specific comparison or even a specific participant — we refer to the query model as the coarse query model. Viewed through this lens, the preceding results state that in the coarse query model, the myopic algorithm could take exponentially long to converge given randomly drawn (legal) responses, and might never converge given adversarial (legal) responses. Taking this learning-theoretic perspective as our point of departure, in this paper, we ask whether more sophisticated interactive learning algorithms in the coarse query model can be made to perform much better than the myopic algorithm, and if so, precisely to what extent.

Beyond a learning-theoretic interest, our main question also has implications from a “purely EconCS” perspective. A famous quote by Kamal Jain (see, e.g., Papadimitriou, 2007) questions the validity of market equilibria as a practical solution concept on the grounds that “if your laptop cannot find it, neither can the market.” This quote refers to a line of work that eventually culminated in showing that even finding approximate equilibria is PPAD-complete (and thus likely to require an exorbitant number of queries), not only in a decentralized manner, but even by a capable central planner. Most famously, this was shown for Nash equilibria (Daskalakis et al., 2009; Chen and Deng, 2006; Rubinstein, 2016), and for Arrow-Debreu market equilibria (Chen and Deng, 2006; Chen et al., 2009, 2017; Rubinstein, 2018), even with severely restricted utility functions. A negative answer to our main question would therefore cast doubt on the possibility that a market designer with access only to coarse queries such as those of “trial and error”could find a stable matching without the ability to ask each participant detailed questions about their preferences (like those that the Gale-Shapley algorithm requires). On the other hand, thinking of a blocking pair as “minimal informative evidence” of the instability of a matching, which is arguably what is naturally observed as a first sign of a matching being about to unravel, a positive answer to our question would mitigate to some extent the negative message of Ackermann et al. (2011), providing evidence that stability can in fact be achieved even by a market designer looking at the market through a far coarser and less informative lens than usually considered.

One-to-one markets We start our analysis in the one-to-one setting of men and women, and consider adversarial responses to the queries of our algorithm. (The model is defined precisely in Section 2.) A trivial observation is that an exponential-query brute-force search over all matchings can find a stable matching. Some further consideration gives rise to the following algorithm: at every step, fix speculative preferences for all participants; the only requirement is that these speculative preferences must be consistent with all of the information gathered so far about the participants’ preferences. (In the first step, any preferences are consistent, as no information has been gathered yet.) The algorithm then proposes a matching that is stable with respect to these speculative preferences. If this matching is unstable, then the response to the query tells the algorithm that some pair (w, m) is blocking; that is, that w prefers m to her assigned partner and that m prefers w to his assigned partner. At least one of these two pairwise comparisons must contradict the speculative preferences and so constitutes new information upon which to base the new speculative preferences in the next step. Noticing that throughout the run of the algorithm, the algorithm can collect at most) pairwise comparisons for each participant’s preferences, we conclude that this algorithm makes at most ) unsuccessful queries, and therefore finds a stable matching in ) queries (and in polynomial running time).

The above cubic algorithm essentially attempts to “sort” the preferences of each participant (unless it is “interrupted” by finding a stable matching). In the worst case, it uses quadratically many of its queries to “sort” each participant’s preferences. While finding a stable matching in polynomially many queries is already appealing, encapsulating a quadratic-query sort within this algorithm is dissatisfying. The challenge with replacing this “inner sort” with an O(n log n)-query sort is that unlike standard sorting algorithms, in the coarse query model, the next comparison is chosen not by the algorithm, but adversarially. It is therefore a priori unclear whether the query complexity of each such “inner sort” can be improved. Our first main result, proved in Section 3, is a positive answer to this question: it is possible, at the beginning of each step of the algorithm, to choose the speculative preferences in a way such that no more than O(n log n) comparisons are ever made for any one participant, implying the following theorem:

Theorem 1.1 (Informal, see Theorem 3.4In the coarse query model, the deterministic query

complexity of finding a stable matching is

The main combinatorial tool that we leverage in our proof of this theorem is Yu’s remarkable Proportional Transitivity Theorem (Yu, 1998). Implementing the resulting strategy for finding the “right” speculative preferences to fix is computationally non-trivial; indeed, note that Theorem 1.1 only expresses guarantees about the number of rounds of interaction, but not about the computational complexity. Fortunately, a Markov chain-based algorithm due to Huber (2006) for sampling linear extensions of a partial order can be leveraged to obtain “good” speculative preferences effi-ciently (i.e., in running time polynomial in n), up to negligible probability of failure:

Theorem 1.2 (Informal, see Theorem 3.9In the coarse query model, a stable matching can be efficiently found with high probability using

Can this quadratic (up to the log factor) query complexity be improved? The existing literature on the query and communication complexity of stable matchings (Ng and Hirschberg, 1990; Chou and Lu, 2010; Segal, 2007; Gonczarowski et al., 2019) provides some very robust lower bounds on the complexity of finding a stable matching. However, these are all obtained in a model in which each query is to the preference list of one participant, or more generally involves only preference lists of participants on one of the sides of the market.The most general of these lower bounds (Gonczarowski et al., 2019) states that Ω() queries are required to find a stable matching. While queries in that model are very flexible, the theorem of Gonczarowski et al. (2019) that proves this bound also proves that in that model, even verifying the stability of a proposed matching requires quadratically many queries. This is in contrast with our coarse query model, in which verifying the stability of a proposed matching takes one query. In general, though, queries in any of the models of those papers are more powerful and flexible than in our coarse query model, in the sense that in those models, a query can specify, for example, a precise pairwise comparison to be revealed. By contrast, queries in our coarse query model are answered adversarially.Given these differences between the models, it is quite curious that in both models, the best known upper bound (up to log factors) is quadratic. Completing the picture, we present a quadratic lower bound in our coarse query model as well:

Proposition 1.3 (Informal, see Proposition 3.10In the coarse query model, the query complexity of finding a stable matching is Ω(. This holds even for the number of expected queries by a randomized algorithm for surely finding a stable matching when the preferences of participants on one side of the market are identical and known, and the preferences of the participants on the other side are i.i.d. uniform.

The proof of this lower bound is less elaborate than those of the quadratic lower bounds of the above-mentioned papers. This provides further evidence regarding the weakness/coarseness of the queries that are allowed in our model, in some sense making the ) upper bound more surprising. The remaining gap between our upper and lower bounds, while in a very different model than that in which Gonczarowski et al. (2019) presented their gap, is between the exact same bounds of Ω(). Both gaps, in very different technical senses, may be thought of as leaving open the interesting question of whether the number of “atomic” queries required to find a stable matching can be significantly less than the Θ() bits of information that are needed to encode the preference lists of all participants.

Many-to-many markets Finally, in Section 4, we generalize our results not only to many-to-one markets, but in fact “all the way” to many-to-many matching markets (Roth, 1984). We consider a market with firms on one side and workers on the other, where each participant has responsive preferences (Roth, 1985). That is, each participant has a strict preference order over the participants on the other side of the market, as well as a quota: the maximum number of participants from the other side of the market to which this participant can be matched. The coarse query model that we consider in this setting is similar to the coarse query model from the one-to-one setting: an interactive learning algorithm may propose a matching, and the response is either that this matching is (pairwise) stable, or a blocking pair. However, the revelation of a blocking pair only reveals that the firm prefers the worker to some worker it is currently matched with (but does not identify this worker), and that the worker prefers the firm to some firm she is currently matched with (but does not identify this firm). Thus, the information given to the algorithm is even coarser than in the one-to-one matching case. It would of course also be natural — and the appropriate choice for some applications — to consider a model that reveals the discarded firm/worker; as we discus in Section 3.6, our analysis from the one-to-one case applies to such a model without any need for adjustment. We focus on the even coarser model that does not reveal the discarded partners in order to keep in line with our motivation above: focusing on the weakest/coarsest model in which the algorithm still naturally obtains information about the underlying preferences.In this model, a single response does not definitively imply even a single pairwise comparison in the preferences of any participant. It therefore becomes much less clear whether or how exponentially many queries (and exponential running time) can be avoided. Indeed, this problem bears a considerable resemblance to a problem that we show to be NP-hard (see Proposition 4.2). Some further thought, though, shows that the above ) algorithm for the one-to-one setting can be conceptually generalized to obtain an algorithm, where maximum quota of any participant:

Theorem 1.4 (Informal, see Theorem 4.3). In the coarse query model, the deterministic query complexity of finding a stable matching is . Furthermore, a stable matching can be found in time using such a number of queries.

For a market with small (constant) quotas, this theorem is satisfactory. For markets in which at least one quota is large (e.g., superpolylogarithmic in the size of the market; for instance, where at least one employer employs a sizable chunk of the market), the query complexity of this theorem is too high, a result of the significantly coarser information the algorithm receives. Our main result for the many-to-many setting is nonetheless a positive one: it is possible to once again carefully set the speculative preferences in each step of the algorithm to obtain a polynomial number of queries:

Theorem 1.5 (Informal, see Theorem 4.5). In the coarse query model, the deterministic query complexity of finding a stable matching is (regardless of the maximum quota q).

This theorem makes no guarantees regarding the running time of an algorithm that makes this number of queries. Indeed, the main problem that we leave open is designing an interactive algorithm for many-to-many markets that runs in time polynomial in n, regardless of q. As discussed in Section 5, we identify intimate connections between this open problem and open problems regarding rapid mixing of certain stochastic processes under constraints that cease to be convex when q is greater than one.

Open Problem 1.6. For the coarse query model, construct an interactive learning algorithm that finds a stable matching in many-to-many, or even many-to-one, markets with an expected polynomial number of queries and an expected polynomial run time.

As mentioned above, the coarse learning model we study can be viewed as a case of the generalized equivalence query model of Emamjomeh-Zadeh and Kempe (2017). Note that the structures to be learned in our setting are the agents’ preferences, not the matching itself; the proposed matchings serve as conduits for receiving additional information about the preferences. Emamjomeh-Zadeh and Kempe (2017) provided a general approach, based on a technique of Emamjomeh-Zadeh et al. (2016), for learning such structures in a number of rounds that is logarithmic in the number of candidate structures. Had this technique been applicable, their technique would have implied our ) bound from Theorem 3.4. However, their framework is provably not applicable to our setting, necessitating the self-contained algorithm and analysis we provide here. We prove this fact in Appendix A.

After this paper appeared at EC’20, the authors of (Bei et al., 2013) contacted us (personal communication, August 20, 2020) and made us aware of their paper, of which we had been unaware until that point. As they pointed out, in their paper, they asked a similar question to ours, for a number of domains including one-to-one matching markets, and proved theorems equivalent to our Theorems 3.4 and 3.9 and a theorem similar to our Proposition 3.10. For Theorems 3.4 and 3.9, while of the same order, the number of queries that our algorithms use is smaller by a constant factor, and our proofs are somewhat shorter, since we used Yu’s Proportional Transitivity Theorem, whereas they proved the result from first principles; however, neither this difference nor the differ-ence in proofs is substantial. For Proposition 3.10, our result is somewhat stronger, by allowing preferences on one side to be known and identical, and preferences on the other side to be i.i.d. random, whereas theirs is a worst-case result; however, the qualitative message is similar. Bei et al. (2013) did not study many-to-one or many-to-many matching markets in their paper, and so our results for these markets, including Theorems 4.3 and 4.5, have no counterparts in their paper.

2 Preliminaries

2.1 The Stable Matching Problem

In a general many-to-many matching market, there is a set workers and a set firms. We will use w and its variants exclusively for workers and f and its variants exclusively for firms. We write for the set of all agents, and a and its variants when we define or reason about notation that applies equally to both sides of agents. Each worker and each firm . Each worker has a preference order (linear order) over a subset of F, which we interpret as the subset of firms to which w prefers being matched over being unmatched. We write to denote that w (strictly) prefers , and write to denote that w prefers f over being unmatched. Similarly, each firm has a preference order over a subset of W.

A many-to-many matching is a set of worker-firm pairs such that the set of firms to which worker w is matched, , is of size at most , and the set of workers to whom firm f is matched, , is of size at most

In the special case in which = 1 for all = 1 for all f, this is exactly the definition of a standard (not necessarily perfect) one-to-one matching. In this case, keeping with standard nomenclature, we refer to the two sides as women W and men M, using the notation w and m and their variants for individuals. In this case, we also write ) for the (unique) partner of a is unmatched, then we define

Given a matching if she is matched to an agent that she does not prefer over being unmatched. A matching is individually rational if no agent is individually blocking. Given a matching is called a ) if all of the following hold:

•

• Eitheror there exists ) such that

• Eitheror there exists ) such that

A blocking pair would prefer to add each other to their respective sets of matches (while removing from these respective sets if these sets are full). A matching is (pairwise) stable (with respect to the preferences ) if it is individually rational and no pair is blocking it. That is, if is stable, then it is individually rational, and furthermore, for each pair (, we have eitheror); in words, a matching is stable iff each agent is only matched to agents that it prefers over being unmatched, and furthermore for each unmatched pair (w, f), either w is matched to her quota of partners and prefers all her current partners to f, or f it matched to its quota of partners and prefers all its current partners to w.

2.2 The Interactive Learning Model

Our goal is to design an algorithm to produce/learn a stable matching. The challenge is that are unknown to the algorithm; nonetheless, the algorithm seeks to find a stable matching with respect to them, by gradually learning about the individuals’ preferences. The learning proceeds in rounds. In each round t, the algorithm proposes a matching is stable with respect to the (initially unknown) , the process terminates. Otherwise, the algorithm learns about one blocking pair ) or one individually blocking agent. If there are multiple possible blocking pairs (and/or individually blocking agents), the one that is revealed to the algorithm is chosen adversarially. Based on this new information, the algorithm can then compute a new proposed matching , and so on.

2.3 Reduction to Full Preference Lists

We will say that a matching market has full preference lists if a) All preference lists are full: each agent prefers every agent on the other side of the market to being unmatched, and b) Quotas are balanced: . (In a one-to-one market this translates to say that a matching in such a market is perfect if every participant is matched to her full quota of partners. In such markets, a matching is individually rational if and only if it is perfect. Restricting attention to perfect matchings, a matching is stable if and only if it is not blocked by any pair. When stating our learning algorithms and analyzing them, it will be convenient and reduce clutter to restrict attention to markets with full preference lists. Since we will design our algorithms to only output perfect matchings, we will only have to deal with blocking pairs (rather than individually blocking agents) as responses. Luckily, by a well-known reduction in stable matching theory, this restriction is without loss of generality. See Appendix B for the details of the embedding of the general problem in the special case of full preference lists, as well as for a discussion and a word of caution on applying other well-known reductions within the context of our coarse query model.

2.4 Additional Notation

When = 1 for the blocking pair (), the algorithm can infer that and ). In the more general case, the algorithm only learns that there exist ) such that . In Section 3, we will briefly discuss a model in which the algorithm also learns the exact identity of ; unsurprisingly, in this case, the problem directly reduces to the case in which = 1 for all

In either case, the algorithm obtains information about the agents’ preferences in the form of “a must precede at least one element of S in a particular preference order.” We write such constraints as (the agent to whose preference order the constraint applies will be clear from context), and also as is a singleton. We will consider sets C of such constraints. We say that a ranking if it satisfies all constraints in C. Throughout, all of our sets C will be such that at least one ranking (namely, the unknown true ranking) is consistent with C. When C is a set consisting only of constraints of the form , i.e., involving only the relative order of two elements, then C precisely defines a strict partial order, and the rankings consistent with C are exactly the linear extensions of C.

3 One-to-One Matchings

In this section, we focus on the case in which all agents have a quota of 1, i.e., the case of women and men. We will write . Recall that in this special case, the revelation of a blocking pair ) implies that ); this information reduces the number of (remaining) possible preferences, by reducing either the number of possible preference orders or the number of possible preference orders

3.1 High-Level Overview

The high-level idea of our approach is to choose in such a way that whichever blocking pair the adversary reveals, the number of possible preference orders for at least one of w, m not only decreases, but in fact decreases by a constant factor.

For each agent be the set of all preference constraints that can be inferred from the blocking pairs revealed before (but not including) time t. That is, for a woman w, we have ; similarly for men. Let denote the set of all linear extensions of

We call a scenario (at time t) a vector giving for each woman and each man a preference order consistent with all observations so far; that is, a vector (denote the number of remaining possible preference orders for the total number of scenarios. Initially, for every agent a, there are ! possible preference orders, so

Our algorithm will ensure that the number of possible preference orders for at least one agent, and thus the number of possible scenarios, decreases by a constant factor in each round t. Thus, after at most ) rounds, there is at most one scenario remaining, at which point the algorithm will have identified all of the true preference orders . By running the Gale-Shapley Algorithm using those preference orders, the algorithm then ensures that it has found a stable matching.

3.2 Representative Rankings and Proportional Transitivity

The basic idea of our algorithm is the same as in the ) algorithm of Section 1. In each round, the algorithm chooses a speculative preference order for each agent, and then computes a stable matching with respect to those speculative preference orders. The important improvement is to choose the speculative preference orders for all agents carefully. Specifically, we want all preference orders to be representative of the corresponding , in the following sense.

Definition 3.1 (Representative preference order). Let C be a set of constraints (forming a partial order), and P the set of all linear extensions of ) be the fraction of linear orders . We say that iff for all pairs (whenever

It is not a priori clear that an -representative preference order even exists for a constant This is established by the following lemma.

be arbitrary, and let be the set of all ordered pairs (such that at least an fraction of the linear extensions of C rank a ahead of is acyclic, and thus defines a partial order. In particular, every linear extension of -representative preference order.

The proof of Lemma 3.2 relies heavily on Yu’s Proportional Transitivity Theorem (Yu, 1998).

Theorem 3.3 (Theorem 3 of Yu, 1998)constant. Let P be any partial order. For every

Theorem 3.3 is remarkable: it states that if a large enough fraction of linear extensions of a partial order have , and (at least) the same fraction have least) the same fraction have = 1, this statement is of course trivial (because it exactly captures transitivity of partial orders), but that it should hold for a constant p < 1 is not at all obvious.

Proof of Lemma 3.2. Assume (for contradiction) that ) contains a cycle 1. Applying Theorem 3.3 with repeatedly, we obtain that . But this gives a contradiction with the fact that

The total number of iterations required by Algorithm 1 is characterized by the following theorem.

Algorithm 1 terminates after at most iterations.

8, by Lemma 3.2, the algorithm can always find representative preference orders to use as speculative preference orders.

Consider any iteration t in which a blocking pair (with respect to the true preferences ) was revealed. Because the Gale-Shapley Algorithm finds a stable matching, the pair {w, m} was not blocking for the preferences used in running the Gale-Shapley Algorithm. In particular, this means that at least one of w, m preferred their assigned partner over the other, i.e., . Without loss of generality, assume that

Remark 1. The number of iterations is , but because is an absolute constant (e.g., in the efficient implementation to follow), the number of iterations is simply

The implementation of Algorithm 1, as given above, is not efficient, because we did not specify how to find representative preference orders. One way to do so is to enumerate all linear extensions of , and use this enumeration to compute the set Lemma 3.2 explicitly. Given ), finding a linear extension is simply a topological sort. Of course, the enumeration may take time Ω(n!). In the next section, we give an efficient randomized implementation, which terminates in ) iterations with high probability.

3.4 An Efficient Randomized Implementation

In this section, we describe and analyze a randomized algorithm for sampling representative preference orders for a set of constraints (partial order) C. The algorithm is given as Algorithm 2, and relies heavily on the following theorem of Huber (2006), which is based on the Karzanov-Khachiyan Chain (Karzanov and Khachiyan, 1991) and a bounding chain:

Theorem 3.5 (Lemma 10 of Huber, 2006). There exists a randomized algorithm for sampling linear extensions of a partial order C, with the following guarantees:

1. The algorithm samples linear extensions exactly uniformly from all linear extensions of C.

2. The expected running time of the algorithm is at most 4

3. For every integer k, with probability at least 1, the running time of the algorithm is at most

The two key observations about Algorithm 2 are the following: (1) the set S of sampled constraints is acyclic with high probability, ensuring (fast) termination of the algorithm; and (2) with high probability, ˆis 0.9-representative.

To prove both statements, we begin with a simple tail bound, capturing that the ˆaccurate estimates of denote the event that all estimates ˆsimultaneously satisfy that

Lemma 3.6. Prob[

Proof. First focus on one pair (). By Theorem 3.5, the linear extensions are sampled exactly from the uniform distribution.Therefore, we have ). The Hoeffding Bound(Hoeffding, 1963) guarantees that Probbound over allnow implies that all estimates are accurate to within an additive 0.05 with probability at least 1

Lemma 3.7. Under E, the set S of preference constraints is acyclic.

Proof. Because we are conditioning on the event E, for all constraints , we must have that 8. Acyclicity now follows from Lemma 3.2, because ) for the set defined in that lemma.

Lemma 3.8. Under E, the preference order ˆreturned by Algorithm 2 is 0.9-representative.

be such that 9. Because we are conditioning on E, this implies that ˆ85, so by definition, is a linear extension of S, we get that

We now return to the analysis of Algorithm 1, proving the following theorem: is used to compute the specu- lative preference order in Line 6 of Algorithm 1, then Algorithm 1 terminates after at most

iterations with high probability; furthermore, with probability at least 1iteration runs in time

Proof. Consider running Algorithm 1 for Θ() iterations, with a suitably large constant. Let N be the number of those iterations in which the event E held. By Lemma 3.6, N is lowerbounded by a Binomialrandom variable. By Chernoff Bounds, with high probability, ), again for a suitably large constant.

During any iteration t in which E did not hold, we simply (pessimistically) bound For the remaining iterations, Lemma 3.8 guarantees that the are 0.9 representative. Therefore, using 9, Theorem 3.4 guarantees that when ), the algorithm terminates with a stable matching.

Finally, we analyze the running time. The running time of Algorithm 2 is dominated by drawing O(log n) independent uniformly random linear extensions of C. By Theorem 3.5 with k = 6, each draw takes time ) with probability at least 1. The failure probability of event E is at most , so with probability at least 1, the number of times Algorithm 2 is repeated before succeeding is at most 5, and in particular constant. Next, we take a union bound over the following events: (1) each of the O(log n) calls to Huber’s algorithm does not take too long, and (2) Algorithm 2 is not repeated more than 5 times. By this union bound, Algorithm 2 runs in time ) with probability at least 1

In each iteration t, Algorithm 1 in Line 6 needs to “construct” -representative preference orders for for all agents a. In the first iteration, this is trivial, as any order is -representative. In subsequent iterations, only the orders for the two agents who were part of blocking pairs may have to be updated. Thus, each iteration requires two calls to Algorithm 2, for a total time of By a union bound over these two invocations of Algorithm 2, each iteration of Algorithm 1 takes time ) in calls to Algorithm 2, with probability at least 1). This (high-probability) running time of ) dominates the running time of the Gale-Shapley Algorithm and of updating the constraint sets in each iteration.

Finally, by a union bound over all ) iterations of Algorithm 1, the running time guarantee holds with probability at least 1

Proof. The high-level idea of the proof is quite straightforward, but a formal proof needs some care in the details. We begin with the very straightforward case of adversarially chosen preferences, and then discuss the intuition of the proof for random preferences. Let the men ranked by the common (and known) preference of all women. For each man be his preference order over women. As is well known, the unique stable matching when women have common preferences is the result of a serial dictatorship of the men in the order of these preferences. That is, is matched with his first choice, is matched with his first choice among the remaining women, and so on.

An adversary can use the following strategy. Whatever matching the algorithm proposes in the first iteration, the woman w matched to will be adversarially placed as his last-ranked choice, and the algorithm will reveal a blocking pair of with some other woman, and will continue revealing that same blocking pair whenever is matched with w. Whatever different woman is next matched with (whether it is the one that was revealed or not) will be adversarially placed as ’s second-to-last choice, and so on. Once the algorithm has matched n women, the the last woman to be matched with him is known to be his first choice. At this point, the adversary will continue with the same strategy for with the remaining Because only blocking pairs involving have been revealed up to that point, the ranking of completely undetermined. The adversary continues in this way for all men in order. In total, this adversary strategy will force any algorithm to use at least ) rounds of queries.

When the men have i.i.d. uniformly random preference rankings, the high-level idea remains the same. For any i, until the algorithm has identified the correct matches of adversary will not reveal any blocking pair involving . However, in contrast with the case of adversarial rankings, once all of these 1 matches have been identified, the adversary cannot ensure any more that the woman is matched is the lowest remaining in his ranking. Nevertheless, the adversary can still reveal a block with the woman immediately preceding ’s ranking. Intuitively (we will make this argument precise, of course), this “does not reveal much information” besides the fact that ’s correct match, so the algorithm in expectation still has to try a constant fraction of all remaining women as

To precisely define the adversary’s behavior in the random-preferences case, fix realized preferences for all men. We inductively define the following women and sets of women. is the set of all women. For each be the top choice of among the women in , according to the uniformly random (but fixed for now) preference ranking . Then, let unique stable matching has each matched with . The adversary behaves as follows: if the proposed matching is not stable, then let . Reveal the blocking pair consisting of , where the latter is the ) among the set according to

Making the preceding informal argument — about this adversary not revealing “much information” — precise requires some care. In particular, we must reason about what the algorithm has “learned” about the uniformly random preferences of the men. Our analysis will therefore use the Principle of Deferred Decisions, and consider the random preferences of the men as if they are drawn in stages over the course of the algorithm’s execution. However, these deferred random choices will of course be sound, as we will show that the final preferences are i.i.d. uniformly random, regardless of the matchings proposed by the learning algorithm A. Due to the probabilistic nature of our analysis, will henceforth be treated as random variables.

In each round t, the algorithm A proposes a matching . The adversary reveals a blocking pair involving if and only if . Naturally, an adversary following this strategy will reveal to the algorithm that Because blocking pairs involving are only revealed in this case, we obtain the following first key property: (1) each blocking pair for and some other . Since the adversary will always reveal a w who is the immediate predecessor of , this implies a second key property: (2) , which to avoid clutter we denote by , defines a union of disjoint chains on

In our analysis, we consider as already drawn the following information about all of the at all times t, which contains all of the information known to the algorithm: (1) The preference constraints (the blocking pairs involving already revealed to the algorithm), and (2) possibly the realization of the random , if it has been revealed, in which case we consider the entire preference relation as having been drawn already. Specifically, this information is tracked using the variables , where initially we have . Whenever the adversary commits to a blocking pair, we update the variable , and whenever the adversary commits to the value of , we update the variable to reflect the realized value of and update the variable to reflect the realization of all of the number of chains in the partial order defined by . The adversary’s strategy alongside the gradual draw of preferences, as viewed from the Deferred Decision point of view of our analysis, is given in Algorithm 3.

We first verify that Algorithm 3, using deferred decisions, generates uniformly random rankings regardless of the proposed matchings . To see that this, recall the key properties (1) and (2) above, which state that the revealed constraints into disjoint chains of immediate predecessors. In a uniformly random permutation, the order of these chains is uniformly random. In particular, when there are chains, each woman without a known predecessor is first in the ranking with probability . The bias of the random coin flip in Line 8 guarantees that when any woman w without a known predecessor is considered, she is revealed to be first in the ranking with exactly this probability . Conditioned on not being first overall in the ranking, w has a predecessor who is uniformly random among the last elements of all other 1 chains. The draw of in Line 10 exactly matches this distribution.

Finally, we analyze the number of iterations that A needs. In Step 8 of Algorithm 3, the probability that is revealed (and thus the algorithm can move on to finding

and the first iterations for ), we obtain that the probability of finding is at most . Thus, in expectation, it takes at least iterations each to identify , implying that A needs at least iterations in expectation.

3.6 Many-to-Many Matchings with Extra Information

Our algorithms and analysis would have remained essentially unchanged for the case of many-to-many matchings had the algorithm, along with a blocking pair ) such that . In this case, the extra information would have still allowed the algorithm to add at least one new constraint in each iteration. So Algorithm Sample-Representative-Preference (Algorithm 2) could have been used entirely unchanged, and the standard many-to-many variant of the Gale-Shapley Algorithm could have been used to compute a stable matching with respect to the sampled preference orders.

Computing a stable many-to-many matching with respect to unknown preferences becomes significantly more challenging when the algorithm only sees the blocking pair (w, f), with no additional information about which agent in ) should be “swapped out.” The modified algorithms and analysis for this case are the subject of Section 4.

4 Many-to-Many Matchings

In this section, we describe and analyze algorithms for many-to-many matchings when the response to a query reveals only a blocking pair (w, f), but not the current matches such that

We obtain two results. First, in Section 4.1, we assume that all quotas are bounded, and write . We give a simple learning algorithm that takes at most ) rounds of interaction, with computation time at most ) per round — this algorithm is a conceptual generalization of the simple ) algorithm for one-to-one matchings discussed in Section 1. Then, in Section 4.2, we present a more involved algorithm that only requires ) rounds of interaction, regardless of the quotas. While the number of interactions is always polynomial, the computational requirements are not: we are currently not aware of any implementation that is not exponential in n. We show that being able to efficiently sample preference orders subject to constraints more involved than pairwise comparisons would be sufficient to obtain an efficient implementation.

The algorithms for both cases are based on modifying Algorithm 1. The main difference is the type of information that the algorithm obtains from a blocking pair. Whereas previously, a blocking pair {w, f} revealed that ), it now only reveals that w prefers f over (at least) one participant in

For the modified algorithm, will be a set of constraints of the form ; in these constraints, all sets S will have the same size, namely, . The algorithm uses speculative preference orders that are “representative” of in a sense that we will describe below. The modified learning algorithm is given by Algorithm 4.

The difference between the analyses in Sections 4.1 and 4.2 is solely in the choice of the speculative preferences, and the corresponding guarantees that can be derived on the progress of Algorithm 4 and the running time of the implementation. Specifically, in Section 4.1 we will allow the speculative preferences to be arbitrary preferences that are consistent with , while in Sec- tion 4.2 we will use speculative preferences that are representative of with respect to a notion of representativeness that will be defined iteratively in terms of fractions of being ranked last.

4.1 Small Quotas

The simplest way to implement Line 6 in Algorithm 4 is to let preference order consistent with all the constraints in . We show that such a preference order can be found efficiently.

Proposition 4.1. Assume that there exists at least one linear order ˆconsistent with C. Then, such an order can be found in time

Proof. The algorithm is a generalization of the standard Topological Sort algorithm for directed acyclic graphs. It scans through all the constraints and counts, for each relevant or , depending on whether a preference order is being constructed for some some ), how many constraints there are of the form . Because each constraint implies that a cannot be last in the preference order, and we assumed that there is a linear order ˆconsistent with C, there must be at least one a for which there are no constraints

The algorithm picks any such a and puts it in the last remaining place of the preference order. It then removes all constraints of the form , because they have been satisfied by placing a last. It then continues with the remaining constraints for the remaining positions.

The algorithm takes n iterations, each of which take time O(C) for the scan and removal.

While the generalization of Topological Sort to the more general types of constraints is fairly straightforward, we remark on a subtlety that does not arise for standard Topological Sort. A slight generalization of the types of constraints we consider is to allow constraints both of the form “a must precede at least one element of must follow at least one element of a singleton, these types of constraints are the same. When the sets S can be larger, the types of constraints are different, and we showed that when only one type of constraint is allowed, a feasible assignment can be found efficiently if one exists. (And otherwise, the algorithm can determine that no feasible assignment exists.) This ceases to be true when both types of constraint are combined:

Proposition 4.2. The following problem is NP-hard: given a set C of constraints, each of which is of the form (1) must precede at least one element of must follow at least one element of , decide whether there exists a linear order on the elements satisfying all the constraints. NPhardness holds even when

Proof. We give a straightforward reduction from 3SAT. Given an instance of and , each of which is a disjunction of three literals, the reduction constructs an instance with 2n+1 elements and n+m constraints. There are two elements for each variable , and one more element ˆy. For each variable , there is a constraint that ˆy must follow at least one of . For each clause , there is a constraint that ˆy must precede at least one of the elements corresponding to the ; for example, if , then the constraint says that ˆy must precede at least one of the elements

The correctness of the reduction can be easily observed by interpreting all the literals after ˆy as true. The first set of constraints captures that no assignment may make both the second set of constraints captures that each clause must have at least one true literal.

We can now prove our main theorem for the simple algorithm:

Theorem 4.3. If the algorithm of Proposition 4.1 is used to compute the speculative preference order in Line 6 of Algorithm 4, then Algorithm 4 terminates after at most iterations; furthermore, each iteration runs in time at most

Proof. The proof follows in the footsteps of that for Theorem 3.9. Assume that in iteration t, a blocking pair is revealed, implying that for some ). Again, by the non-blocking property of a stable matching with respect to the , we can infer that at least one of prefers each of holds. Without loss of generality, holds. In particular, because satisfies all constraints in , we obtain that , implying that

Therefore, after each round, the quantity Φincreases by at least 1. The size of each constraint set is upper-bounded by ). This implies that Φ), so the process terminates after at most ) iterations. Similarly, because the algorithm of Proposition 4.1 runs in time ) for each agent (and since the running time of each iteration is dominated by this), each iteration runs in time at most

For the one-to-one matching case, as discussed in the introduction, this proves a bound of on the deterministic query complexity, which is obviously worse by a factor of compared to the bound we obtained in Section 3 with a more careful choice of speculative preference orders.

4.2 General Quotas

In this section, our goal is to improve the guarantee of Theorem 4.3 to eliminate the (exponential) dependence on q. As in Section 3, we do so by a more careful choice of the speculative preference orders used in Algorithm 4. Unlike in Section 3, our implementation will not be computationally efficient.

Our new sampling algorithm is a modification of Algorithm 2. It repeatedly estimates the fractions of valid preference orders (consistent with the given constraints) that have an element a in last position, among a set of remaining elements. For any set C of constraints, set U, and agent ) denote the fraction of preference orders consistent with C that have a ranked last among the agents in U. The algorithm, in each iteration t, will have a set of elements still in need of ranking, and chooses the last element based on the ) (or estimates thereof in Section 4.3). It is detailed in Algorithm 5.

Akin to the analysis of representative preference orders in Section 3, we will show that the preference order ˆproduced by Algorithm 5 guarantees a reasonable reduction in the number of remaining consistent preference orders.

be the preference order returned by Algorithm 5. Let be any constraint that holds in more than a 1fraction of preference orders consistent with at least one element of

Because the algorithm chooses ¯), it ensured in particular that . In particular, because no element of S had been previously placed, and since was assumed to be a constraint holding in (strictly) more than a 1fraction of the pref- erence orders consistent with C, we get that ¯, so by definition of t, an element of S must be placed in iteration t. This implies that precedes at least one element of S, namely, ¯

is used to compute the speculative preference order in Line 6 of Algorithm 4, then Algorithm 4 terminates after at most iterations.

Proof. As in the proof of Theorem 3.4, consider any iteration t in which a blocking pair (with respect to the true preferences ) was revealed. Because the Gale-Shapley Algorithm finds a stable matching, the pair {w, f} was not blocking for the preferences used in running GaleShapley. In particular, this means that at least one of w, f preferred all of their assigned matches over the other. Without loss of generality, assume that

By Lemma 4.4, applied to , we infer that at least a fraction of the preference orders consistent with . The revelation of the blocking pair {w, f} rules out all of the linear orders . Therefore, . In turn, this implies that

We obtain that Algorithm 4 terminates after at most ) iterations.

As in the case of Theorem 3.4, Theorem 4.5 makes no statement about computational efficiency. Unlike in the case of one-to-one matchings, we are not aware of any efficient implementation. However, in the remainder of this section, we will show that as with one-to-one matchings, an ability to sample (nearly) uniformly preference orders from among all satisfying a given constraint set would be sufficient to achieve an efficient implementation. The obstacles to sampling are discussed in more depth in Section 5.

4.3 Estimations via Sampling

Instead of the actual values ), Algorithm 5 can be run with estimates ˆcally, in each iteration t, the algorithm can sample ) linear orders independently and uniformly from the set of all linear orders satisfying all constraints the fraction of sampled orders that have a ranked last among . The algorithm uses the estimates ˆ) in place of the actual ). We will refer to this variant of Algorithm 5 as Sample-Representative-Preference-Large-Constraints.

We reiterate that we are not aware of any computationally efficient algorithm to sample such , even approximately uniformly. Nonetheless, we will analyze the impact of such sam- pling, and show that the remainder of the algorithm works as before.

We define (high-probability, as we will show) events , which jointly will guarantee that the samples are sufficiently accurate. Specifically, is defined as the event that all estimates ˆin round t are accurate estimates to within at most . More formally, the event as. We begin by showing a suitable high-probability guarantee on these events.

Lemma 4.6. For all t and all sets U, we have that Prob[

Proof. Consider any iteration t and condition on be arbitrary. Because the are sampled uniformly randomly, by definition, ). The Hoeffding Bound

guarantees that Prob. A union bound over all of the at most now implies that all estimates are accurate to within an additive

Sample-Representative-Preference-Large-Constraints(C) are accurate up to an additive

. By the union bound, Prob[. The key implication of the event E is captured by the following lemma, which is a slightly weaker version of Lemma 4.4:

Lemma 4.7. Assume that the event E happened, and let ˆbe the preference order returned by be any constraint that holds in more than a 1fraction of preference orders consistent with at least one element of S.

Proof. The proof is nearly identical to that of Lemma 4.4. In the iteration when an element of is placed for the first time, the algorithm uses the estimates ˆinstead of the event E, the fact that by assumption and that the estimate is accurate to within an additive means that ˆ, so again, cannot be placed in this iteration.

We obtain the following theorem.

(i.e., Al- gorithm 5 when run using random sampling-based estimates) is used to compute the preference order in Line 6 of Algorithm 4, then Algorithm 4 terminates after at most iterations with high probability; furthermore, each iteration requires drawing samples for n iterations of the for loop of Algorithm 5, for each of the n agents. Thus, each iteration requires drawing

Proof. The proof is practically identical to that of Theorem 3.9. Because E has high probability, if the algorithm is run for Θ() iterations, the number of iterations in which its speculative preference orders are such that they sufficiently reduce the number of candidate scenarios is also Θ(). The other iterations do not guarantee progress, but cannot increase the number of remaining scenarios. Thus, up to constant, we achieve the same guarantees as the deterministic Theorem 4.5.

5 Conclusions

We presented interactive algorithms for learning stable matchings, both in the case of one-to-one matchings and in the case of many-to-many matchings. The algorithm repeatedly proposes a matching; if the matching is not stable, the algorithm is informed of one blocking pair. We showed that in the case of one-to-one matchings, there is a computationally efficient interactive algorithm that finds a stable matching after at most ) rounds of interaction with high probability. We also showed that any algorithm for finding a stable matching in this model requires Ω(rounds of interaction, even in expectation.

Our algorithm extends to the case of many-to-many matchings, where the guarantee on the number of rounds deteriorates to However, this algorithm is not computationally efficient; an “efficient” version can be constructed which requires ) iterations when each agent can be matched with at most q other agents.

The most immediate open question is whether the algorithm with ) rounds of interactions can be implemented in polynomial time. A sufficient condition for doing so would be to be able to sample (approximately) uniformly from the set of all linear orders consistent with a set of constraints of the form “element z must precede at least one element from the set an interesting sampling question in its own right, and it also has applications in other scenarios in which a principal is trying to learn preference rankings of agents from observed behavior, e.g., in the case of learning valuations of single-minded buyers from their product choices.

This sampling problem appears non-trivial. As we discussed (and heavily used), the special case when all sets S have size 1 is exactly the problem of sampling a linear extension of a given partial order. This is accomplished by the well-known Karzanov-Khachiyan chain (Karzanov and Khachiyan, 1991), which starts from any feasible linear extension (e.g., computed by Topological Sort), and repeatedly picks a uniformly random index i, swapping elements i and i + 1 with probability if doing so is consistent with the partial order constraints, and doing nothing otherwise. Karzanov and Khachiyan show rapid mixing of this chain by studying conductance properties of

the order polytope associated with the partial order (Stanley, 1986). Bubley and Dyer (1999) show that giving a slight bias towards sampling indices i towards the middle of the ranking leads to a much simpler analysis based on a beautiful and elementary path coupling. One could still aim to prove rapid mixing for the Karzanov-Khachiyan chain, applied to constraints in which the sets S are larger than singletons. Unfortunately, the natural representation of constraints in n-dimensional space ceases to be convex, so it is not clear whether the conductance arguments can be applied. Similarly, the specific path coupling of Bubley and Dyer also seems to break down for more complex constraints. An open question with fewer implications, but possible technical interest, is whether the number of iterations can be reduced from ) when all agents’ capacities are bounded by q. One would not expect a better bound, because even when the rankings are not constrained at all (e.g., in the first iteration), a constraint of the form only rules out a fraction of rankings. A sufficient result would be a generalization of Yu’s Proportional Transitivity Theorem (Yu, 1998). A possible formulation of such a result would be the following. For a given set C of constraints of the form , where each be the set of all constraints with such that at least a fraction of all rankings consistent with Theorem 3.3 can then be restated as saying that when q = 1, for every , we have that

) such that when each constraint , for every constraint set C satisfies C

from that of Yu (1998). Yu’s proof is based on an application of Shepp’s XYZ inequality (Shepp, 1982), combined with a dimension reduction technique of Ball (1988) (see also a similar proof in Kahn and Yu (1998)). Ball’s result very heavily relies on the set under consideration (here, the order polytope) being convex, since it relies on the Brunn-Minkowski Inequality. As discussed in the preceding paragraphs, the natural n-dimensional representation of larger constraints ceases to be convex, so one of the key techniques will fail to be available.

References

Heiner Ackermann, Paul W. Goldberg, Vahab S. Mirrokni, Heiko R¨oglin, and Berthold V¨ocking. 2011. Uncoordinated Two-Sided Matching Markets. SIAM J. Comput. 40, 1 (2011), 92–106. A preliminary version appeared at EC’08.

Dana Angluin. 1988. Queries and Concept Learning. Machine Learning 2 (1988), 319–342.

Keith Ball. 1988. Logarithmically concave functions and sections of convex sets in Mathematica 88, 1 (1988), 69–84.

Xiaohui Bei, Ning Chen, and Shengyu Zhang. 2013. On the Complexity of Trial and Error. In Proc. 45th ACM Symp. on Theory of Computing. 31–40.

Russ Bubley and Martin Dyer. 1999. Faster Random Generation of Linear Extensions. Discrete Mathematics 201, 1 (1999), 81–88.

Xi Chen, Decheng Dai, Ye Du, and Shang-Hua Teng. 2009. Settling the Complexity of Arrow- Debreu Equilibria in Markets with Additively Separable Utilities. In Proc. 50th IEEE Symp. on Foundations of Computer Science. 273–282.

Xi Chen and Xiaotie Deng. 2006. Settling the Complexity of Two-Player Nash-Equilibrium. In Proc. 47th IEEE Symp. on Foundations of Computer Science. 261–270.

Xi Chen, Dimitris Paparas, and Mihalis Yannakakis. 2017. The Complexity of Non-Monotone Markets. J. ACM 64, 3 (2017), 20:1–20:56.

Jen-Hou Chou and Chi-Jen Lu. 2010. Communication Requirements for Stable Marriages. In Proc. 7th Intl. Conf. on Algorithms and Complexity (CIAC). 371–382.

Kim-Sau Chung. 2000. On the Existence of Stable Roommate Matchings. Games and Economic Behavior 33, 2 (2000), 206–230.

Constantinos Daskalakis, Paul W. Goldberg, and Christos H. Papadimitriou. 2009. The Complexity of Computing a Nash Equilibrium. SIAM J. Comput. 39, 1 (2009), 195–259. A preliminary version appeared at STOC’06.

Effrosyni Diamantoudi, Eiichi Miyagawa, and Licun Xue. 2004. Random paths to stability in the roommate problem. Games and Economic Behavior 48, 1 (2004), 18–28.

Ehsan Emamjomeh-Zadeh and David Kempe. 2017. A General Framework for Robust Interactive Learning. In Proc. 31st Advances in Neural Information Processing Systems. 7082–7091.

Ehsan Emamjomeh-Zadeh, David Kempe, and Vikrant Singhal. 2016. Deterministic and Proba- bilistic Binary Search in Graphs. In Proc. 48th ACM Symp. on Theory of Computing. 519–532.

David Gale and Lloyd S. Shapley. 1962. College Admissions and the Stability of Marriage. Amer. Math. Monthly 69, 1 (1962), 9–15.

Yannai A. Gonczarowski, Noam Nisan, Rafail Ostrovsky, and Will Rosenbaum. 2019. A Stable Marriage Requires Communication. Games and Economic Behavior 118 (2019), 626–647. A preliminary version appeared at SODA’15.

Julien Grenet, Yinghua He, and Dorothea K¨ubler. 2019. Decentralizing Centralized Matching Markets: Implications from Early Offers in University Admissions. (2019). Mimeo.

Avinatan Hassidim, D´eborah Marciano, Assaf Romm, and Ran I. Shorrer. 2017. The Mechanism is Truthful, Why aren’t You? American Economic Review Papers and Proceedings 107, 5 (2017), 220–24.

Wassily Hoeffding. 1963. Probability Inequalities for Sums of Bounded Random Variables. J. Amer. Statist. Assoc. 58 (1963), 13–30.

Mark Huber. 2006. Fast Perfect Sampling from Linear Extensions. Discrete Mathematics 306, 4 (2006), 420–428.

Jeff Kahn and Yang Yu. 1998. Log-Concave Functions and Poset Probabilities. Combinatorica 18, 1 (1998), 85–99.

Alexander Karzanov and Leonid Khachiyan. 1991. On the Conductance of Order Markov Chains. Order 8, 1 (1991), 7–15.

Bettina Klaus and Flip Klijn. 2007. Paths to stability for matching markets with couples. Games and Economic Behavior 58, 1 (2007), 154–171.

Donald E. Knuth. 1976. Marriage stables et leurs relations avec d’autres probl`emes combinatoires. Les Presses de l’Universit´e de Montr´eal.

Fuhito Kojima and M. Utku ¨Unver. 2006. Random paths to pairwise stability in many-to-many matching problems: a study on market equilibration. International Journal of Game Theory 36 (2006), 473–488.

Jinpeng Ma. 1996. On Randomized Matching Mechanisms. Economic Theory 8, 2 (1996), 377–381.

Yusuke Narita. 2018. Match or Mismatch? Learning and Inertia in School Choice. (2018). Mimeo.

Cheng Ng and Daniel Hirschberg. 1990. Lower Bounds for the Stable Marriage Problem and Its Variants. SIAM J. Comput. 19, 1 (1990), 71–77.

Christos H. Papadimitriou. 2007. The Complexity of Finding Nash Equilibria. In Algorithmic Game Theory, Noam Nisan, Tim Roughgarden, Eva Tardos, and Vijay V. Vazirani (Eds.). Cambridge University Press, 29–52.

Baharak Rastegari, Anne Condon, Nicole Immorlica, Robert Irving, and Kevin Leyton-Brown. 2014. Reasoning about optimal stable matchings under partial information. In Proc. 15th ACM Conf. on Economics and Computation. 431–448.

Alvin E. Roth. 1984. Stability and Polarization of Interests in Job Matching. Econometrica 52, 1 (1984), 47–57.

Alvin E Roth. 1985. The college admissions problem is not equivalent to the marriage problem. Journal of Economic Theory 36, 2 (1985), 277–288.

Alvin E. Roth and Marilda A. Oliveira Sotomayor. 1990. Two-Sided Matching: A Study in GameTheoretic Modeling and Analysis. Cambridge University Press.

Alvin E. Roth and John H. Vande Vate. 1990. Random Paths to Stability in Two-Sided Matching. Econometrica 58, 6 (1990), 1475–1480.

Aviad Rubinstein. 2016. Settling the complexity of computing approximate two-player Nash equi- libria. In Proc. 57th IEEE Symp. on Foundations of Computer Science. 258–265.

Aviad Rubinstein. 2018. Inapproximability of Nash Equilibrium. SIAM J. Comput. 47, 3 (2018), 917–959.

Ilya Segal. 2007. The Communication Requirements of Social Choice Rules and Supporting Budget Sets. Journal of Economic Theory 136 (2007), 341–378.

Larry A. Shepp. 1982. The XYZ Conjecture and the FKG Inequality. The Annals of Probability 10, 3 (1982), 824–827.

Richard P. Stanley. 1986. Two Poset Polytopes. Discrete Computational Geometry 1 (1986), 9–23.

Akihisa Tamura. 1993. Transformation from arbitrary matchings to stable matchings. Journal of Combinatorial Theory, Series A 62, 2 (1993), 310–323.

Yang Yu. 1998. On Proportional Transitivity of Ordered Sets. Order 15 (1998), 87–95.

A Inapplicability of the Binary Search in Graphs Framework

In Emamjomeh-Zadeh and Kempe (2017), a framework was proposed for analyzing the number of rounds nece