28:["$","$L31",null,{"isWhiteLabelled":false,"children":["$","$Lc",null,{"pt":{"compact":0,"expanded":3},"children":[["$","$L32",null,{"noStar":true,"publisher":true,"task":true,"params":true,"size":"xl","product":{"id":"eyJwYXBlcklEIjoiMjQwNi4wNDI1MCIsInB1Ymxpc2hlciI6ImFyeGl2In0=","publisher":"arxiv","updated":"2024-06-06T16:54:20.000Z","paperID":"2406.04250","published":"2024-06-06T16:54:20.000Z","authors":"[\"Asad Raza\",\"Matthias C. Caro\",\"Jens Eisert\",\"Sumeet Khatri\"]","title":"Online learning of quantum processes","scoreTrending":null,"summary":"$33","lastCheckedForCode":"2024-06-08T10:46:53.820Z","links":[{"id":"eyJ1cmwiOiJodHRwczovL3BhcGVyc3dpdGhjb2RlLmNvbS9wYXBlci9vbmxpbmUtbGVhcm5pbmctb2YtcXVhbnR1bS1wcm9jZXNzZXMifQ==","type":"pwc","url":"https://paperswithcode.com/paper/online-learning-of-quantum-processes","data":"{\"date\":\"2024-11-28T22:18:25.114Z\"}"}],"reposConnection":{"edges":[]},"models":[],"tags":[],"summaries":[],"emailsConnection":{"edges":[]},"__typename":"paper","authorArray":["Asad Raza","Matthias C. 
Caro","Jens Eisert","Sumeet Khatri"]}}],["$","$L25",null,{"container":true,"columns":100,"spacing":{"compact":0,"expanded":2,"large":3},"children":[["$","$L25",null,{"size":{"compact":100,"expanded":100,"large":68},"children":[["$","$8",null,{"children":["$","$L34",null,{"publisher":"arxiv","paperID":"2406.04250","product":{"paper":"$28:props:children:props:children:0:props:product","models":"$28:props:children:props:children:0:props:product:models"},"isWhiteLabelled":false}]}],["$","$8",null,{"children":["$","$L35",null,{"article":"$L36","model":"$undefined"}]}]]}],["$","$L25",null,{"size":"grow","children":["$","$L37",null,{}]}]]}],["$","$8",null,{"children":null}],[["$","audio",null,{"id":"tts"}],["$","$L38",null,{"paperID":"2406.04250","publisher":"arxiv","paperJSON":{"title":"Online learning of quantum processes","paperID":"2406.04250","avgLineHeight":13.59,"imgScale":4,"sections":[{"heading":"Abstract","paragraphs":[[{"text":"$39","element":"span"}]]},{"heading":"1 Introduction","paragraphs":[[{"text":"Learning about quantum systems and their evolution over time is a task of fundamental importance in quantum physics. “Learning”, broadly speaking, refers to the extraction of useful classical information from quantum-mechanical systems and their evolution through experiments. 
Such learning tasks first appeared in quantum information in the form of quantum state and process tomography, in which the aim is to extract all classical information about the system, in terms of the density matrix of the system in the case of quantum state tomography [","element":"span"},{"href":"#id-0","referenceIndex":1,"text":"1","element":"a"},{"text":"–","element":"span"},{"href":"#id-1","referenceIndex":6,"text":"6","element":"a"},{"text":"], and the transfer matrix in the case of quantum process tomography [","element":"span"},{"href":"#id-2","referenceIndex":7,"text":"7","element":"a"},{"text":"–","element":"span"},{"href":"#id-3","referenceIndex":9,"text":"9","element":"a"},{"text":"]. These tomographic tasks, with their strict measures of performance in terms of worst-case distance measures such as the trace and diamond","element":"span"}],[{"id":"id-15","style":{"width":"70%"},"width":1315,"height":789,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-0.png","element":"img"}],[{"text":"Figure 1: ","element":"figcaption","subtype":"caption"},{"style":{"fontWeight":"bold"},"text":"Learning of quantum processes. 
","element":"figcaption","subtype":"caption"},{"text":"(a) To learn about the unknown evolution of a quantum system (symbolized by the blue shaded region and represented mathematically by the quantum channel ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"N","element":"figcaption","subtype":"caption"},{"text":"), we prepare a probe quantum state ","element":"figcaption","subtype":"caption"},{"style":{"height":10},"width":21,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-1.png","element":"img","alt":" ρ","inline":true},{"text":", let it evolve, and then measure it according to the POVM ","element":"figcaption","subtype":"caption"},{"style":{"height":19.94},"width":212.34,"height":49.85,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-2.png","element":"img","alt":" {M, 1 − M}","inline":true},{"text":". We encapsulate this process in the circuit diagram shown in (b). (c) More generally, we can prepare an entangled probe state of two systems, let only one of them evolve, and then jointly measure both systems. Our results apply to this more general class of tests, and also more generally to classes of multi-time quantum processes, in which the unknown evolution could be non-Markovian.","element":"figcaption","subtype":"caption"}],[{"text":"norm, require resources scaling exponentially with the system size [","element":"span"},{"href":"#id-4","referenceIndex":10,"text":"10","element":"a"},{"text":"–","element":"span"},{"href":"#id-5","referenceIndex":14,"text":"14","element":"a"},{"text":"]. Consequently, recent years have seen a growing interest in less strict variants of state and process learning, defined by relaxing the requirement of extracting full classical information about the objects of interest and instead requiring that we learn only the values of certain observables of our state or process. 
In the case of quantum states, notable learning tasks of this kind include “pretty good tomography” in the spirit of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"probably approximately correct ","element":"span"},{"text":"(PAC) learning [","element":"span"},{"href":"#id-6","referenceIndex":15,"text":"15","element":"a"},{"text":"], shadow tomography [","element":"span"},{"href":"#id-7","referenceIndex":16,"text":"16","element":"a"},{"text":"–","element":"span"},{"href":"#id-8","referenceIndex":19,"text":"19","element":"a"},{"text":"], online learning [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":", ","element":"span"},{"href":"#id-10","referenceIndex":21,"text":"21","element":"a"},{"text":"], and classical shadows [","element":"span"},{"href":"#id-11","referenceIndex":22,"text":"22","element":"a"},{"text":"–","element":"span"},{"href":"#id-12","referenceIndex":24,"text":"24","element":"a"},{"text":"]. Inspired by this progress in understanding state learning, new perspectives on quantum channel learning have also been explored [","element":"span"},{"href":"#id-13","referenceIndex":25,"text":"25","element":"a"},{"text":"–","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":"].","element":"span"}],[{"text":"We consider the following channel learning task (see also Figure ","element":"span"},{"href":"#id-15","text":"1","element":"a"},{"text":"). Let ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"be an unknown quantum channel, describing the evolution of a quantum system. In order to learn the behavior of the channel, we can prepare our system in a state of our choice, let it evolve according to the channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N","element":"span"},{"text":", and then measure the output. 
The input state and measurement constitute a “test” for the quantum channel, and the probability of “passing” the test is given by Tr[","element":"span"},{"style":{"height":18.4},"width":129.63,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-3.png","element":"img","alt":"MN(ρ","inline":true},{"text":")], where ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-4.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"is the input state and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"M ","element":"span"},{"text":"is the measurement operator corresponding to passing the test. Tests of this type are ubiquitous in real-world physical setups used to learn the dynamics of quantum systems, see, e.g., Refs. [","element":"span"},{"href":"#id-2","referenceIndex":7,"text":"7","element":"a"},{"text":", ","element":"span"},{"href":"#id-3","referenceIndex":9,"text":"9","element":"a"},{"text":", ","element":"span"},{"href":"#id-16","referenceIndex":33,"text":"33","element":"a"},{"text":"–","element":"span"},{"href":"#id-17","referenceIndex":35,"text":"35","element":"a"},{"text":"]. 
Our task is to accurately predict values of quantities of the form ","element":"span"},{"style":{"height":18.4},"width":335.08,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-5.png","element":"img","alt":" f(x) = Tr[MN(ρ","inline":true},{"text":")] for test pairs ","element":"span"},{"style":{"height":17.6},"width":192.49,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-6.png","element":"img","alt":" x = (ρ, M","inline":true},{"text":"), and in this sense “learn” the evolution of the system.","element":"span"}],[{"text":"In the typical setting of supervised learning used to achieve our task of interest, there is a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"training phase","element":"span"},{"text":", in which a set ","element":"span"},{"style":{"height":17.6},"width":224.49,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/1-7.png","element":"img","alt":" {(x, f(x))}x","inline":true,"padRight":true},{"text":"of pairs of tests and their passing probabilities is given to ","element":"span"},{"text":"the learner, and the learner uses these tests to form a hypothesis for the unknown channel. This hypothesis should accurately predict passing probabilities on new, unseen tests, typically drawn from the same distribution that generated the tests during training. Learning algorithms in this setting are analyzed within the framework of PAC learning [","element":"span"},{"href":"#id-18","referenceIndex":36,"text":"36","element":"a"},{"text":"]. In this work, we go beyond the setting of PAC learning. 
Instead of the data pairs (","element":"span"},{"style":{"fontStyle":"italic"},"text":"x, f","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"x","element":"span"},{"text":")) being given to the learner in a batch, we suppose that they are given to the learner sequentially, one by one, and perhaps even in an adaptive and adversarial manner. The learner must produce a hypothesis for the unknown channel at every step, using which they estimate the passing probability of the test pair. Upon learning the true passing probability, they can update their hypothesis. The goal now is for the learner to devise a sequence of hypotheses such that, over time, they make few mistakes in their estimates; see Section ","element":"span"},{"href":"#id-19","text":"1.1 ","element":"a"},{"text":"for a more formal description of this setting.","element":"span"}],[{"text":"The framework of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"online learning ","element":"span"},{"text":"[","element":"span"},{"href":"#id-20","referenceIndex":37,"text":"37","element":"a"},{"text":"] (see, e.g., Refs. [","element":"span"},{"href":"#id-21","referenceIndex":38,"text":"38","element":"a"},{"text":", Chapter 21] and [","element":"span"},{"href":"#id-22","referenceIndex":39,"text":"39","element":"a"},{"text":", Chapter 8] for pedagogical introductions) has been developed precisely to address this arguably more realistic learning setting. After all, data will often be processed sequentially, and it only makes sense to update the hypothesis step by step. Indeed, the importance of online learning derives from its ability to describe scenarios in which data is presented to the learner sequentially and adaptively. Thereby, it removes assumptions on the data-generating process, such as the i.i.d. assumption typical in PAC learning [","element":"span"},{"href":"#id-18","referenceIndex":36,"text":"36","element":"a"},{"text":"]. 
As such, it provides a more stringent type of learning compared to PAC learning. In fact, broadly speaking, it has been shown that online learning implies PAC learning [","element":"span"},{"href":"#id-23","referenceIndex":40,"text":"40","element":"a"},{"text":"–","element":"span"},{"href":"#id-24","referenceIndex":44,"text":"44","element":"a"},{"text":"]. Furthermore, if the learning algorithm is allowed unbounded computational resources, then something stronger holds: any concept class of Boolean functions is learnable in the online model if and only if it is also learnable in the (distribution-free) PAC model [","element":"span"},{"href":"#id-25","referenceIndex":45,"text":"45","element":"a"},{"text":"].","element":"span"}],[{"text":"While online learning of quantum states has already been considered [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":", ","element":"span"},{"href":"#id-10","referenceIndex":21,"text":"21","element":"a"},{"text":"], and it has also been lifted to shadow tomography of quantum states with adaptively chosen observables [","element":"span"},{"href":"#id-7","referenceIndex":16,"text":"16","element":"a"},{"text":", ","element":"span"},{"href":"#id-26","referenceIndex":17,"text":"17","element":"a"},{"text":"], the overwhelming majority of results on quantum process learning so far do not allow for accurate predictions based on an adaptive choice of the state-measurement pairs. To fill this gap, we initiate the study of online learning for quantum processes. We first show that general quantum channels cannot be online learned with subexponential regret or number of mistakes. However, a priori knowledge about the complexity or the structure of the unknown channel can make online learning feasible. 
Indeed, we identify two physically relevant classes of channels—efficiently implementable channels and Pauli channels—that can be online learned with regret and mistake bounds scaling polynomially in the system size. We extend these results to classes of more general multi-time processes, in particular quantum processes that are non-Markovian, establishing that they, too, can be online learned with regret and mistake bounds scaling polynomially in the system size.","element":"span"}],[{"id":"id-19","style":{"fontWeight":"bold"},"text":"1.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Statement of the problem","element":"span"}],[{"text":"Consider an interaction between a learner and an adversary. At time step ","element":"span"},{"style":{"height":12.8},"width":106.7,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/2-0.png","element":"img","alt":" t ∈ N","inline":true},{"text":", the adversary picks a state-measurement pair, (","element":"span"},{"style":{"height":19.53},"width":168.02,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/2-1.png","element":"img","alt":"ρ(t), M(t)","inline":true},{"text":"), where ","element":"span"},{"style":{"height":19.53},"width":60.97,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/2-2.png","element":"img","alt":" ρ(t)","inline":true,"padRight":true},{"text":"is an input state and ","element":"span"},{"style":{"height":15.93},"width":85.5,"height":39.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/2-3.png","element":"img","alt":" M(t)","inline":true,"padRight":true},{"text":"is an effect operator of a two-outcome POVM. More generally, we can allow for states and measurements with arbitrary auxiliary systems, as in Figure ","element":"span"},{"href":"#id-15","text":"1","element":"a"},{"text":"(c) (see Section ","element":"span"},{"href":"#id-27","text":"2.4","element":"a"},{"text":"). 
The task for the learner is to predict Tr[","element":"span"},{"style":{"height":20.33},"width":208.17,"height":50.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/2-4.png","element":"img","alt":"M(t)N(ρ(t)","inline":true},{"text":")] for an unknown channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N","element":"span"},{"text":". To do so, the learner produces their own channel hypothesis, ","element":"span"},{"style":{"height":17.13},"width":80.64,"height":42.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/2-5.png","element":"img","alt":" N (t)","inline":true},{"text":", and outputs Tr[","element":"span"},{"style":{"height":20.33},"width":248.56,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/2-6.png","element":"img","alt":"M(t)N (t)(ρ(t)","inline":true},{"text":")] as their prediction. The adversary then provides ","element":"span"},{"text":"the learner with feedback on what would have been the correct expectation value","element":"span"},{"style":{"height":8.4},"width":17,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-0.png","element":"img","alt":"1","inline":true},{"text":". The goal of the learner is to ensure that their output values are not too far from correct in most rounds of the interaction. 
We can quantify the loss suffered by the learner at time step ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t ","element":"span"},{"text":"by the absolute difference between the learner’s estimate of the expectation value and the correct expectation value,","element":"span"}],[{"id":"id-34","style":{"width":"79%"},"width":1499,"height":75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-1.png","element":"img"}],[{"text":"The learner-adversary interaction proceeds for a total of ","element":"span"},{"style":{"height":12.8},"width":271.9,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-2.png","element":"img","alt":" T ∈ N rounds.","inline":true}],[{"text":"Now, how do we measure the learner’s performance over the course of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds of their interaction with the adversary? Note that due to the adversarial nature of the problem, statistical extrapolations are of little use. Indeed, as soon as the learner models the interaction using some probability distribution, the adversary can immediately change their strategy to make the learner fail. One way to gauge the learner’s performance is to compare their total loss at the end of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds of interaction with the loss that they would have incurred if they were allowed to make all predictions at the end of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds, after having seen all of the state-measurement pairs. 
We do this by considering the quantity","element":"span"}],[{"style":{"width":"76%"},"width":1441,"height":121,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-3.png","element":"img"}],[{"text":"where the minimization is over channels ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"from some class of interest. The larger this quantity, the more the learner would lament their choice of hypotheses ","element":"span"},{"style":{"height":17.13},"width":80.64,"height":42.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-4.png","element":"img","alt":" N (t)","inline":true,"padRight":true},{"text":"at the end of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds; hence, this quantity is called ","element":"span"},{"style":{"fontStyle":"italic"},"text":"regret","element":"span"},{"text":". It is well-known [","element":"span"},{"href":"#id-28","referenceIndex":46,"text":"46","element":"a"},{"text":"] that any online learner suffers Ω(","element":"span"},{"style":{"height":17.6},"width":67.36,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-5.png","element":"img","alt":"√T","inline":true},{"text":") regret in general. 
We aim for online learners that saturate this lower bound and achieve a regret scaling as ","element":"span"},{"style":{"height":19.21},"width":547.79,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-6.png","element":"img","alt":"O(√T poly(log D)), where D","inline":true,"padRight":true},{"text":"is the dimension of the quantum system acted on by the channel.","element":"span"}],[{"text":"Another intuitive way to evaluate whether the learner’s performance is “good” is in terms of the number of rounds ","element":"span"},{"style":{"height":17.2},"width":112.35,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-7.png","element":"img","alt":" t ∈ [T","inline":true},{"text":"] in which the learner makes a mistake. By a “mistake”, we mean that the learner’s estimate of the expectation value is more than a given accuracy ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-8.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"away (in absolute-value distance) from the correct expectation value revealed by the adversary. In other words, the learner should minimize the number of rounds in which ","element":"span"},{"style":{"height":20.34},"width":403.65,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-9.png","element":"img","alt":" ℓ(N (t), ρ(t), M(t)) > ε","inline":true},{"text":". 
More formally, we say that the learner makes an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-10.png","element":"img","alt":" ε","inline":true},{"style":{"fontStyle":"italic"},"text":"-mistake ","element":"span"},{"text":"in round ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t ","element":"span"},{"text":"if ","element":"span"},{"style":{"height":20.34},"width":407.45,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-11.png","element":"img","alt":" ℓ(N (t), ρ(t), M(t)) > ε","inline":true},{"text":". Viewed this way, the goal in online learning is to upper bound the number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-12.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes for any number ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"of rounds and any adversarial/adaptive choice of state-measurement pairs presented to the learner. This is the so-called mistake-bounded model of Littlestone [","element":"span"},{"href":"#id-20","referenceIndex":37,"text":"37","element":"a"},{"text":"]. 
In our work, we are more specifically interested in online learners that incur a mistake bound scaling logarithmically with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D ","element":"span"},{"text":"and inverse polynomially in the accuracy ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-13.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"with, ideally, a low-degree polynomial.","element":"span"}],[{"id":"id-49","style":{"fontWeight":"bold"},"text":"1.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Overview of the main results","element":"span"}],[{"text":"For the task of online learning arbitrary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit quantum states, it was shown in Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] that there exist procedures that make at most linearly-in-","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"many mistakes. 
(Here, for ease of presentation, we consider ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-14.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"to be a constant, say 1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"3, and then simply speak of a mistake instead of an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/3-15.png","element":"img","alt":" ε","inline":true},{"text":"-mistake.)","element":"span"}],[{"id":"id-32","style":{"width":"88%"},"width":1650,"height":760,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/4-0.png","element":"img"}],[{"text":"Table 1: ","element":"figcaption","subtype":"caption"},{"style":{"fontWeight":"bold"},"text":"Overview of our main results. ","element":"figcaption","subtype":"caption"},{"text":"For channels of gate complexity ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"G ","element":"figcaption","subtype":"caption"},{"text":"as well as for ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"n","element":"figcaption","subtype":"caption"},{"text":"-qubit Pauli channels, we give regret and mistake upper bounds that scale favorably with ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"G ","element":"figcaption","subtype":"caption"},{"text":"and ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"n","element":"figcaption","subtype":"caption"},{"text":", respectively. Our bounds hold for general loss functions with Lipschitz constant ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"L","element":"figcaption","subtype":"caption"},{"text":". 
Additionally, complementary mistake lower bounds show that the dependencies on ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"G ","element":"figcaption","subtype":"caption"},{"text":"and ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"n ","element":"figcaption","subtype":"caption"},{"text":"are almost optimal. Finally, we give computational complexity lower bounds for online learning either of the two classes of channels with polynomially many mistakes.","element":"figcaption","subtype":"caption"}],[{"text":"In terms of channel learning, this implies the same mistake bound for online learning arbitrary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit state-preparation channels. However, as pointed out in Ref. [","element":"span"},{"href":"#id-6","referenceIndex":15,"text":"15","element":"a"},{"text":", Footnote 18] and as we formalize further in Section ","element":"span"},{"href":"#id-29","text":"4.1","element":"a"},{"text":", if the underlying concept class consists of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit unitary channels, then any (even computationally unbounded) online channel learner can be forced to make exponentially-in-","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"many mistakes. Thus, in contrast to the case of quantum states, we have to consider restricted classes of channels to achieve online channel learning with a polynomial-in-","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/4-1.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes. In fact, it was left open in Ref. 
[","element":"span"},{"href":"#id-6","referenceIndex":15,"text":"15","element":"a"},{"text":"] to find restricted classes of quantum channels for which this goal, which is an online version of “pretty good process tomography”, can be realized. While recent years have seen some progress on the batch version of this task, see, e.g., Refs. [","element":"span"},{"href":"#id-30","referenceIndex":47,"text":"47","element":"a"},{"text":"–","element":"span"},{"href":"#id-31","referenceIndex":50,"text":"50","element":"a"},{"text":"], the online case remains open.","element":"span"}],[{"text":"In this work, we answer the question of pretty good process tomography within the online learning framework for the following two concrete classes of channels (see Table ","element":"span"},{"href":"#id-32","text":"1 ","element":"a"},{"text":"for a summary of our results):","element":"span"}],[{"text":"1. Channels that can be implemented by dissipative quantum circuits with a limited number of local gates. Channels in this class can be regarded as having limited complexity.","element":"span"}],[{"text":"2. Pauli channels or, more generally, mixtures of a fixed set of (potentially exponentially many) ","element":"span"},{"style":{"fontStyle":"italic"},"text":"known ","element":"span"},{"text":"channels, each of which could have arbitrarily high gate complexity. This is a structural assumption on the channel.","element":"span"}],[{"text":"First, we show that there exists an online learner whose regret and number of mistakes can be ","element":"span"},{"id":"id-33","text":"controlled in terms of the gate complexity of the class of channels to be learned.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Theorem 1 ","element":"span"},{"text":"(Online learning channels of bounded complexity—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"The class of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels that can be implemented by circuits consisting of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"arbitrary two-qubit channels can be online learned with regret bound ","element":"span"},{"style":{"height":28.8},"width":574.36,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-0.png","element":"img","alt":" O��TG log(Gn)�and with ε","inline":true},{"text":"-mistake bound ","element":"span"},{"style":{"height":28.8},"width":261.37,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-1.png","element":"img","alt":" O� G log(Gn)ε2 �.","inline":true}],[{"text":"Theorem ","element":"span"},{"href":"#id-33","text":"1 ","element":"a"},{"text":"extends beyond the absolute-value loss function from Equation (","element":"span"},{"href":"#id-34","text":"1.1","element":"a"},{"text":") to more general loss functions, in which case our bounds depend also on their Lipschitz constant ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":". General channels require exponentially many gates to be implemented by a 2-local circuit, so Theorem ","element":"span"},{"href":"#id-33","text":"1 ","element":"a"},{"text":"does not provide useful guarantees in this case. However, if we focus on the physically relevant class of channels with a polynomial gate complexity, then Theorem ","element":"span"},{"href":"#id-33","text":"1 ","element":"a"},{"text":"gives polynomial regret and mistake bounds. 
Additionally, we show with an ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(1)-mistake lower bound of Ω(min","element":"span"},{"style":{"height":19.94},"width":178.44,"height":49.86,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-2.png","element":"img","alt":"{2n,√G}","inline":true},{"text":") for general ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"and of Ω(","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":") for ","element":"span"},{"style":{"height":14.4},"width":119.04,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-3.png","element":"img","alt":" G ≤ n","inline":true,"padRight":true},{"text":"that the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-dependence in the online learning guarantees of Theorem ","element":"span"},{"href":"#id-33","text":"1 ","element":"a"},{"text":"cannot be significantly improved (Corollary ","element":"span"},{"href":"#id-35","text":"37","element":"a"},{"text":").","element":"span"}],[{"text":"Second, we prove regret and mistake bounds for Pauli channel online learning that scale efficiently in the number of qubits. Pauli channels play an important role in quantum information theory, specifically in the field of quantum computation, where Pauli channel noise either naturally emerges or is achievable via group twirls [","element":"span"},{"href":"#id-36","referenceIndex":51,"text":"51","element":"a"},{"text":"]. Thus, the following result establishes adaptive/online learnability of an important class of quantum noise channels.","element":"span"}],[{"id":"id-38","style":{"fontWeight":"bold"},"text":"Theorem 2 ","element":"span"},{"text":"(Online learning Pauli channels—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"The class of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channels can be online learned with regret bound ","element":"span"},{"style":{"height":28.8},"width":404.72,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-4.png","element":"img","alt":" O�√Tn�and with ε","inline":true},{"text":"-mistake bound ","element":"span"},{"style":{"height":28.8},"width":144.64,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-5.png","element":"img","alt":" O�nε2�.","inline":true}],[{"text":"The linear-in-","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"scaling in the ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-6.png","element":"img","alt":" ε","inline":true},{"text":"-mistake bound is optimal for constant ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/5-7.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"(Corollary ","element":"span"},{"href":"#id-37","text":"39","element":"a"},{"text":"). Moreover, establishing a connection to important notions from classical learning theory, we demonstrate that Theorem ","element":"span"},{"href":"#id-38","text":"2 ","element":"a"},{"text":"yields bounds on the (sequential) fat-shattering dimension of Pauli channels, and gives rise to a sample compression scheme for Pauli channels (Section ","element":"span"},{"href":"#id-39","text":"3.4","element":"a"},{"text":").","element":"span"}],[{"text":"Theorems ","element":"span"},{"href":"#id-33","text":"1 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-38","text":"2 ","element":"a"},{"text":"give favorably scaling regret and mistake bounds. 
However, the respective online learning procedures are computationally inefficient. We show that, under reasonable cryptographic assumptions, this is unavoidable when aiming for good regret and mistake bounds in these online learning problems:","element":"span"}],[{"id":"id-40","style":{"fontWeight":"bold"},"text":"Theorem 3 ","element":"span"},{"text":"(Computational lower bounds for online learning—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"On the one hand, any online learner that makes at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(poly(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")) many (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"3)-mistakes in online learning ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channels has to use runtime exponential in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":". 
On the other hand, assuming that ","element":"span"},{"text":"RingLWE ","element":"span"},{"text":"cannot be solved by classical polynomial-time algorithms, already for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"polylog(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")) there is no polynomial-time online learner that makes at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(poly(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")) many (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"3)-mistakes in online learning ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-gate ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit quantum channels.","element":"span"}],[{"text":"While the computational inefficiency of our online learning procedures is of course undesirable, Theorem ","element":"span"},{"href":"#id-40","text":"3 ","element":"a"},{"text":"shows that this is not a flaw of our specific procedures. Rather, it is simply not possible to computationally efficiently learn Pauli channels with a good mistake bound. 
Moreover, since ","element":"span"},{"text":"RingLWE ","element":"span"},{"text":"is widely believed to be hard [","element":"span"},{"href":"#id-41","referenceIndex":52,"text":"52","element":"a"},{"text":"–","element":"span"},{"href":"#id-42","referenceIndex":54,"text":"54","element":"a"},{"text":"], as soon as ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"scales only slightly superlinearly in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":", we also do not expect there to be any computationally efficient online learners that achieve a good regret for the class of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-gate channels.","element":"span"}],[{"text":"A problem related to online learning quantum processes is ","element":"span"},{"style":{"fontStyle":"italic"},"text":"shadow tomography ","element":"span"},{"text":"of quantum processes. By analogy with shadow tomography of quantum states [","element":"span"},{"href":"#id-7","referenceIndex":16,"text":"16","element":"a"},{"text":", ","element":"span"},{"href":"#id-26","referenceIndex":17,"text":"17","element":"a"},{"text":"], shadow tomography of quantum processes is a stricter form of learning than mistake-bounded online learning: while in the online learning task the number of mistakes should be bounded, in shadow tomography one has to correctly estimate (i.e., with error at most ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-0.png","element":"img","alt":" ε","inline":true},{"text":") the expectation values of all the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"M ","element":"span"},{"text":"observables provided, with probability at least 1 
","element":"span"},{"style":{"height":12.8},"width":63.66,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-1.png","element":"img","alt":" − δ","inline":true},{"text":". (See Problem ","element":"span"},{"href":"#id-43","text":"3 ","element":"a"},{"text":"for a formal statement of the problem.) Furthermore, in shadow tomography, only quantum access to the channel is provided, while classical descriptions of tests and their passing probabilities (with respect to the unknown channel) are provided in online learning. The main observation underlying our proof of Theorem ","element":"span"},{"href":"#id-38","text":"2 ","element":"a"},{"text":"implies that techniques from classical adaptive data analysis [","element":"span"},{"href":"#id-44","referenceIndex":55,"text":"55","element":"a"},{"text":", ","element":"span"},{"href":"#id-45","referenceIndex":56,"text":"56","element":"a"},{"text":"] directly carry over to Pauli channel shadow tomography. In particular, we obtain the following result.","element":"span"}],[{"id":"id-46","style":{"fontWeight":"bold"},"text":"Theorem 4 ","element":"span"},{"text":"(Pauli channel shadow tomography—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Shadow tomography of an arbitrary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channel can be solved using","element":"span"}],[{"id":"id-48","style":{"width":"67%"},"width":1273,"height":120,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-2.png","element":"img"}],[{"text":"copies of the channel. 
The strategy runs in time poly(4","element":"span"},{"style":{"height":15.2},"width":64.96,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-3.png","element":"img","alt":"n, k","inline":true},{"text":") per channel use.","element":"span"}],[{"text":"We leverage Theorem ","element":"span"},{"href":"#id-46","text":"4 ","element":"a"},{"text":"to make a more general statement about shadow tomography of arbitrary channels. In particular, we show in Corollary ","element":"span"},{"href":"#id-47","text":"49 ","element":"a"},{"text":"that we can solve the shadow tomography problem for an arbitrary quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"with a number of copies scaling as in (","element":"span"},{"href":"#id-48","text":"1.3","element":"a"},{"text":"), with ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-4.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"therein replaced by ","element":"span"},{"style":{"height":21.29},"width":1004.69,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-5.png","element":"img","alt":" ε − 12∥N − N P∥⋄, for all ε > 12∥N − N P∥⋄. Here N P","inline":true,"padRight":true},{"text":"is the Pauli-twirled version of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N","element":"span"},{"text":".","element":"span"}],[{"id":"id-96","style":{"fontWeight":"bold"},"text":"1.3 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Extensions of our results","element":"span"}],[{"text":"Many of the techniques underlying our main results in Section ","element":"span"},{"href":"#id-49","text":"1.2 ","element":"a"},{"text":"can be applied to more general settings, going beyond ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-gate channels and Pauli channels. 
We now summarize these extensions.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Convex mixtures of known channels. ","element":"span"},{"text":"While Theorem ","element":"span"},{"href":"#id-38","text":"2 ","element":"a"},{"text":"is phrased for Pauli channels, we in fact show the following more general statement: If we consider a class of channels that can be written as probabilistic mixtures of a fixed set of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"known quantum channels, then we can achieve online learning for this class with regret and number of mistakes bounded in terms of log(","element":"span"},{"style":{"fontStyle":"italic"},"text":"K","element":"span"},{"text":"). Notably, these bounds apply even if channels with high circuit complexity occur in the mixtures.","element":"span"}],[{"id":"id-53","style":{"fontWeight":"bold"},"text":"Theorem 5 ","element":"span"},{"text":"(Online learning convex mixtures of known channels—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Given an arbitrary set of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K > ","element":"span"},{"text":"0 known and fixed quantum channels, any convex mixture of these channels can be online learned with regret bound ","element":"span"},{"style":{"height":19.2},"width":379.12,"height":48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-6.png","element":"img","alt":" O(√(T log K)) and ε","inline":true},{"text":"-mistake bound ","element":"span"},{"style":{"height":24.5},"width":192.26,"height":61.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/6-7.png","element":"img","alt":" O(log(K)/ε^2).","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Adaptive tests of channels. 
","element":"span"},{"text":"We may also want to learn the passing probabilities of more general channel tests that make use of the channel multiple times, perhaps adaptively. Such tests have the form shown in Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"(a). We can directly import results about convex mixtures of known","element":"span"}],[{"id":"id-50","style":{"width":"80%"},"width":1514,"height":696,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-0.png","element":"img"}],[{"text":"Figure 2: ","element":"figcaption","subtype":"caption"},{"style":{"fontWeight":"bold"},"text":"Extensions of our results to general quantum processes. ","element":"figcaption","subtype":"caption"},{"text":"(a) Going beyond one use of a channel ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"N","element":"figcaption","subtype":"caption"},{"text":", as shown in Figure ","element":"figcaption","subtype":"caption"},{"href":"#id-15","text":"1","element":"a","subtype":"caption"},{"text":", we may want to learn the value of the channel on tests that make multiple, adaptive uses of the channel. Shown are three independent uses of ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"N","element":"figcaption","subtype":"caption"},{"text":", whose Choi representation is ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"C","element":"figcaption","subtype":"caption"},{"text":"(","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"N","element":"figcaption","subtype":"caption"},{"text":"). 
(b) We can similarly perform adaptive tests of a non-Markovian process ","element":"figcaption","subtype":"caption"},{"style":{"height":14.98},"width":72.45,"height":37.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-1.png","element":"img","alt":" N [3]","inline":true},{"text":", characterized by the blue quantum comb, with Choi representation ","element":"figcaption","subtype":"caption"},{"style":{"height":18.19},"width":118.97,"height":45.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-2.png","element":"img","alt":"C(N [3]","inline":true},{"text":"). The generalized Born rule [","element":"figcaption","subtype":"caption"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a","subtype":"caption"},{"text":"] tells us that the outcome probabilities of measurements, or “tests”, of quantum channels and multi-time quantum processes can be determined by an analogue of the usual Born rule for quantum states, in which the Choi representation takes the place of the quantum state, and the test is characterized by operators ","element":"figcaption","subtype":"caption"},{"style":{"height":14.18},"width":67.44,"height":35.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-3.png","element":"img","alt":" E(i) ","inline":true,"padRight":true},{"text":"that are generalizations of effect operators for quantum states. (See Section ","element":"figcaption","subtype":"caption"},{"href":"#id-52","text":"2.1 ","element":"a","subtype":"caption"},{"text":"for details.)","element":"figcaption","subtype":"caption"}],[{"text":"channels to this setting. 
Indeed, given a channel ","element":"span"},{"style":{"height":23.29},"width":300.17,"height":58.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-4.png","element":"img","alt":" N = Σ_{j=1}^K p_j N_j","inline":true,"padRight":true},{"text":"that is a convex mixture of known ","element":"span"},{"text":"channels ","element":"span"},{"style":{"height":19.42},"width":50.8,"height":48.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-5.png","element":"img","alt":" N_j","inline":true},{"text":", it holds that","element":"span"}],[{"style":{"width":"77%"},"width":1445,"height":128,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-6.png","element":"img"}],[{"text":"This is itself a convex mixture of ","element":"span"},{"style":{"height":15.14},"width":58.18,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-7.png","element":"img","alt":" K^k","inline":true,"padRight":true},{"text":"known channels, so Theorem ","element":"span"},{"href":"#id-53","text":"5 ","element":"a"},{"text":"applies and yields regret and mistake bounds scaling with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"log(","element":"span"},{"style":{"fontStyle":"italic"},"text":"K","element":"span"},{"text":"). We note, however, that Theorem ","element":"span"},{"href":"#id-53","text":"5 ","element":"a"},{"text":"in this scenario will generally not give a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"proper ","element":"span"},{"text":"online learner. 
Namely, the learner’s hypotheses will be (","element":"span"},{"style":{"fontStyle":"italic"},"text":"nk","element":"span"},{"text":")-qubit channels given by convex combinations of ","element":"span"},{"style":{"height":22.58},"width":608.14,"height":56.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-8.png","element":"img","alt":" {N_{j_1} ⊗ N_{j_2} ⊗ · · · ⊗ N_{j_k}}_{j_1,...,j_k=1}^K","inline":true},{"text":", which in general cannot ","element":"span"},{"text":"be factorized into ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"copies of a single ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channel.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Non-Markovian quantum processes. ","element":"span"},{"text":"At the heart of the quantities that we aim to estimate is Born’s rule [","element":"span"},{"href":"#id-54","referenceIndex":58,"text":"58","element":"a"},{"text":"], which tells us that the expected value of an observable (Hermitian operator) ","element":"span"},{"style":{"fontStyle":"italic"},"text":"H ","element":"span"},{"text":"for a quantum state ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-9.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"is given by Tr[","element":"span"},{"style":{"height":15.6},"width":62.8,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-10.png","element":"img","alt":"Hρ","inline":true},{"text":"]. 
The Born rule generalizes not only to quantum channels but also to multi-time/non-Markovian quantum processes [","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":", ","element":"span"},{"href":"#id-55","referenceIndex":59,"text":"59","element":"a"},{"text":"], where we model multi-time processes mathematically as “quantum combs” [","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":"], also called “quantum strategies” [","element":"span"},{"href":"#id-56","referenceIndex":60,"text":"60","element":"a"},{"text":"]; see also Refs. [","element":"span"},{"href":"#id-57","referenceIndex":61,"text":"61","element":"a"},{"text":"–","element":"span"},{"href":"#id-58","referenceIndex":63,"text":"63","element":"a"},{"text":"]. Within the framework of quantum combs, the Choi representation of the process takes the place of the state ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/7-11.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"and the observable ","element":"span"},{"style":{"fontStyle":"italic"},"text":"H ","element":"span"},{"text":"is replaced by a generalized “process observable” ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"; see Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"(b). We provide formal definitions of quantum combs and process observables in Section ","element":"span"},{"href":"#id-52","text":"2.1","element":"a"},{"text":". Consequently, we can readily translate our results on bounded complexity channels and ","element":"span"},{"text":"convex mixtures of known channels to bounded complexity non-Markovian processes and convex mixtures of known non-Markovian processes. 
While full tomography of multi-time/non-Markovian processes has been considered [","element":"span"},{"href":"#id-59","referenceIndex":64,"text":"64","element":"a"},{"text":"], to the best of our knowledge, the restricted tomographic setting that we consider here has so far not been considered for multi-time quantum processes.","element":"span"}],[{"text":"We start by considering multi-time processes of bounded complexity. We formally define these processes by analogy with quantum channels of bounded gate complexity in Section ","element":"span"},{"href":"#id-60","text":"3.3","element":"a"},{"text":". We can then extend Theorem ","element":"span"},{"href":"#id-33","text":"1 ","element":"a"},{"text":"as follows.","element":"span"}],[{"id":"id-87","style":{"fontWeight":"bold"},"text":"Theorem 6 ","element":"span"},{"text":"(Online learning multi-time processes of bounded complexity—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"The class of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit multi-time processes with complexity parameter ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"can be online learned with regret bound","element":"span"}],[{"style":{"width":"60%"},"width":1132,"height":66,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-0.png","element":"img"}],[{"text":"We also extend Theorem ","element":"span"},{"href":"#id-53","text":"5 ","element":"a"},{"text":"to an online learning result for convex mixtures of arbitrary known multi-time processes.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Theorem 7 ","element":"span"},{"text":"(Online learning of convex mixtures of known multi-time processes—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Given an arbitrary set of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K > ","element":"span"},{"text":"0 known and fixed quantum multi-time processes, any convex mixture of these processes can be online learned with regret bound ","element":"span"},{"style":{"height":19.2},"width":258.46,"height":48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-1.png","element":"img","alt":" O�√T log K�","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-2.png","element":"img","alt":" ε","inline":true},{"text":"-mistake bound ","element":"span"},{"style":{"height":24.49},"width":192.27,"height":61.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-3.png","element":"img","alt":"O(log(K)ε2 ).","inline":true}],[{"text":"Finally, we extend the shadow tomography result of Theorem ","element":"span"},{"href":"#id-46","text":"4 ","element":"a"},{"text":"to arbitrary multi-time processes. Here, the shadow tomography problem for multi-time processes is defined analogously as for channels; we refer to Problem ","element":"span"},{"href":"#id-61","text":"4 ","element":"a"},{"text":"for the formal problem statement.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Theorem 8 ","element":"span"},{"text":"(Shadow tomography of multi-time processes—informal)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Shadow tomography of an arbitrary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit multi-time process with ","element":"span"},{"style":{"height":17.6},"width":260.55,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-4.png","element":"img","alt":" r ∈ {1, 2, . . . 
}","inline":true,"padRight":true},{"text":"time steps can be solved to accuracy","element":"span"}],[{"style":{"width":"99%"},"width":1872,"height":206,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-5.png","element":"img"}],[{"text":"copies of the process, where ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"is the Choi representation of the process, ","element":"span"},{"style":{"height":15.13},"width":61.82,"height":37.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-6.png","element":"img","alt":" NP ","inline":true,"padRight":true},{"text":"is the Choi representation of the Pauli-twirled version of the process, and ","element":"span"},{"style":{"height":17.6},"width":87.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/8-7.png","element":"img","alt":" ∥·∥⋄r","inline":true,"padRight":true},{"text":"is the strategy ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-norm.","element":"span"}],[{"id":"id-97","style":{"fontWeight":"bold"},"text":"1.4 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Related work","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Online learning of quantum states. ","element":"span"},{"text":"Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] introduced the problem of online learning quantum states and proposed three conceptually different approaches that all achieve mistake bounds scaling linearly in the system size. Recently, Ref. [","element":"span"},{"href":"#id-10","referenceIndex":21,"text":"21","element":"a"},{"text":"] investigated an adaptive variant of this online learning problem, in which the underlying state may change over time. 
A notable application of online state learning is its use as a subroutine in recent shadow tomography protocols [","element":"span"},{"href":"#id-7","referenceIndex":16,"text":"16","element":"a"},{"text":", ","element":"span"},{"href":"#id-26","referenceIndex":17,"text":"17","element":"a"},{"text":", ","element":"span"},{"href":"#id-62","referenceIndex":65,"text":"65","element":"a"},{"text":"].","element":"span"}],[{"text":"A natural question to ask is whether we can simply apply known results on online learning quantum states to the Choi state of the unknown channel to accomplish the channel online learning task laid out above. ","element":"span"},{"text":"This is certainly possible; however, we aim to predict quantities of the form Tr[","element":"span"},{"style":{"height":18.4},"width":260.48,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-0.png","element":"img","alt":"M_B N_{A→B}(ρ_A","inline":true},{"text":")]. And while Choi states are efficiently (online) learnable, the so-called “Choi-to-channel” translation incurs a dimension factor:","element":"span"}],[{"id":"id-116","style":{"width":"82%"},"width":1537,"height":61,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-1.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":24.06},"width":91.97,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-2.png","element":"img","alt":" C^N_{A,B} ","inline":true,"padRight":true},{"text":"is the Choi ","element":"span"},{"style":{"fontStyle":"italic"},"text":"matrix ","element":"span"},{"text":"of ","element":"span"},{"style":{"height":24.5},"width":364.98,"height":61.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-3.png","element":"img","alt":" N, Φ^N_{A,B} = (1/d_A) C^N_{A,B} ","inline":true,"padRight":true},{"text":"is the Choi ","element":"span"},{"style":{"fontStyle":"italic"},"text":"state ","element":"span"},{"text":"of 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"N","element":"span"},{"text":", and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B ","element":"span"},{"text":"are the ","element":"span"},{"text":"input and output systems, respectively, of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N","element":"span"},{"text":"; see Section ","element":"span"},{"href":"#id-52","text":"2.1 ","element":"a"},{"text":"for formal definitions. Because of the dimension factor in the right-most equality above, we have","element":"span"}],[{"style":{"width":"83%"},"width":1573,"height":61,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-4.png","element":"img"}],[{"text":"Consequently, good regret and mistake bounds for online learning the Choi state do not give rise to good regret and mistake bounds for online learning the channel. In Section ","element":"span"},{"href":"#id-63","text":"2.5","element":"a"},{"text":", we discuss this and further issues arising when merely applying quantum state learning algorithms from Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] (such as the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"matrix multiplicative weights ","element":"span"},{"text":"(MMW) algorithm) to the Choi state for channel online learning. In particular, noting that the MMW algorithm for online learning quantum states cannot be directly applied to the learning of Choi states (due to the fact that Choi states have an additional partial trace requirement), we provide a projected MMW algorithm that can be used to learn the Choi state, along with the associated regret bound analysis.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Learning channels of bounded complexity. 
","element":"span"},{"text":"With general quantum channels being impossible to learn in many scenarios, recent work has investigated the learnability of channels with bounded gate complexity in different settings. In variational quantum machine learning, a variety of works derived bounds on learning-theoretic complexity measures, and hence the sample complexity sufficient for good PAC generalization bounds, in terms of the number of gates [","element":"span"},{"href":"#id-64","referenceIndex":48,"text":"48","element":"a"},{"text":"–","element":"span"},{"href":"#id-31","referenceIndex":50,"text":"50","element":"a"},{"text":", ","element":"span"},{"href":"#id-65","referenceIndex":66,"text":"66","element":"a"},{"text":"–","element":"span"},{"href":"#id-66","referenceIndex":69,"text":"69","element":"a"},{"text":"]. For learning classical-to-quantum mappings, Refs. [","element":"span"},{"href":"#id-67","referenceIndex":70,"text":"70","element":"a"},{"text":"–","element":"span"},{"href":"#id-68","referenceIndex":72,"text":"72","element":"a"},{"text":"] gave similar-in-spirit sample complexity bounds derived from gate complexity assumptions. Finally, Refs. [","element":"span"},{"href":"#id-13","referenceIndex":25,"text":"25","element":"a"},{"text":", ","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":"] considered different scenarios of state and process learning under assumptions of limited gate complexity.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Learning Pauli channels. ","element":"span"},{"text":"Pauli channel learning has been considered in different (mostly nononline) scenarios. Ref. 
[","element":"span"},{"href":"#id-69","referenceIndex":73,"text":"73","element":"a"},{"text":"] gave procedures for approximating the Pauli error rates of a general unknown Pauli channel in different ","element":"span"},{"style":{"height":8},"width":36.18,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-5.png","element":"img","alt":" ℓp","inline":true,"padRight":true},{"text":"norms, with a recent improvement for the ","element":"span"},{"style":{"height":6},"width":52.18,"height":15,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-6.png","element":"img","alt":" ℓ∞","inline":true,"padRight":true},{"text":"norm in Ref. [","element":"span"},{"href":"#id-70","referenceIndex":74,"text":"74","element":"a"},{"text":"]. Ref. [","element":"span"},{"href":"#id-71","referenceIndex":75,"text":"75","element":"a"},{"text":"] proved that the query complexities achieved by these procedures are optimal among non-adaptive incoherent strategies, and also gave lower bounds for adaptive incoherent strategies. If the Pauli noise is known to have a local structure, the query complexity can be improved beyond the results of Refs. [","element":"span"},{"href":"#id-69","referenceIndex":73,"text":"73","element":"a"},{"text":", ","element":"span"},{"href":"#id-72","referenceIndex":76,"text":"76","element":"a"},{"text":"], even if the conditional independence structure is not known in advance [","element":"span"},{"href":"#id-73","referenceIndex":77,"text":"77","element":"a"},{"text":"]. Refs. [","element":"span"},{"href":"#id-74","referenceIndex":78,"text":"78","element":"a"},{"text":"–","element":"span"},{"href":"#id-75","referenceIndex":80,"text":"80","element":"a"},{"text":"] highlight the importance of auxiliary systems and entanglement in learning the eigenvalues of an unknown Pauli channel. In Ref. 
[","element":"span"},{"href":"#id-76","referenceIndex":81,"text":"81","element":"a"},{"text":"], the authors provide an online algorithm for learning the eigenvalues of a Pauli channel","element":"span"},{"style":{"height":8.8},"width":17,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/9-7.png","element":"img","alt":"2","inline":true},{"text":", while in this work we consider the task of learning the error rates of Pauli channels. Finally, Ref. [","element":"span"},{"href":"#id-77","referenceIndex":29,"text":"29","element":"a"},{"text":"] investigates the more general task of learning the Pauli transfer matrix of a general channel, giving both non-adaptive and adaptive procedures.","element":"span"}],[{"id":"id-98","style":{"fontWeight":"bold"},"text":"1.5 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Techniques and proof overview","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Sequential covering numbers. ","element":"span"},{"text":"In our proof of Theorem ","element":"span"},{"href":"#id-33","text":"1","element":"a"},{"text":", we combine tools from two recent lines of work in classical online learning and quantum machine learning. On the one hand, we rely on regret bounds for online learning in terms of sequential complexity measures for the underlying hypothesis class. In particular, we use regret bounds via sequential covering numbers [","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":", ","element":"span"},{"href":"#id-79","referenceIndex":83,"text":"83","element":"a"},{"text":"], which can be viewed as a sequential version of Dudley’s theorem. 
Namely, as we recall in Theorem ","element":"span"},{"href":"#id-80","text":"11","element":"a"},{"text":",","element":"span"}],[{"style":{"width":"91%"},"width":1712,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-0.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.6},"width":172.48,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-1.png","element":"img","alt":" NT (F, β,","inline":true,"padRight":true},{"text":"2) denotes the worst-case sequential 2-norm ","element":"span"},{"style":{"height":15.6},"width":26,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-2.png","element":"img","alt":" β","inline":true},{"text":"-covering number of ","element":"span"},{"style":{"height":19.53},"width":299.51,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-3.png","element":"img","alt":" F ⊆ [0, 1]X over","inline":true,"padRight":true},{"text":"all complete binary trees ","element":"span"},{"style":{"fontWeight":"bold"},"text":"x ","element":"span"},{"text":"of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"whose nodes are labeled by elements of ","element":"span"},{"text":"X","element":"span"},{"text":". On the other hand, for the class of channels with a given gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":", we invoke covering number bounds with respect to the diamond norm distance [","element":"span"},{"href":"#id-13","referenceIndex":25,"text":"25","element":"a"},{"text":", ","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":", ","element":"span"},{"href":"#id-81","referenceIndex":68,"text":"68","element":"a"},{"text":"], and we demonstrate that these also control sequential covering numbers. 
Concretely, in Corollary ","element":"span"},{"href":"#id-82","text":"19","element":"a"},{"text":", we show that","element":"span"}],[{"style":{"width":"70%"},"width":1317,"height":130,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-4.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-5.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"denotes the class of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels with gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":". Together, these results imply the regret bound from Theorem ","element":"span"},{"href":"#id-33","text":"1","element":"a"},{"text":". This in turn leads to the claimed mistake bound via a standard argument (see Lemma ","element":"span"},{"href":"#id-83","text":"12","element":"a"},{"text":").","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Online convex optimization. ","element":"span"},{"text":"Theorem ","element":"span"},{"href":"#id-38","text":"2 ","element":"a"},{"text":"is based on efficient mistake-bounded learning of a convex mixture of exponentially many ","element":"span"},{"style":{"fontStyle":"italic"},"text":"known ","element":"span"},{"text":"channels. A simple but crucial observation is that the only unknown about such channels is a classical probability distribution ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"p ","element":"span"},{"text":"(for example, in the special case of Pauli channels, the Pauli error rate distribution). 
Consequently, online learning of these channels corresponds to online learning of ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"p ","element":"span"},{"text":"when the input states and the measurements (on the output state evolved by the unknown channel) are adversarially revealed to the learner. We show that this task can be achieved via an alternative online learning scenario, in which any adversarially chosen state and measurement can be encoded in a ‘channel observable’ ","element":"span"},{"style":{"height":26.85},"width":232.57,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-6.png","element":"img","alt":" E(t)A,B, which","inline":true,"padRight":true},{"text":"when revealed to the learner is associated to a “challenge” vector ","element":"span"},{"style":{"height":19.93},"width":260.23,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-7.png","element":"img","alt":" m(t) given by","inline":true}],[{"style":{"width":"75%"},"width":1421,"height":67,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-8.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":26.85},"width":242.89,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-9.png","element":"img","alt":" E(t)A,B = (ρ(t)A","inline":true,"padRight":true},{"text":")","element":"span"},{"style":{"height":24.05},"width":162.03,"height":60.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-10.png","element":"img","alt":"T ⊗ M(t)B","inline":true,"padRight":true},{"text":", and Γ","element":"span"},{"style":{"height":22.14},"width":60.78,"height":55.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-11.png","element":"img","alt":"z,xA,B","inline":true,"padRight":true},{"text":"is the Choi representation of the Pauli unitary channel 
","element":"span"},{"style":{"height":18.74},"width":304.66,"height":46.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-12.png","element":"img","alt":"ρ ↦ P^{z,x} ρ P^{z,x†}","inline":true},{"text":". The learner produces hypotheses ","element":"span"},{"style":{"height":19.94},"width":64.63,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-13.png","element":"img","alt":" p(t)","inline":true,"padRight":true},{"text":"of the unknown probability distribution at times ","element":"span"},{"style":{"height":17.6},"width":324.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-14.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":". ","element":"span"},{"text":"The learner’s hypotheses should be such that, for ","element":"span"},{"style":{"height":17.6},"width":289.02,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-15.png","element":"img","alt":" T ∈ {1, 2, . . . }","inline":true},{"text":", ","element":"span"},{"style":{"height":20.49},"width":288.61,"height":51.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-16.png","element":"img","alt":"∑_{t=1}^T m(t) · p(t)","inline":true,"padRight":true},{"text":"is not too different from ","element":"span"},{"style":{"height":20.49},"width":250.98,"height":51.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/10-17.png","element":"img","alt":" ∑_{t=1}^T m(t) · p","inline":true},{"text":". Using known guarantees [","element":"span"},{"href":"#id-84","referenceIndex":84,"text":"84","element":"a"},{"text":"], the difference ","element":"span"},{"text":"between these two sums can be shown to scale only logarithmically with the size of ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"p","element":"span"},{"text":"’s support for the multiplicative weights update method. 
This implies regret bounds, and therefore mistake bounds, that scale only linearly in the number of qubits.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Mistake lower bounds. ","element":"span"},{"text":"We prove our lower bounds by embedding classical online learning problems into the quantum tasks of interest, thereby inheriting classical lower bounds. First, it is easy to embed the task of online learning a general function ","element":"span"},{"style":{"height":19.13},"width":411.65,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-0.png","element":"img","alt":" f : {0, 1}^{n−1} → {0, 1}","inline":true,"padRight":true},{"text":"into that of online learning a general ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit unitary. (Alternatively, when allowing for non-unitary channels, we give such an embedding into (","element":"span"},{"style":{"height":7.6},"width":69.91,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-1.png","element":"img","alt":"n −","inline":true,"padRight":true},{"text":"1)-qubit channels.) 
Thus, the folklore mistake lower bound of Ω(2","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-2.png","element":"img","alt":"n","inline":true},{"text":") for the former problem immediately carries over to the latter, showing that the class of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit unitaries cannot be online learned with a sub-exponential number of mistakes.","element":"span"}],[{"text":"Second, to prove mistake lower bounds for learning channels of gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":", we use the fact that ","element":"span"},{"style":{"height":19.53},"width":93.52,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-3.png","element":"img","alt":" O(2^k","inline":true},{"text":") many gates suffice to implement an arbitrary function ","element":"span"},{"style":{"height":19.53},"width":385.02,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-4.png","element":"img","alt":" f : {0, 1}^k → {0, 1}","inline":true,"padRight":true},{"text":"with a classical circuit. We can quantumly realize such a circuit either by first measuring in the computational basis and then applying the classical circuit, or by implementing irreversible AND and OR gates with reversible Toffoli gates, where the auxiliary system gets reset to a suitable value after every gate. 
Both constructions demonstrate that the class of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels of gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"contains the class of all Boolean functions on the first ","element":"span"},{"style":{"height":17.6},"width":249.56,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-5.png","element":"img","alt":" q = min{log_2","inline":true},{"text":"(Θ(","element":"span"},{"style":{"height":17.6},"width":212.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-6.png","element":"img","alt":"G)), n − 1}","inline":true,"padRight":true},{"text":"inputs (extended to the remaining subsystems by a trivial action). So, the classical folklore lower bound gives an Ω(2","element":"span"},{"style":{"height":17.6},"width":356.82,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-7.png","element":"img","alt":"q) = Ω(min{2^n, G}","inline":true},{"text":") mistake lower bound here.","element":"span"}],[{"text":"Third, by associating a general function ","element":"span"},{"style":{"height":17.6},"width":640.57,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-8.png","element":"img","alt":" f : {1, . . . 
, n} → {0, 1} with the n","inline":true},{"text":"-qubit Pauli channel ","element":"span"},{"style":{"height":19.64},"width":54.8,"height":49.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-9.png","element":"img","alt":"Nf","inline":true,"padRight":true},{"text":"given by ","element":"span"},{"style":{"height":28.8},"width":719.34,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-10.png","element":"img","alt":" N_f(ρ) = (⊗_{i=1}^n Z_i^{f(i)}) ρ (⊗_{i=1}^n Z_i^{f(i)})","inline":true},{"text":", we prove that online learning ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channels is at least as hard as online learning a general ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":"-valued function on ","element":"span"},{"style":{"height":17.6},"width":156.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/11-11.png","element":"img","alt":" ⌊log(n)⌋","inline":true,"padRight":true},{"text":"bits. Hence, the classical mistake lower bound for online learning arbitrary functions becomes an Ω(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") mistake lower bound for Pauli channel online learning.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Computational complexity lower bounds. ","element":"span"},{"text":"Our exponential computational complexity lower bound for online learning Pauli channels with a polynomial mistake bound is a simple formalization of the following intuition: The channel observables posed as challenges by the adversary are exponentially-sized objects, and a successful online learner has to process all the exponentially many entries. 
We give a simple adversary demonstrating that this intuition is correct and applies even if the adversary provides the channel observables already in the basis expansion that is most natural for Pauli channels, namely the (unnormalized) ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Bell basis.","element":"span"}],[{"text":"To prove our computational complexity lower bound for online learning bounded-complexity channels, we require a hardness assumption. Here, we consider the so-called ","element":"span"},{"style":{"fontStyle":"italic"},"text":"ring learning with errors ","element":"span"},{"text":"(LWE) problem [","element":"span"},{"href":"#id-85","referenceIndex":85,"text":"85","element":"a"},{"text":"] (","element":"span"},{"text":"RingLWE","element":"span"},{"text":"), which underlies much of lattice-based cryptography. To establish that hardness of ","element":"span"},{"text":"RingLWE ","element":"span"},{"text":"implies hardness of online learning slightly superlinear-sized quantum circuits, we first show, via a known construction, that this class of quantum circuits can implement a pseudorandom function class. This implies hardness of online learning, because pseudorandom function classes are hard to learn, an intuition that we formalize specifically for mistake-bounded online learning. Importantly, as the hardness arises from an underlying classical pseudorandom function class, it persists even when all challenges consist of input states and projective effect operators that are computational basis elements. In particular, online learning remains hard here, even though the learner can efficiently read all challenges. 
Additionally, as our pseudorandom function class is secure against adversaries that have query access to the function, the computational hardness of online learning holds even if the learner can actively choose the (pairwise distinct) challenges, rather than those being chosen by the adversary.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Classical adaptive data analysis. ","element":"span"},{"text":"In the problem of shadow tomography of quantum channels, the learner’s task is to output values ","element":"span"},{"style":{"height":14.62},"width":138.16,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-0.png","element":"img","alt":" bt ∈ R","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":24.05},"width":454.01,"height":60.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-1.png","element":"img","alt":" |bt − Tr[M(t)B NA→B(ρ(t)A","inline":true,"padRight":true},{"text":")]","element":"span"},{"style":{"height":17.6},"width":111.18,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-2.png","element":"img","alt":"| ≤ ε","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":310.5,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-3.png","element":"img","alt":"t ∈ {1, 2, . . . , T}","inline":true},{"text":", where ","element":"span"},{"style":{"height":24.05},"width":207.15,"height":60.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-4.png","element":"img","alt":" {(ρ(t)A , M(t)B","inline":true,"padRight":true},{"text":")","element":"span"},{"style":{"height":19.57},"width":77.38,"height":48.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-5.png","element":"img","alt":"}Tt=1","inline":true,"padRight":true},{"text":"is a set of state-measurement pairs. 
They should do so ","element":"span"},{"text":"while minimizing the number ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"of times that they access the channel. The shadow tomography problem for multi-time processes is formulated in a similar manner.","element":"span"}],[{"text":"Our shadow tomography results are based on Bell sampling of the Choi state of the channel, i.e., sending one-half of the maximally-entangled state through the channel and then performing a joint Bell-basis measurement on the output system and the second entangled copy of the input system. If the unknown channel is a Pauli channel, then this procedure directly gives us samples from the Pauli error rate distribution. Consequently, the desired expectation values Tr[","element":"span"},{"style":{"height":24.05},"width":399.62,"height":60.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-6.png","element":"img","alt":"M(t)B NA→B(ρ(t)A )] can","inline":true,"padRight":true},{"text":"be interpreted as (classical) statistical queries with respect to the (unknown) Pauli error rate vector. 
Specifically, we have that Tr[","element":"span"},{"style":{"height":24.05},"width":680.42,"height":60.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-7.png","element":"img","alt":"M(t)B NA→B(ρ(t)A )] = p · e(t), where p","inline":true,"padRight":true},{"text":"is the Pauli error rate vector and ","element":"span"},{"style":{"height":25.55},"width":604.79,"height":63.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-8.png","element":"img","alt":"e(t) = (e(t)z,x)z,x∈{0,1}n, ez,x = Tr","inline":true},{"text":"[((","element":"span"},{"style":{"height":24.05},"width":60.96,"height":60.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-9.png","element":"img","alt":"ρ(t)A","inline":true,"padRight":true},{"text":")","element":"span"},{"style":{"height":24.05},"width":158.31,"height":60.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-10.png","element":"img","alt":"T ⊗ M(t)B","inline":true,"padRight":true},{"text":")Γ","element":"span"},{"style":{"height":22.14},"width":60.78,"height":55.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-11.png","element":"img","alt":"z,xA,B","inline":true},{"text":"]. With this observation, we can make use of ","element":"span"},{"text":"the classical adaptive data analysis algorithms presented in Ref. [","element":"span"},{"href":"#id-45","referenceIndex":56,"text":"56","element":"a"},{"text":"] in order to obtain our bound in Theorem ","element":"span"},{"href":"#id-46","text":"4 ","element":"a"},{"text":"on the number ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"of accesses to the channel. 
For an arbitrary unknown channel, we show that the same Bell sampling strategy enables us to sample from the error-rate vector of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Pauli-twirled ","element":"span"},{"text":"version of the channel, which we denote by ","element":"span"},{"style":{"height":20.06},"width":121.05,"height":50.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-12.png","element":"img","alt":" N^P_{A→B}","inline":true},{"text":". Then, as long as ","element":"span"},{"style":{"height":21.29},"width":337.94,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/12-13.png","element":"img","alt":" ε > (1/2)∥N − N^P∥⋄,","inline":true,"padRight":true},{"text":"we obtain a guarantee similar to that in Theorem ","element":"span"},{"href":"#id-46","text":"4","element":"a"},{"text":".","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Quantum combs. ","element":"span"},{"text":"Our results on multi-time quantum processes make use of the theory of quantum combs [","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":"], also known as “quantum strategies” [","element":"span"},{"href":"#id-56","referenceIndex":60,"text":"60","element":"a"},{"text":"]. We provide formal definitions of quantum combs in Appendix ","element":"span"},{"text":"B","element":"span"},{"text":". In particular, the analysis of multi-time processes entails the use of the so-called “strategy norm” and its Hölder dual [","element":"span"},{"href":"#id-86","referenceIndex":86,"text":"86","element":"a"},{"text":"], which are the multi-time generalizations of the trace and spectral norms, respectively, used in the analysis of algorithms for quantum states. They also generalize the diamond norm and its Hölder dual in the case of quantum channels. 
A crucial ingredient in the proof of Theorem ","element":"span"},{"href":"#id-87","text":"6 ","element":"a"},{"text":"is submultiplicativity of the strategy norm under composition of multi-time processes. As the composition of multi-time processes is given by the link product [","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":"], the technique for proving submultiplicativity of the diamond norm does not generalize straightforwardly to the strategy norm. To the best of our knowledge, a proof of submultiplicativity of the strategy norm under the link product has not been provided before, and we provide such a proof in Appendix ","element":"span"},{"href":"#id-88","text":"B.2 ","element":"a"},{"text":"based on semi-definite programming duality.","element":"span"}],[{"id":"id-99","style":{"fontWeight":"bold"},"text":"1.6 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Directions for future work","element":"span"}],[{"text":"Motivated by the well-known impossibility of online learning general unitaries and channels with a subexponential number of mistakes, our work initiates a study of subclasses of channels that allow for good regret and mistake bounds in online learning. We have identified two such classes: Channels of bounded gate complexity and mixtures of arbitrary known channels, with Pauli channels as a notable special case. However, we show that achieving favorably scaling regret and mistake bounds ","element":"span"},{"text":"for these classes is not possible in a computationally efficient manner. Our results open up several directions for future research; here we outline some of them.","element":"span"}],[{"text":"First, while our computational complexity lower bounds put limitations on where we can hope for computationally efficient channel online learning, they still leave relevant regions to explore. 
On the one hand, by the same reasoning as in Section ","element":"span"},{"href":"#id-89","text":"5.2","element":"a"},{"text":", the channel class under consideration must not be expressive enough to implement known pseudorandom function constructions. Here, inspired by the recent work [","element":"span"},{"href":"#id-90","referenceIndex":87,"text":"87","element":"a"},{"text":"], one may investigate the online learnability of shallow quantum circuits. On the other hand, as we argue in detail in Section ","element":"span"},{"href":"#id-91","text":"5.1","element":"a"},{"text":", a necessary condition for efficient online learning is that the channels of interest admit efficient descriptions. Candidates for such channel classes may be Clifford circuits or channels represented by matrix product operators of low bond-dimension.","element":"span"}],[{"text":"Second, we believe that there is room for “onlinification” of other quantum learning scenarios. For instance, recent work [","element":"span"},{"href":"#id-92","referenceIndex":28,"text":"28","element":"a"},{"text":", ","element":"span"},{"href":"#id-77","referenceIndex":29,"text":"29","element":"a"},{"text":"] has proposed to avoid the exponential bottleneck of general process tomography (compare, e.g., Refs. [","element":"span"},{"href":"#id-93","referenceIndex":13,"text":"13","element":"a"},{"text":", ","element":"span"},{"href":"#id-5","referenceIndex":14,"text":"14","element":"a"},{"text":"]) by considering learning tasks with arbitrary channels but restricted or structured input states and output measurements. Similarly, one may attempt to circumvent exponential lower bounds in online learning arbitrarily complex channels by imposing restrictions on the behavior of the adversary, for instance, with respect to the challenges that they can pose. 
Another line of quantum learning research that has recently seen significant progress is learning a Hamiltonian from access to the associated dynamics [","element":"span"},{"href":"#id-77","referenceIndex":29,"text":"29","element":"a"},{"text":", ","element":"span"},{"href":"#id-94","referenceIndex":88,"text":"88","element":"a"},{"text":"–","element":"span"},{"href":"#id-95","referenceIndex":100,"text":"100","element":"a"},{"text":"]. Online learning variants of the standard Hamiltonian learning task could give insights into whether one can learn to fine-tune a Hamiltonian evolution according to adaptive feedback.","element":"span"}],[{"text":"Finally, while general shadow tomography for quantum channels is not possible, we have demonstrated that it can become feasible for a suitable subclass, in our case Pauli channels. Given the important role of online learning quantum states in state tomography procedures, we envision positive results, both ours and future ones, on online learning restricted classes of channels to serve as a stepping stone towards shadow tomography procedures for such classes. Achieving the latter would likely require analogues of threshold search [","element":"span"},{"href":"#id-26","referenceIndex":17,"text":"17","element":"a"},{"text":"] for these kinds of channels.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Acknowledgments","element":"span"}],[{"text":"The authors thank Akshay Bansal, Ian George, Soumik Ghosh, Jamie Sikora, and Alice Zheng for sharing a draft of their independent and concurrent work on online learning quantum objects. The authors gratefully acknowledge support from the BMBF (QPIC-1, HYBRID), the ERC (DebuQC), the Munich Quantum Valley, the Einstein Foundation, and Berlin Quantum. 
MCC was partially supported by a DAAD PRIME fellowship.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Table of Contents","element":"figcaption","subtype":"caption"}],[{"style":{"fontWeight":"bold"},"text":"1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Introduction","element":"span"}],[{"text":"1.1 ","element":"span"},{"href":"#id-19","text":"Statement of the problem","element":"a"}],[{"text":"1.2 ","element":"span"},{"href":"#id-49","text":"Overview of the main results","element":"a"}],[{"text":"1.3 ","element":"span"},{"href":"#id-96","text":"Extensions of our results","element":"a"}],[{"text":"1.4 ","element":"span"},{"href":"#id-97","text":"Related work","element":"a"}],[{"text":"1.5 ","element":"span"},{"href":"#id-98","text":"Techniques and proof overview","element":"a"}],[{"text":"1.6 ","element":"span"},{"href":"#id-99","text":"Directions for future work","element":"a"}],[{"style":{"fontWeight":"bold"},"text":"2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Preliminaries","element":"span"}],[{"text":"2.1 ","element":"span"},{"href":"#id-52","text":"Basics of quantum information","element":"a"}],[{"text":"2.2 ","element":"span"},{"href":"#id-100","text":"Basics of online learning","element":"a"}],[{"text":"2.3 ","element":"span"},{"href":"#id-101","text":"The multiplicative weights framework and corresponding guarantees","element":"a"}],[{"text":"2.4 ","element":"span"},{"href":"#id-27","text":"Problem statement: Online learning classes of quantum channels","element":"a"}],[{"text":"2.5 ","element":"span"},{"href":"#id-63","text":"Obstacles to online learning via the Choi state","element":"a"}],[{"style":{"fontWeight":"bold"},"text":"3 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Online learning upper bounds","element":"span"}],[{"text":"3.1 ","element":"span"},{"href":"#id-102","text":"Regret bound for channels of bounded gate complexity","element":"a"}],[{"text":"3.2 ","element":"span"},{"href":"#id-103","text":"Regret bound for mixtures of known channels","element":"a"}],[{"text":"3.3 ","element":"span"},{"href":"#id-60","text":"Regret bounds for multi-time processes","element":"a"}],[{"text":"3.4 ","element":"span"},{"href":"#id-39","text":"Learning-theoretic implications","element":"a"}],[{"style":{"fontWeight":"bold"},"text":"4 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Mistake lower bounds","element":"span"}],[{"text":"4.1 ","element":"span"},{"href":"#id-29","text":"Mistake lower bounds for general unitaries and channels","element":"a"}],[{"text":"4.2 ","element":"span"},{"href":"#id-104","text":"Mistake lower bounds for channels of bounded complexity","element":"a"}],[{"text":"4.3 ","element":"span"},{"href":"#id-105","text":"Mistake lower bounds for Pauli channels","element":"a"}],[{"style":{"fontWeight":"bold"},"text":"5 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Computational complexity lower bounds","element":"span"}],[{"text":"5.1 ","element":"span"},{"href":"#id-91","text":"Computational complexity lower bounds for Pauli channels","element":"a"}],[{"text":"5.2 ","element":"span"},{"href":"#id-89","text":"Computational complexity lower bounds for channels of bounded complexity","element":"a"}],[{"style":{"fontWeight":"bold"},"text":"6 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Shadow tomography of quantum processes","element":"span"}],[{"text":"6.1 ","element":"span"},{"href":"#id-106","text":"Shadow tomography of multi-time processes","element":"a"}],[{"style":{"fontWeight":"bold"},"text":"Bibliography","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"A From qubits to qudits","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"B Multi-time quantum processes","element":"span"}],[{"text":"B.1 ","element":"span"},{"href":"#id-107","text":"Definitions and basic properties","element":"a"}],[{"text":"B.2 ","element":"span"},{"href":"#id-88","text":"Norms","element":"a"}],[{"style":{"fontWeight":"bold"},"text":"C Pauli-twirl of quantum channels","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"D Entropic analysis of the MMW algorithm","element":"span"}],[{"href":"#id-108","text":"D.1 Proof of Proposition 14","element":"a"}],[{"href":"#id-109","text":"D.2 The projected MMW algorithm","element":"a"}]]},{"heading":"2 Preliminaries","paragraphs":[[{"id":"id-52","style":{"fontWeight":"bold"},"text":"2.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Basics of quantum information","element":"span"}],[{"text":"Here we provide a brief review of fundamental quantum information concepts that we make use of throughout this work. We refer to, e.g., Ref. [","element":"span"},{"href":"#id-110","referenceIndex":101,"text":"101","element":"a"},{"text":"] for further details on the concepts and definitions presented in this subsection.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Pauli operators. ","element":"span"},{"text":"For a quantum system of ","element":"span"},{"style":{"height":12.8},"width":111.53,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-0.png","element":"img","alt":" n ∈ N","inline":true,"padRight":true},{"text":"qubits, we define the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"style":{"fontStyle":"italic"},"text":"-qubit Pauli operators ","element":"span"},{"text":"as","element":"span"}],[{"style":{"height":17.6},"width":1499.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-1.png","element":"img","alt":"P z,x := (+i)z·xZzXx, x, z ∈ {0, 1}n, (2.1)","inline":true},{"id":"id-112","style":{"height":17.6},"width":1468.13,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-2.png","element":"img","alt":"Zz := Zz1 ⊗ Zz2 ⊗ · · · ⊗ Zzn, Zz = |0⟩⟨0| + (−1)z|1⟩⟨1|, (2.2)","inline":true},{"style":{"height":17.6},"width":1476.77,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-3.png","element":"img","alt":"Xx := Xx1 ⊗ Xx2 ⊗ 
· · · ⊗ Xxn, Xx = |x⟩⟨0| + |x ⊕ 1⟩⟨1|, (2.3)","inline":true}],[{"text":"where the operation “","element":"span"},{"style":{"height":12},"width":34,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-4.png","element":"img","alt":"⊕","inline":true},{"text":"” denotes addition modulo two. From this, we can define the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"style":{"fontStyle":"italic"},"text":"-qubit Bell states ","element":"span"},{"text":"as","element":"span"}],[{"id":"id-113","style":{"width":"87%"},"width":1638,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-5.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":254.55,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-6.png","element":"img","alt":" z, x ∈ {0, 1}n","inline":true},{"text":". The Bell state vectors ","element":"span"},{"style":{"height":17.6},"width":111.62,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-7.png","element":"img","alt":" |Φz,x⟩","inline":true,"padRight":true},{"text":"form an orthonormal basis for (","element":"span"},{"style":{"height":19.53},"width":314.05,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-8.png","element":"img","alt":"C2)⊗n ⊗ (C2)⊗n,","inline":true,"padRight":true},{"text":"and the set ","element":"span"},{"style":{"height":19.95},"width":455.13,"height":49.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-9.png","element":"img","alt":" {Φz,x}z,x∈{0,1}n forms a","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"positive operator-valued measure ","element":"span"},{"text":"(POVM).","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Quantum channels. 
","element":"span"},{"text":"A quantum channel is a completely positive trace-preserving (CPTP) linear map ","element":"span"},{"style":{"height":19.54},"width":358.22,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-10.png","element":"img","alt":" N : L(Cd) → L(Cd′","inline":true},{"text":"). We often write ","element":"span"},{"style":{"height":16.7},"width":121.05,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-11.png","element":"img","alt":" NA→B","inline":true,"padRight":true},{"text":"to refer to a quantum channel with input system ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"and output system ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B","element":"span"},{"text":", with corresponding Hilbert spaces ","element":"span"},{"style":{"height":17.84},"width":490.82,"height":44.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-12.png","element":"img","alt":" HA ∼= CdA and HB ∼= CdB","inline":true},{"text":", respectively. We let ","element":"span"},{"text":"CPTP","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"A","element":"span"},{"text":"; ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B","element":"span"},{"text":") denote the set of all quantum channels mapping system ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"to system ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B","element":"span"},{"text":". 
In this work, we mostly consider the case ","element":"span"},{"style":{"height":14.7},"width":339.51,"height":36.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-13.png","element":"img","alt":" dA = dB = d = 2n","inline":true},{"text":", corresponding to a system of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits, and we use the notation ","element":"span"},{"style":{"height":15.02},"width":134.34,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-14.png","element":"img","alt":" CPTPn","inline":true,"padRight":true},{"text":"to refer to the set of all quantum channels mapping ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits.","element":"span"}],[{"text":"The ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Choi representation ","element":"span"},{"text":"(or ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Choi matrix","element":"span"},{"text":") of a linear map ","element":"span"},{"style":{"height":19.54},"width":358.81,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-15.png","element":"img","alt":" N : L(Cd) → L(Cd′","inline":true},{"text":") is defined as","element":"span"}],[{"id":"id-111","style":{"width":"76%"},"width":1430,"height":129,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-16.png","element":"img"}],[{"text":"where id","element":"span"},{"style":{"height":19.53},"width":329.57,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-17.png","element":"img","alt":"d : L(Cd) → L(Cd","inline":true},{"text":") is the identity superoperator, 
and","element":"span"}],[{"style":{"width":"57%"},"width":1083,"height":123,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-18.png","element":"img"}],[{"text":"The ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Choi state ","element":"span"},{"text":"of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"is the normalized Choi matrix, defined as","element":"span"}],[{"style":{"width":"69%"},"width":1310,"height":94,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/15-19.png","element":"img"}],[{"text":"We sometimes write ","element":"span"},{"style":{"height":24.06},"width":598.59,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-0.png","element":"img","alt":" CNA,B ≡ C(N) and ΦNA,B ≡ Φ(N","inline":true},{"text":") for the Choi matrix and Choi state, respectively, ","element":"span"},{"text":"of a quantum channel ","element":"span"},{"style":{"height":16.7},"width":121.05,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-1.png","element":"img","alt":" NA→B","inline":true,"padRight":true},{"text":"when we want to indicate explicitly the input and output systems ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B ","element":"span"},{"text":"of the channel. 
For a quantum channel ","element":"span"},{"style":{"height":16.7},"width":121.05,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-2.png","element":"img","alt":" NA→B","inline":true},{"text":", its Choi representation ","element":"span"},{"style":{"height":24.07},"width":91.97,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-3.png","element":"img","alt":" CNA,B","inline":true,"padRight":true},{"text":"is positive ","element":"span"},{"text":"semi-definite and satisfies Tr","element":"span"},{"style":{"height":25.18},"width":272.51,"height":62.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-4.png","element":"img","alt":"B[CNA,B] = 1A.","inline":true}],[{"text":"The function ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C ","element":"span"},{"text":"defined in (","element":"span"},{"href":"#id-111","text":"2.5","element":"a"},{"text":") has an inverse, such that we can identify every Hermitian operator ","element":"span"},{"style":{"height":18.3},"width":377.48,"height":45.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-5.png","element":"img","alt":" HA,B ∈ L(HA ⊗ HB","inline":true},{"text":") with a Hermiticity-preserving map ","element":"span"},{"style":{"height":20.24},"width":621.04,"height":50.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-6.png","element":"img","alt":" C−1(HA,B) : L(HA) → L(HB) as","inline":true}],[{"style":{"width":"81%"},"width":1520,"height":54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-7.png","element":"img"}],[{"text":"In particular, if ","element":"span"},{"style":{"height":17.5},"width":97.05,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-8.png","element":"img","alt":" HA,B","inline":true,"padRight":true},{"text":"is positive semi-definite, then 
","element":"span"},{"style":{"height":16.33},"width":72.24,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-9.png","element":"img","alt":" N H","inline":true,"padRight":true},{"text":"is completely positive. If in addition Tr","element":"span"},{"style":{"height":22.96},"width":264.1,"height":57.39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-10.png","element":"img","alt":"B[HA,B] = 1A","inline":true},{"text":", then ","element":"span"},{"style":{"height":16.33},"width":72.24,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-11.png","element":"img","alt":" N H","inline":true,"padRight":true},{"text":"is a quantum channel. Consequently, the set ","element":"span"},{"text":"CPTP","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"A","element":"span"},{"text":"; ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B","element":"span"},{"text":") of quantum channels is in one-to-one correspondence with the set ","element":"span"},{"style":{"height":17.6},"width":228.19,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-12.png","element":"img","alt":" CPTP′(A; B","inline":true},{"text":") :","element":"span"},{"style":{"height":18.7},"width":221.62,"height":46.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-13.png","element":"img","alt":"= {NA,B ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":14.7},"width":185.6,"height":36.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-14.png","element":"img","alt":"HA ⊗ HB","inline":true},{"text":") :","element":"span"}],[{"id":"id-114","style":{"width":"99%"},"width":1872,"height":147,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-15.png","element":"img"}],[{"text":"denote the set of Choi matrices of quantum channels mapping 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits.","element":"span"}],[{"text":"Quantum channels also have a Kraus representation, such that","element":"span"}],[{"style":{"width":"69%"},"width":1297,"height":116,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-16.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":14.84},"width":489.34,"height":37.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-17.png","element":"img","alt":" r ∈ N and Kℓ : HA → HB","inline":true,"padRight":true},{"text":"is a linear operator for every ","element":"span"},{"style":{"height":17.6},"width":308.05,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-18.png","element":"img","alt":" ℓ ∈ {1, 2, . . . , r}.","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Pauli channels. ","element":"span"},{"text":"A Pauli channel is a quantum channel whose Kraus operators are proportional to the Pauli operators defined in Equation (","element":"span"},{"href":"#id-112","text":"2.1","element":"a"},{"text":"). 
Specifically, an ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channel (by definition) has the form","element":"span"}],[{"id":"id-197","style":{"width":"66%"},"width":1247,"height":98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-19.png","element":"img"}],[{"text":"where the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Pauli error rates ","element":"span"},{"style":{"height":13.02},"width":71.4,"height":32.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-20.png","element":"img","alt":" pz,x","inline":true,"padRight":true},{"text":"form a probability distribution, i.e., ","element":"span"},{"style":{"height":18.22},"width":172.22,"height":45.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-21.png","element":"img","alt":" pz,x ∈ [0,","inline":true,"padRight":true},{"text":"1] for all ","element":"span"},{"style":{"height":17.6},"width":254.34,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-22.png","element":"img","alt":" z, x ∈ {0, 1}n","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":19.94},"width":1320.91,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-23.png","element":"img","alt":"∑z,x∈{0,1}n pz,x = 1. 
The Choi representation of a Pauli channel P is","inline":true}],[{"id":"id-153","style":{"width":"63%"},"width":1186,"height":98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-24.png","element":"img"}],[{"text":"where","element":"span"}],[{"style":{"width":"79%"},"width":1487,"height":49,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-25.png","element":"img"}],[{"text":"are the unnormalized versions of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Bell states defined in Equation (","element":"span"},{"href":"#id-113","text":"2.4","element":"a"},{"text":"). We let ","element":"span"},{"style":{"height":18.22},"width":197.88,"height":45.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-26.png","element":"img","alt":" p = (pz,x :","inline":true},{"style":{"height":17.6},"width":255.21,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-27.png","element":"img","alt":"z, x ∈ {0, 1}n","inline":true},{"text":") denote the 4","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-28.png","element":"img","alt":"n","inline":true},{"text":"-dimensional probability vector of error rates.","element":"span"}],[{"text":"We let ","element":"span"},{"style":{"height":15.02},"width":138.88,"height":37.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-29.png","element":"img","alt":" PAULIn","inline":true,"padRight":true},{"text":"be the set of all Pauli channels acting on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit systems, and analogously to (","element":"span"},{"href":"#id-114","text":"2.9","element":"a"},{"text":"), we 
let","element":"span"}],[{"style":{"width":"84%"},"width":1589,"height":144,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/16-30.png","element":"img"}],[{"text":"be the set of all Choi matrices of Pauli channels, where","element":"span"}],[{"style":{"width":"92%"},"width":1727,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-0.png","element":"img"}],[{"text":"denotes the probability simplex of ","element":"span"},{"style":{"height":12.8},"width":123.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-1.png","element":"img","alt":" m ∈ N","inline":true,"padRight":true},{"text":"elements. Depending on the context, we refer to a vector ","element":"span"},{"style":{"height":17.6},"width":351.38,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-2.png","element":"img","alt":"p = (p1, p2, . . . , pm","inline":true},{"text":") with values ","element":"span"},{"style":{"height":17.6},"width":920.45,"height":44.01,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-3.png","element":"img","alt":" p1, p2, . . . , pm ∈ [0, 1] and ∑mk=1 pk = 1 as both a","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"probability vector ","element":"span"},{"text":"and a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"probability distribution","element":"span"},{"text":".","element":"span"}],[{"text":"Associated to every quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"is its Pauli-twirled version, defined as [","element":"span"},{"href":"#id-115","referenceIndex":102,"text":"102","element":"a"},{"text":"]","element":"span"}],[{"style":{"width":"74%"},"width":1394,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-4.png","element":"img"}],[{"text":"where the superscript “","element":"span"},{"text":"P","element":"span"},{"text":"” in ","element":"span"},{"style":{"height":16.33},"width":64.24,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-5.png","element":"img","alt":" N P","inline":true,"padRight":true},{"text":"refers to the set ","element":"span"},{"style":{"height":17.6},"width":210.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-6.png","element":"img","alt":" P := {P z,x","inline":true,"padRight":true},{"text":": ","element":"span"},{"style":{"height":17.6},"width":288.18,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-7.png","element":"img","alt":" z, x ∈ {0, 1}n}","inline":true,"padRight":true},{"text":"of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli operators. In other words, the Pauli-twirled version of a channel is obtained by applying a Pauli operator, chosen uniformly at random, at the input of the channel and its inverse at the output. 
As we recall in Appendix ","element":"span"},{"text":"C","element":"span"},{"text":", the Pauli-twirled channel ","element":"span"},{"style":{"height":16.33},"width":64.24,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-8.png","element":"img","alt":" N P","inline":true,"padRight":true},{"text":"is indeed a Pauli channel, and the error rates are given by ","element":"span"},{"style":{"height":21.29},"width":810.98,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-9.png","element":"img","alt":" pz,x = (1/d) Tr[Φz,x C(N)] for all z, x ∈ {0, 1}n","inline":true},{"text":". In particular, we show that the Choi ","element":"span"},{"text":"representation of the Pauli-twirled channel can be obtained via the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"pinching channel ","element":"span"},{"style":{"height":15.25},"width":48.42,"height":38.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-10.png","element":"img","alt":" SP","inline":true,"padRight":true},{"text":"in the Bell basis, defined as","element":"span"}],[{"style":{"width":"86%"},"width":1618,"height":99,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-11.png","element":"img"}],[{"text":"for every linear operator ","element":"span"},{"style":{"height":12.8},"width":89.24,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-12.png","element":"img","alt":" X ∈","inline":true,"padRight":true},{"text":"L((","element":"span"},{"style":{"height":19.54},"width":115.1,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-13.png","element":"img","alt":"C2)⊗n","inline":true},{"text":"). 
Consequently, the Choi representation of the Pauli-twirled version ","element":"span"},{"style":{"height":16.33},"width":64.23,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-14.png","element":"img","alt":" N P ","inline":true,"padRight":true},{"text":"of a channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"is given by","element":"span"}],[{"id":"id-221","style":{"width":"94%"},"width":1780,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-15.png","element":"img"}],[{"style":{"fontWeight":"bold"},"text":"Channel measurements and observables. ","element":"span"},{"text":"An ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":"-outcome measurement of a quantum state is given by a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"positive operator-valued measure ","element":"span"},{"text":"(POVM), i.e., a set ","element":"span"},{"style":{"height":20.58},"width":289.86,"height":51.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-16.png","element":"img","alt":" {M(i)}mi=1 of m","inline":true,"padRight":true},{"text":"operators satisfying ","element":"span"},{"text":"0 ","element":"span"},{"style":{"height":19.85},"width":217.59,"height":49.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-17.png","element":"img","alt":" ≤ M(i) ≤ 1","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":21.86},"width":277.98,"height":54.65,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-18.png","element":"img","alt":"∑mi=1 M(i) = 1","inline":true},{"text":". 
Observables for states are simply Hermitian operators ","element":"span"},{"style":{"fontStyle":"italic"},"text":"H","element":"span"},{"text":", ","element":"span"},{"text":"and the expected value of the observable ","element":"span"},{"style":{"fontStyle":"italic"},"text":"H ","element":"span"},{"text":"when measured on a state ","element":"span"},{"style":{"height":17.6},"width":224.8,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-19.png","element":"img","alt":" ρ is Tr[Hρ].","inline":true}],[{"text":"Now, a measurement, or a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"test","element":"span"},{"text":", for a quantum channel ","element":"span"},{"style":{"height":16.7},"width":121.05,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-20.png","element":"img","alt":" NA→B","inline":true,"padRight":true},{"text":"is given by a pair consisting of a bipartite state ","element":"span"},{"style":{"height":13.1},"width":82.66,"height":32.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-21.png","element":"img","alt":" ρR,A","inline":true,"padRight":true},{"text":"and a POVM ","element":"span"},{"style":{"height":26.85},"width":205.07,"height":67.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-22.png","element":"img","alt":" {M(i)R,B}mi=1","inline":true},{"text":", where ","element":"span"},{"style":{"fontStyle":"italic"},"text":"R ","element":"span"},{"text":"is an arbitrary memory/reference ","element":"span"},{"text":"system; see Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"(a). 
Using Equation (","element":"span"},{"href":"#id-116","text":"1.6","element":"a"},{"text":"), the probability of obtaining a particular outcome ","element":"span"},{"style":{"height":17.6},"width":310.5,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-23.png","element":"img","alt":"i ∈ {1, 2, . . . , m}","inline":true,"padRight":true},{"text":"of the test is given by","element":"span"}],[{"id":"id-140","style":{"width":"69%"},"width":1311,"height":68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-24.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":26.85},"width":698.72,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-25.png","element":"img","alt":" E(i)A,B = TrR[(1B ⊗ ρTAR,A)(1A ⊗ M(i)R,B","inline":true},{"text":")] for all ","element":"span"},{"style":{"height":17.6},"width":312.66,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-26.png","element":"img","alt":" i ∈ {1, 2, . . . , m}","inline":true},{"text":". The right-hand side of this ","element":"span"},{"text":"equation is the generalized Born rule for quantum channels; see Figure ","element":"span"},{"href":"#id-15","text":"1","element":"a"},{"text":"(c) for a depiction. The channel test operators ","element":"span"},{"style":{"height":26.85},"width":93,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-27.png","element":"img","alt":" E(i)A,B","inline":true,"padRight":true},{"text":"satisfy ","element":"span"},{"style":{"height":26.85},"width":143.73,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-28.png","element":"img","alt":" E(i)A,B ≥","inline":true,"padRight":true},{"text":"0 for all ","element":"span"},{"style":{"height":17.6},"width":314.37,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-29.png","element":"img","alt":" i ∈ {1, 2, . . 
. , m}","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":26.85},"width":420.88,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-30.png","element":"img","alt":" ∑mi=1 E(i)A,B = ρTA ⊗ 1B","inline":true},{"text":", ","element":"span"},{"text":"where ","element":"span"},{"style":{"height":18.7},"width":275.78,"height":46.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/17-31.png","element":"img","alt":" ρA ≡ TrR[ρR,A","inline":true},{"text":"]. The converse is also true [","element":"span"},{"href":"#id-55","referenceIndex":59,"text":"59","element":"a"},{"text":", ","element":"span"},{"href":"#id-117","referenceIndex":103,"text":"103","element":"a"},{"text":"], meaning that a channel test is defined by every ","element":"span"},{"text":"set ","element":"span"},{"style":{"height":26.85},"width":194.61,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-0.png","element":"img","alt":" {E(i)A,B}mi=1","inline":true,"padRight":true},{"text":"such that 0 ","element":"span"},{"style":{"height":26.85},"width":358.18,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-1.png","element":"img","alt":" ≤ E(i)A,B ≤ σA ⊗ 1B","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":310.58,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-2.png","element":"img","alt":" i ∈ {1, 2, . . . 
, m}","inline":true},{"text":", for some ","element":"span"},{"text":"density operator ","element":"span"},{"style":{"height":10.3},"width":49.94,"height":25.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-3.png","element":"img","alt":" σA","inline":true},{"text":", and ","element":"span"},{"style":{"height":26.85},"width":419.29,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-4.png","element":"img","alt":"∑mi=1 E(i)A,B = σA ⊗ 1B","inline":true},{"text":". The corresponding “physical realization” of the ","element":"span"},{"text":"channel test is given by ","element":"span"},{"style":{"height":18.7},"width":809.66,"height":46.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-5.png","element":"img","alt":" ρR,A = |ψσ⟩⟨ψσ|R,A, where HR ∼= HA, |ψσ⟩","inline":true,"padRight":true},{"text":"is a purification of ","element":"span"},{"style":{"height":15.2},"width":123.24,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-6.png","element":"img","alt":" σ, and","inline":true}],[{"style":{"width":"61%"},"width":1151,"height":76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-7.png","element":"img"}],[{"text":"An especially simple example of a channel test is one without memory, involving an input state ","element":"span"},{"style":{"height":14.4},"width":100.81,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-8.png","element":"img","alt":" ρA at","inline":true,"padRight":true},{"text":"the input of the channel and a measurement ","element":"span"},{"style":{"height":24.05},"width":185.46,"height":60.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-9.png","element":"img","alt":" {M(i)B }mi=1 ","inline":true,"padRight":true},{"text":"at the output of a channel; see Figure ","element":"span"},{"href":"#id-15","text":"1","element":"a"},{"text":"(b). 
","element":"span"},{"text":"In this case, the channel test operators are in tensor-product form, given by ","element":"span"},{"style":{"height":26.85},"width":411.22,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-10.png","element":"img","alt":" E(i)A,B = ρTA ⊗ M(i)B for","inline":true,"padRight":true},{"text":"all ","element":"span"},{"style":{"height":17.6},"width":322.31,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-11.png","element":"img","alt":" i ∈ {1, 2, . . . , m}.","inline":true}],[{"text":"The statements above for measurements readily generalize to statements about observables (Hermitian operators), due to the fact that every Hermitian operator has a spectral decomposition, and the spectral projections form a POVM. Consequently, the expected value of an observable ","element":"span"},{"style":{"height":17.5},"width":112.51,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-12.png","element":"img","alt":" HR,B,","inline":true,"padRight":true},{"text":"measured according to the general scenario depicted in Figure ","element":"span"},{"href":"#id-15","text":"1","element":"a"},{"text":"(c), is equal to","element":"span"}],[{"style":{"width":"69%"},"width":1308,"height":61,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-13.png","element":"img"}],[{"text":"where the “channel observable” is ","element":"span"},{"style":{"height":25.51},"width":671.56,"height":63.77,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-14.png","element":"img","alt":" OA,B = TrR[(1B ⊗ρTAR,A)(1A ⊗HR,B","inline":true},{"text":")]. 
If the measurement scheme ","element":"span"},{"text":"does not contain a memory, then ","element":"span"},{"style":{"height":18.64},"width":336.01,"height":46.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-15.png","element":"img","alt":" OA,B = ρTA ⊗ HB.","inline":true}],[{"text":"Let us now consider the case that the memory system ","element":"span"},{"style":{"fontStyle":"italic"},"text":"R ","element":"span"},{"text":"has the same dimension as the input system ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"of the channel, so let us make the relabeling ","element":"span"},{"style":{"height":12.8},"width":133.38,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-16.png","element":"img","alt":" R ≡ A′","inline":true},{"text":". Let us also suppose that ","element":"span"},{"style":{"height":17.5},"width":232.76,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-17.png","element":"img","alt":" ρR,A ≡ ψA′A","inline":true,"padRight":true},{"text":"is a pure state. 
","element":"span"},{"text":"Then, for every bipartite pure state ","element":"span"},{"style":{"height":15.6},"width":89.53,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-18.png","element":"img","alt":" ψA′A","inline":true},{"text":", there exists a state ","element":"span"},{"style":{"height":11.2},"width":47.56,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-19.png","element":"img","alt":" ρA","inline":true,"padRight":true},{"text":"such that","element":"span"}],[{"id":"id-118","style":{"width":"99%"},"width":1872,"height":789,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-20.png","element":"img"}],[{"text":"Note that a special case of the observable in Equation (","element":"span"},{"href":"#id-118","text":"2.22","element":"a"},{"text":") is when ","element":"span"},{"style":{"height":28.38},"width":155.14,"height":70.94,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-21.png","element":"img","alt":" ρA = 1AdA ","inline":true,"padRight":true},{"text":", which corresponds ","element":"span"},{"text":"to measuring ","element":"span"},{"style":{"height":17.5},"width":97.06,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-22.png","element":"img","alt":" HA,B","inline":true,"padRight":true},{"text":"on the Choi state of the channel. 
In this case","element":"span"}],[{"style":{"width":"66%"},"width":1255,"height":96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-23.png","element":"img"}],[{"text":"which means that","element":"span"}],[{"style":{"width":"65%"},"width":1234,"height":55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/18-24.png","element":"img"}],[{"text":"Now, the operator/spectral norm ","element":"span"},{"style":{"height":17.6},"width":89.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-0.png","element":"img","alt":" ∥·∥∞","inline":true,"padRight":true},{"text":"is used to characterize measurement operators and observables for quantum states, because it is the (Hölder) dual to the trace norm ","element":"span"},{"style":{"height":17.6},"width":72.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-1.png","element":"img","alt":" ∥·∥1","inline":true},{"text":". ","element":"span"},{"text":"For quantum channels, the relevant norm is the diamond norm [","element":"span"},{"href":"#id-119","referenceIndex":104,"text":"104","element":"a"},{"text":"]. Let ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":": L(","element":"span"},{"style":{"height":17.6},"width":138.59,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-2.png","element":"img","alt":"HA) →","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":14.7},"width":62.85,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-3.png","element":"img","alt":"HB","inline":true},{"text":") be a Hermiticity-preserving linear map. 
The diamond norm of ","element":"span"},{"style":{"height":16.7},"width":121.04,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-4.png","element":"img","alt":" NA→B","inline":true,"padRight":true},{"text":"can be expressed as","element":"span"}],[{"id":"id-123","style":{"width":"92%"},"width":1726,"height":475,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-5.png","element":"img"}],[{"text":"The norm ","element":"span"},{"style":{"height":17.6},"width":89.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-6.png","element":"img","alt":" ∥·∥⋄1","inline":true,"padRight":true},{"text":"is referred to as the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"strategy 1-norm","element":"span"},{"text":", and we define it formally in Appendix ","element":"span"},{"text":"B","element":"span"},{"text":". The relevant norm for channel observables is thus the Hölder dual of the strategy 1-norm, which is given by [","element":"span"},{"href":"#id-86","referenceIndex":86,"text":"86","element":"a"},{"text":", ","element":"span"},{"href":"#id-120","referenceIndex":105,"text":"105","element":"a"},{"text":"]","element":"span"}],[{"id":"id-121","style":{"width":"86%"},"width":1619,"height":371,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-7.png","element":"img"}],[{"text":"for every Hermitian operator ","element":"span"},{"style":{"height":17.5},"width":477.53,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-8.png","element":"img","alt":" OA,B acting on HA ⊗ HB","inline":true},{"text":". We note that channel test operators satisfy ","element":"span"},{"style":{"height":18.7},"width":221.73,"height":46.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-9.png","element":"img","alt":"∥EA,B∥∗⋄1 ≤","inline":true,"padRight":true},{"text":"1. 
Also, for channel observables without memory, it holds that","element":"span"}],[{"style":{"width":"70%"},"width":1321,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-10.png","element":"img"}],[{"text":"as we might expect, and which is a property interesting in its own right; we refer to Appendix ","element":"span"},{"text":"B ","element":"span"},{"text":"for a proof. In general, we have that ","element":"span"},{"style":{"height":18.7},"width":409.68,"height":46.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-11.png","element":"img","alt":" ∥OA,B∥∗⋄1 ≥ ∥OA,B∥∞","inline":true,"padRight":true},{"text":"for all Hermitian ","element":"span"},{"style":{"height":17.5},"width":138.3,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-12.png","element":"img","alt":" OA,B ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":14.7},"width":180.31,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-13.png","element":"img","alt":"HA ⊗ HB","inline":true},{"text":"). 
","element":"span"},{"text":"This follows immediately from (","element":"span"},{"href":"#id-121","text":"2.26","element":"a"},{"text":"), on account of the fact that ","element":"span"},{"style":{"height":17.6},"width":557.94,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-14.png","element":"img","alt":" ∥σA∥∞ ≤ ∥σA∥1 = Tr[σA] =","inline":true,"padRight":true},{"text":"1 ","element":"span"},{"style":{"height":20.16},"width":363.33,"height":50.39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-15.png","element":"img","alt":" ⇒ −1A ≤ σA ≤ 1A","inline":true,"padRight":true},{"text":"for every ","element":"span"},{"style":{"height":10.3},"width":49.94,"height":25.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-16.png","element":"img","alt":" σA","inline":true,"padRight":true},{"text":"in the optimization in (","element":"span"},{"href":"#id-121","text":"2.26","element":"a"},{"text":"), and the fact that ","element":"span"},{"style":{"height":17.6},"width":269.02,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-17.png","element":"img","alt":" ∥H∥∞ = inf{t","inline":true,"padRight":true},{"text":": ","element":"span"},{"style":{"height":21.85},"width":335.08,"height":54.63,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-18.png","element":"img","alt":"−t1d ≤ H ≤ t1d}","inline":true,"padRight":true},{"text":"for all Hermitian ","element":"span"},{"style":{"height":19.53},"width":217.3,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/19-19.png","element":"img","alt":" H ∈ L(Cd).","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Multi-time quantum processes. ","element":"span"},{"text":"Multi-time quantum processes are those that occur over multiple time steps, as opposed to the single time step of a quantum channel. 
A simple example of a multi-time quantum process is the one depicted in Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"(a), in which the process (in blue) consists of three (independent) uses of a quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N","element":"span"},{"text":". In Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"(b), we depict in blue a quantum multi-time process with memory, which can model non-Markovian dynamics [","element":"span"},{"href":"#id-57","referenceIndex":61,"text":"61","element":"a"},{"text":"]. Measurements, or ","element":"span"},{"style":{"fontStyle":"italic"},"text":"testers","element":"span"},{"text":", for multi-time processes consist of an input state to the process, several (possibly adaptive) interactions with the process over multiple time steps, and finally a measurement. The testers are depicted as the red processes in Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":".","element":"span"}],[{"text":"The mathematical objects describing multi-time processes and testers are known as ","element":"span"},{"style":{"fontStyle":"italic"},"text":"quantum combs ","element":"span"},{"text":"[","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":", ","element":"span"},{"href":"#id-122","referenceIndex":106,"text":"106","element":"a"},{"text":"] and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"quantum strategies ","element":"span"},{"text":"[","element":"span"},{"href":"#id-56","referenceIndex":60,"text":"60","element":"a"},{"text":"], and we provide formal definitions of these objects in Appendix ","element":"span"},{"text":"B","element":"span"},{"text":". Briefly, quantum combs are multipartite operators that are defined as the Choi representations of multi-time quantum processes. 
For ","element":"span"},{"style":{"height":26.85},"width":863,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-0.png","element":"img","alt":" k ∈ N, let H(k)A,B ≡ HA1 ⊗ HB1 ⊗ HA2 ⊗ HB2 ⊗","inline":true},{"style":{"height":16.95},"width":319.14,"height":42.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-1.png","element":"img","alt":"· · · ⊗ HAk ⊗ HBk","inline":true},{"text":", such that the systems ","element":"span"},{"style":{"height":16},"width":270.4,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-2.png","element":"img","alt":" A1, A2, . . . , Ak","inline":true,"padRight":true},{"text":"are the input systems and ","element":"span"},{"style":{"height":15.2},"width":271.52,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-3.png","element":"img","alt":" B1, B2, . . . , Bk","inline":true,"padRight":true},{"text":"are the output systems; see Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":". Then, the set of quantum combs describing ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time quantum processes is defined as follows:","element":"span"}],[{"id":"id-168","style":{"width":"97%"},"width":1831,"height":236,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-4.png","element":"img"}],[{"text":"We suppress the system labels and simply write ","element":"span"},{"style":{"height":15.02},"width":142.28,"height":37.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-5.png","element":"img","alt":" COMBr","inline":true,"padRight":true},{"text":"when the systems are unimportant or understood in the context being considered. 
For the process in blue in Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"(b), its Choi representation belongs to the set ","element":"span"},{"style":{"height":15.02},"width":144.27,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-6.png","element":"img","alt":" COMB3","inline":true},{"text":". We also note that ","element":"span"},{"style":{"height":15.02},"width":327.14,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-7.png","element":"img","alt":" COMB1 = CPTP′","inline":true},{"text":". The set of inputs to multi-time processes, sometimes called ","element":"span"},{"style":{"fontStyle":"italic"},"text":"co-strategies","element":"span"},{"text":", are defined as the Choi representations of multi-time processes in which the first input system is trivial. Specifically, let ","element":"span"},{"style":{"height":26.85},"width":368.42,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-8.png","element":"img","alt":"�H(k)A,B ≡ C ⊗ HA1 ⊗","inline":true},{"style":{"height":16.95},"width":563.28,"height":42.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-9.png","element":"img","alt":"· · · ⊗ HBk−1 ⊗ HAk, for k ∈ N","inline":true},{"text":". Then, the set of inputs to multi-time processes is defined as","element":"span"}],[{"id":"id-169","style":{"width":"86%"},"width":1616,"height":302,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-10.png","element":"img"}],[{"text":"An example is the red process in Figure ","element":"span"},{"href":"#id-50","text":"2","element":"a"},{"text":"(b) (excluding the measurement), which belongs to the set ","element":"span"},{"style":{"height":17.39},"width":144.27,"height":43.47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-11.png","element":"img","alt":" COMB∗3","inline":true},{"text":". 
Again, we suppress the system labels and simply write ","element":"span"},{"style":{"height":17.39},"width":144.28,"height":43.47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-12.png","element":"img","alt":" COMB∗r ","inline":true,"padRight":true},{"text":"when the systems are ","element":"span"},{"text":"unimportant or understood in the context being considered. Observe that every element of ","element":"span"},{"style":{"height":17.39},"width":144.28,"height":43.47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-13.png","element":"img","alt":" COMB∗r","inline":true,"padRight":true},{"text":"is a positive semi-definite operator with unit trace. In particular, the elements of ","element":"span"},{"style":{"height":17.39},"width":144.27,"height":43.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-14.png","element":"img","alt":" COMB∗r ","inline":true,"padRight":true},{"text":"are density ","element":"span"},{"text":"operators, and we can think of them as multi-time analogues of quantum states.","element":"span"}],[{"text":"The analogue of a POVM for multi-time processes, and thus the multi-time generalization of a channel test as defined above, is a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"multi-time tester","element":"span"},{"text":": a set ","element":"span"},{"style":{"height":20.58},"width":173.1,"height":51.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-15.png","element":"img","alt":" {E(i)}mi=1 ","inline":true,"padRight":true},{"text":"of positive semi-definite ","element":"span"},{"style":{"fontStyle":"italic"},"text":"test operators ","element":"span"},{"style":{"height":16.73},"width":121.29,"height":41.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-16.png","element":"img","alt":" E(i) 
∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":26.85},"width":97.63,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-17.png","element":"img","alt":"H(r)A,B","inline":true},{"text":") such that 0 ","element":"span"},{"style":{"height":21.75},"width":348.46,"height":54.37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-18.png","element":"img","alt":" ≤ E(i) ≤ S ⊗ 1Br","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":322.51,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-19.png","element":"img","alt":" i ∈ {1, 2, . . . , m}","inline":true},{"text":", for some ","element":"span"},{"style":{"height":23.2},"width":744.33,"height":58.01,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-20.png","element":"img","alt":"S ∈ COMB∗r, and ∑mi=1 E(i)r = S ⊗ 1Br.","inline":true}],[{"text":"The multi-time analogues of the norms in (","element":"span"},{"href":"#id-123","text":"2.25","element":"a"},{"text":") and (","element":"span"},{"href":"#id-121","text":"2.26","element":"a"},{"text":") are the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"strategy ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"style":{"fontStyle":"italic"},"text":"-norm ","element":"span"},{"text":"and its Hölder dual, which can be expressed as [","element":"span"},{"href":"#id-86","referenceIndex":86,"text":"86","element":"a"},{"text":"]","element":"span"}],[{"id":"id-167","style":{"width":"83%"},"width":1557,"height":167,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-21.png","element":"img"}],[{"text":"for every Hermitian operator ","element":"span"},{"style":{"height":12.8},"width":81.68,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-22.png","element":"img","alt":" O 
∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":26.85},"width":97.64,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-23.png","element":"img","alt":"H(r)A,B","inline":true},{"text":"). It holds that ","element":"span"},{"style":{"height":17.6},"width":295.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-24.png","element":"img","alt":" ∥O∥∗⋄r ≥ ∥O∥∞","inline":true,"padRight":true},{"text":"for every Hermitian ","element":"span"},{"text":"operator ","element":"span"},{"style":{"height":12.8},"width":75.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-25.png","element":"img","alt":" O ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":26.85},"width":97.63,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-26.png","element":"img","alt":"H(r)A,B","inline":true},{"text":"). For a multi-time test ","element":"span"},{"style":{"height":20.58},"width":173.1,"height":51.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/20-27.png","element":"img","alt":" {E(i)}mi=1","inline":true,"padRight":true},{"text":"with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"time steps, it holds that every test","element":"span"}],[{"id":"id-163","style":{"width":"99%"},"width":1872,"height":149,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-0.png","element":"img"}],[{"text":"for all Hermitian ","element":"span"},{"style":{"height":26.85},"width":321,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-1.png","element":"img","alt":" O, N ∈ L(H(r)A,B).","inline":true}],[{"id":"id-100","style":{"fontWeight":"bold"},"text":"2.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Basics of online learning","element":"span"}],[{"text":"We view online 
learning a hypothesis class ","element":"span"},{"style":{"height":17.54},"width":147.31,"height":43.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-2.png","element":"img","alt":" F ⊆ YX","inline":true,"padRight":true},{"text":"of functions as an interactive game between a learner and an adversary. In round ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t ","element":"span"},{"text":"of the interaction, the adversary challenges the learner with an input ","element":"span"},{"style":{"height":14.62},"width":121.49,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-3.png","element":"img","alt":" xt ∈ X","inline":true},{"text":". Then, the learner predicts the output ","element":"span"},{"style":{"height":17.2},"width":190.38,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-4.png","element":"img","alt":" ft(xt) ∈ Y","inline":true,"padRight":true},{"text":"based on their current hypothesis ","element":"span"},{"style":{"height":15.6},"width":47.58,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-5.png","element":"img","alt":" ft.","inline":true,"padRight":true},{"text":"(We focus on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"proper ","element":"span"},{"text":"online learning, where ","element":"span"},{"style":{"height":15.6},"width":282.26,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-6.png","element":"img","alt":" ft ∈ F for all t","inline":true},{"text":".) 
To conclude the round, the adversary provides the learner with feedback in the form of a loss ","element":"span"},{"style":{"height":17.2},"width":138.3,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-7.png","element":"img","alt":" ℓt(ft(xt","inline":true},{"text":")), and the learner uses this piece of information to update","element":"span"},{"href":"#id-124","style":{"height":8.4},"width":17,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-8.png","element":"img","alt":"3","inline":true,"padRight":true},{"text":"their hypothesis to ","element":"span"},{"style":{"height":19.53},"width":107.73,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-9.png","element":"img","alt":" f(t+1)","inline":true},{"text":". For our purposes, the target space is ","element":"span"},{"text":"Y ","element":"span"},{"text":"= [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] and every ","element":"span"},{"style":{"height":8},"width":30.18,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-10.png","element":"img","alt":" ℓt","inline":true,"padRight":true},{"text":": [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] ","element":"span"},{"style":{"height":12},"width":88.92,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-11.png","element":"img","alt":" → R","inline":true,"padRight":true},{"text":"is convex and Lipschitz. 
Specific losses of interest are often of the form ","element":"span"},{"style":{"height":17.6},"width":290.59,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-12.png","element":"img","alt":"ℓt(y) = ℓ(y − bt","inline":true},{"text":") for some convex and Lipschitz ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-13.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":": [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] ","element":"span"},{"style":{"height":8.8},"width":44,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-14.png","element":"img","alt":" →","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"height":11.2},"width":63.4,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-15.png","element":"img","alt":", ∞","inline":true},{"text":") and ","element":"span"},{"style":{"height":14.62},"width":74.07,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-16.png","element":"img","alt":" bt ∈","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]; for example, this gives rise to the ","element":"span"},{"style":{"height":17.02},"width":47.7,"height":42.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-17.png","element":"img","alt":" Lp","inline":true},{"text":"-losses with ","element":"span"},{"style":{"height":17.6},"width":188.94,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-18.png","element":"img","alt":" ℓ(·) = | · |p","inline":true},{"text":". 
In these cases, we assume that the adversary provides the loss ","element":"span"},{"style":{"height":17.2},"width":455.86,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-19.png","element":"img","alt":" ℓt(ft(xt)) = ℓ(ft(xt) − bt","inline":true},{"text":") by explicitly revealing the value of ","element":"span"},{"style":{"height":15.2},"width":246.22,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-20.png","element":"img","alt":" bt, and that ℓ","inline":true,"padRight":true},{"text":"is known in advance. Note that in general, the ","element":"span"},{"style":{"height":14.62},"width":30.73,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-21.png","element":"img","alt":" bt","inline":true,"padRight":true},{"text":"here can be arbitrary. If there is an “approximately true” underlying concept ","element":"span"},{"style":{"height":17.6},"width":670.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-22.png","element":"img","alt":" f∗ ∈ F such that |bt − f∗(xt)| ≤ ε/","inline":true},{"text":"3 holds for all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":", then we speak of a realizable scenario; otherwise, we call the setting non-realizable.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Regret-bounded online learning. ","element":"span"},{"text":"One way of phrasing desiderata in online learning is in terms of bounds on the difference between the incurred loss and the optimal loss in hindsight, the so-called regret. 
Namely, after ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds of interaction, we say that the learner has incurred a regret of","element":"span"},{"style":{"height":8.4},"width":17,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-23.png","element":"img","alt":"4","inline":true}],[{"id":"id-125","style":{"width":"69%"},"width":1306,"height":113,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-24.png","element":"img"}],[{"text":"The goal of an online learner now is to achieve a small regret, in particular scaling sub-linearly with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":". Note that this model makes sense even if no restrictions on how the adversary chooses the ","element":"span"},{"style":{"height":17.2},"width":113.28,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-25.png","element":"img","alt":" ℓt (or,","inline":true,"padRight":true},{"text":"in the more concrete case of ","element":"span"},{"style":{"height":17.02},"width":47.7,"height":42.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-26.png","element":"img","alt":" Lp","inline":true},{"text":"-losses, the ","element":"span"},{"style":{"height":14.62},"width":30.72,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-27.png","element":"img","alt":" bt","inline":true,"padRight":true},{"text":"above) are imposed.","element":"span"}],[{"text":"We make use of the following lemma throughout this work.","element":"span"}],[{"id":"id-152","style":{"fontWeight":"bold"},"text":"Lemma 9 ","element":"span"},{"text":"(Regret bound)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Consider the online learning scenario described above, in which the functions ","element":"span"},{"style":{"height":8},"width":30.18,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-28.png","element":"img","alt":" ℓt","inline":true,"padRight":true},{"text":"are convex and differentiable. Then, for every ","element":"span"},{"style":{"height":15.6},"width":116.42,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-29.png","element":"img","alt":" f ∈ F","inline":true},{"text":", it holds that","element":"span"}],[{"id":"id-124","style":{"width":"99%"},"width":1873,"height":154,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/21-30.png","element":"img"}],[{"text":"The regret in (","element":"span"},{"href":"#id-125","text":"2.33","element":"a"},{"text":") is therefore bounded from above as","element":"span"}],[{"id":"id-126","style":{"width":"75%"},"width":1411,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-0.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Since ","element":"span"},{"style":{"height":8},"width":30.18,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-1.png","element":"img","alt":" ℓt","inline":true,"padRight":true},{"text":"is convex and differentiable, it readily follows that","element":"span"}],[{"style":{"width":"64%"},"width":1209,"height":48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-2.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":199.53,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-3.png","element":"img","alt":" y1, y2 ∈ [0,","inline":true,"padRight":true},{"text":"1]. (See, e.g., Ref. [","element":"span"},{"href":"#id-28","referenceIndex":46,"text":"46","element":"a"},{"text":", Chapter 2].) 
Consequently, for every ","element":"span"},{"style":{"height":15.6},"width":127.09,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-4.png","element":"img","alt":" f ∈ F,","inline":true}],[{"style":{"width":"74%"},"width":1404,"height":48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-5.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":304.47,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-6.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":". Therefore,","element":"span"}],[{"style":{"width":"84%"},"width":1592,"height":411,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-7.png","element":"img"}],[{"text":"The first inequality is (","element":"span"},{"href":"#id-124","text":"2.34","element":"a"},{"text":"). The inequality in (","element":"span"},{"href":"#id-126","text":"2.35","element":"a"},{"text":") follows because the function ","element":"span"},{"style":{"height":15.6},"width":133.87,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-8.png","element":"img","alt":" f ∈ F","inline":true,"padRight":true},{"text":"is arbitrary. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-9.png","element":"img","alt":"■","inline":true}],[{"text":"Next, we recall results from Refs. [","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":", ","element":"span"},{"href":"#id-79","referenceIndex":83,"text":"83","element":"a"},{"text":"], which demonstrate how to obtain regret bounds in terms of sequential complexity measures of the hypothesis class. 
To formulate these results, we introduce the following pieces of notation: We use ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z ","element":"span"},{"style":{"height":19.57},"width":175.09,"height":48.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-10.png","element":"img","alt":" = (zt)Tt=1","inline":true,"padRight":true},{"text":"with labeling functions ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z","element":"span"},{"style":{"height":19.14},"width":307.8,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-11.png","element":"img","alt":"t : {±1}t−1 → Z","inline":true,"padRight":true},{"text":"to ","element":"span"},{"text":"describe a complete rooted binary tree of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"with nodes labeled by elements of ","element":"span"},{"text":"Z","element":"span"},{"text":". That is, if we arrive at a node by following a path ","element":"span"},{"style":{"height":20.5},"width":444.29,"height":51.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-12.png","element":"img","alt":" π = (πs)t−1s=1 ∈ {±1}t−1","inline":true,"padRight":true},{"text":"of length ","element":"span"},{"style":{"height":11.2},"width":60.67,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-13.png","element":"img","alt":" t −","inline":true,"padRight":true},{"text":"1 from the root, ","element":"span"},{"text":"then ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z ","element":"span"},{"text":"assigns the label ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z","element":"span"},{"style":{"height":17.6},"width":56.35,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-14.png","element":"img","alt":"t(π","inline":true},{"text":") to that node. 
For notational convenience, if ","element":"span"},{"style":{"height":19.53},"width":203.36,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-15.png","element":"img","alt":" π ∈ {±1}T","inline":true,"padRight":true},{"text":"is a path of length ","element":"span"},{"style":{"height":12.8},"width":149.14,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-16.png","element":"img","alt":" T > t −","inline":true,"padRight":true},{"text":"1, we identify ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z","element":"span"},{"style":{"height":17.2},"width":405.58,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-17.png","element":"img","alt":"t(π) = zt(π1, . . . , πt−1","inline":true},{"text":"). We can now introduce a notion of sequential covering to capture the effective size of a function class.","element":"span"}],[{"id":"id-127","style":{"fontWeight":"bold"},"text":"Definition 10 ","element":"span"},{"text":"(Sequential covering numbers [","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":"])","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.54},"width":148.65,"height":43.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-18.png","element":"img","alt":" G ⊆ RZ","inline":true},{"text":", let ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z ","element":"span"},{"style":{"height":19.57},"width":178.6,"height":48.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-19.png","element":"img","alt":" = (zt)Tt=1","inline":true,"padRight":true},{"text":"be a complete ","element":"span"},{"text":"rooted binary tree of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":", and let ","element":"span"},{"style":{"height":10.4},"width":66.47,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-20.png","element":"img","alt":" ε >","inline":true,"padRight":true},{"text":"0, ","element":"span"},{"style":{"height":14.8},"width":68.08,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-21.png","element":"img","alt":" p ≥","inline":true,"padRight":true},{"text":"1. 
We call a set ","element":"span"},{"style":{"fontStyle":"italic"},"text":"V ","element":"span"},{"text":"of ","element":"span"},{"text":"R","element":"span"},{"text":"-valued complete binary trees of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"sequential ","element":"span"},{"style":{"height":16},"width":484.27,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-22.png","element":"img","alt":" p-norm ε-cover of G on z","inline":true,"padRight":true},{"text":"if the following holds:","element":"span"}],[{"style":{"width":"83%"},"width":1565,"height":137,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-23.png","element":"img"}],[{"text":"The ","element":"span"},{"style":{"fontStyle":"italic"},"text":"sequential ","element":"span"},{"style":{"height":11.2},"width":175.1,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-24.png","element":"img","alt":" p-norm ε","inline":true},{"style":{"fontStyle":"italic"},"text":"-covering number of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"style":{"fontStyle":"italic"},"text":"on ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z ","element":"span"},{"text":"is defined to be","element":"span"}],[{"style":{"width":"86%"},"width":1614,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-25.png","element":"img"}],[{"text":"We write ","element":"span"},{"style":{"height":17.6},"width":538.61,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/22-26.png","element":"img","alt":" NT (G, ε, p) = supz Nz(G, ε, p","inline":true},{"text":"), where the supremum is over trees of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":".","element":"span"}],[{"text":"Note that Definition 
","element":"span"},{"href":"#id-127","text":"10 ","element":"a"},{"text":"does not require a cover to consist of ","element":"span"},{"text":"R","element":"span"},{"text":"-valued trees that can be realized within ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":". In that sense, the above notion is one of exterior covering. Also, it will be useful to observe that sequential covering numbers satisfy the monotonicity relation ","element":"span"},{"style":{"height":17.2},"width":452.82,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-0.png","element":"img","alt":" Nz(G, ε, p) ≤ Nz(G, ε, q)","inline":true,"padRight":true},{"text":"for 1 ","element":"span"},{"style":{"height":14.8},"width":261.06,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-1.png","element":"img","alt":" ≤ p ≤ q ≤ ∞.","inline":true}],[{"text":"Similarly to how empirical metric entropies, defined as the logarithm of covering numbers, control the generalization error in probably approximately correct learning, sequential metric entropies, the logarithm of sequential covering numbers, can be used to bound the regret in online learning. This result goes back to Refs. [","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":", ","element":"span"},{"href":"#id-79","referenceIndex":83,"text":"83","element":"a"},{"text":"], we state it in a form similar to Ref. 
[","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":", Theorem 9].","element":"span"}],[{"id":"id-80","style":{"fontWeight":"bold"},"text":"Theorem 11 ","element":"span"},{"text":"(Regret bound from sequential covering [","element":"span"},{"href":"#id-79","referenceIndex":83,"text":"83","element":"a"},{"text":", Theorem 3] and [","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":", Theorem 7])","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":19.53},"width":204.16,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-2.png","element":"img","alt":"F ⊆ [0, 1]X","inline":true},{"text":". For every ","element":"span"},{"style":{"height":18.22},"width":491.2,"height":45.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-3.png","element":"img","alt":" t ∈ N≥1, let ℓt : [0, 1] → R","inline":true,"padRight":true},{"text":"be convex and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. Then, there exists an online learning strategy that, when presented sequentially with ","element":"span"},{"style":{"height":15.6},"width":274.66,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-4.png","element":"img","alt":" x1, . . . , xT ∈ X","inline":true,"padRight":true},{"text":"and associated loss functions ","element":"span"},{"style":{"height":9.3},"width":176.26,"height":23.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-5.png","element":"img","alt":" ℓ1, . . . , ℓT","inline":true,"padRight":true},{"text":", outputs a sequence ","element":"span"},{"style":{"height":15.6},"width":275.4,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-6.png","element":"img","alt":" f1, . . . 
, fT ∈ F","inline":true,"padRight":true},{"text":"of hypotheses whose regret is bounded as","element":"span"}],[{"style":{"width":"95%"},"width":1784,"height":245,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-7.png","element":"img"}],[{"text":"Here, the sup","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-8.png","element":"img","alt":"x","inline":true,"padRight":true},{"text":"is a supremum over all complete binary trees of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"with nodes labeled by elements of ","element":"span"},{"text":"X","element":"span"},{"text":".","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Mistake-bounded online learning. ","element":"span"},{"text":"As an alternative to measuring the performance of an online learner in terms of regret, we can count the number of rounds in which the learner incurs a loss that exceeds a certain threshold. More formally, given ","element":"span"},{"style":{"height":10.4},"width":64.38,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-9.png","element":"img","alt":" ε ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1), we say that the learner makes an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-10.png","element":"img","alt":" ε","inline":true},{"text":"-mistake in round ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t ","element":"span"},{"text":"if ","element":"span"},{"style":{"height":17.6},"width":260.74,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-11.png","element":"img","alt":" ℓt(ft(xt)) > ε","inline":true},{"text":". 
The goal of the online learner then becomes to make only a small number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-12.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes. Note that this mistake-bounded model of online learning only makes sense in the realizable scenario, since the number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-13.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes is in general infinite in the non-realizable scenario.","element":"span"}],[{"text":"To conclude our introductory discussion of online learning, we note a well-known connection between the models of regret- and mistake-bounded online learning. Informally, we can say that good regret bounds lead to good mistake bounds, and the next result makes this formal.","element":"span"}],[{"id":"id-83","style":{"fontWeight":"bold"},"text":"Lemma 12 ","element":"span"},{"text":"(From regret to mistake bounds (compare, e.g., Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":", Section 3.3] or [","element":"span"},{"href":"#id-128","referenceIndex":107,"text":"107","element":"a"},{"text":", Corollary 2.1.4]))","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.53},"width":148.52,"height":43.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-14.png","element":"img","alt":" F ⊆ RX","inline":true},{"text":". 
Suppose we have an online learner for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"that is sequentially presented with ","element":"span"},{"style":{"height":15.6},"width":275.19,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-15.png","element":"img","alt":"x1, . . . , xT ∈ X","inline":true,"padRight":true},{"text":"and losses evaluated according to ","element":"span"},{"style":{"height":17.6},"width":296.09,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-16.png","element":"img","alt":" ℓt(·) = |(·) − bt|","inline":true},{"text":", where there exists ","element":"span"},{"style":{"height":15.6},"width":131.29,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-17.png","element":"img","alt":" f∗ ∈ F","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":17.6},"width":323.83,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-18.png","element":"img","alt":" |bt − f∗(xt)| ≤ ε/","inline":true},{"text":"3 holds for all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":". For an update rule that results in a sequence ","element":"span"},{"style":{"height":15.6},"width":275.41,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-19.png","element":"img","alt":" f1, . . . 
, fT ∈ F","inline":true,"padRight":true},{"text":"of outputs from the learner, suppose that the regret is bounded as ","element":"span"},{"style":{"height":17.6},"width":251.07,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-20.png","element":"img","alt":" RT ≤ h1(ε, T","inline":true},{"text":") + ","element":"span"},{"style":{"height":17.6},"width":82.38,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-21.png","element":"img","alt":" h2(ε","inline":true},{"text":"), where ","element":"span"},{"style":{"height":14.62},"width":42.14,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-22.png","element":"img","alt":"h1","inline":true,"padRight":true},{"text":": (0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1) ","element":"span"},{"style":{"height":17.02},"width":263.18,"height":42.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-23.png","element":"img","alt":" × N≥1 → R≥0","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":14.62},"width":42.14,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-24.png","element":"img","alt":" h2","inline":true,"padRight":true},{"text":": (0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1) ","element":"span"},{"style":{"height":17.02},"width":130.62,"height":42.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-25.png","element":"img","alt":" → R≥0","inline":true},{"text":". 
Assume that ","element":"span"},{"style":{"height":17.6},"width":271.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-26.png","element":"img","alt":" h1(ε, T) ∈ o(T","inline":true},{"text":") for every fixed ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-27.png","element":"img","alt":" ε","inline":true},{"text":", and that ","element":"span"},{"style":{"height":17.6},"width":676.12,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-28.png","element":"img","alt":" h2(ε) < 2ε/3. Let T ∗ = T ∗(h1, h2, ε","inline":true},{"text":") be the smallest natural number such that","element":"span"}],[{"style":{"width":"69%"},"width":1294,"height":105,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-29.png","element":"img"}],[{"text":"Then, applying the update rule only in rounds in which the learner makes an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-30.png","element":"img","alt":" ε","inline":true},{"text":"-mistake, and outputting the previous hypothesis otherwise, leads to a total number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-31.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes bounded by ","element":"span"},{"style":{"height":15.53},"width":62.49,"height":38.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/23-32.png","element":"img","alt":" T ∗,","inline":true,"padRight":true},{"text":"independently of the overall number of rounds.","element":"span"}],[{"text":"Notice that the online learning procedure achieving the mistake bound 
","element":"span"},{"style":{"height":12.33},"width":48.56,"height":30.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-0.png","element":"img","alt":" T ∗","inline":true,"padRight":true},{"text":"in Lemma ","element":"span"},{"href":"#id-83","text":"12 ","element":"a"},{"text":"is ","element":"span"},{"style":{"fontStyle":"italic"},"text":"mistake-driven","element":"span"},{"text":": It updates the hypothesis only when an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-1.png","element":"img","alt":" ε","inline":true},{"text":"-mistake is made. After a round without an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-2.png","element":"img","alt":" ε","inline":true},{"text":"-mistake, the learner just proceeds without changing their hypothesis.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":12},"width":40.56,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-3.png","element":"img","alt":" T ′ ","inline":true,"padRight":true},{"text":"denote the number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-4.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes (i.e., the number of updates) and let us focus on the subsequence of rounds in which the learner makes a mistake. 
First note that, by assumption, there is a function ","element":"span"},{"style":{"height":15.6},"width":131.43,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-5.png","element":"img","alt":" f∗ ∈ F","inline":true,"padRight":true},{"text":"such that the prediction ","element":"span"},{"style":{"height":17.6},"width":89.84,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-6.png","element":"img","alt":" ft(xt","inline":true},{"text":") achieves loss ","element":"span"},{"style":{"height":17.6},"width":268.17,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-7.png","element":"img","alt":" ℓ(f∗(xt)) ≤ ε/","inline":true},{"text":"3 for all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":". Hence, the optimal accumulated loss after ","element":"span"},{"style":{"height":12},"width":40.56,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-8.png","element":"img","alt":" T ′","inline":true,"padRight":true},{"text":"rounds is ","element":"span"},{"style":{"height":17.6},"width":131.19,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-9.png","element":"img","alt":" ≤ T ′ε/","inline":true},{"text":"3. As the described procedure applies the update rule only when a loss ","element":"span"},{"style":{"height":10.4},"width":67.08,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-10.png","element":"img","alt":" > ε","inline":true,"padRight":true},{"text":"is incurred, this mistake-driven online learning procedure incurs an accumulated loss of ","element":"span"},{"style":{"height":12.8},"width":109.8,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-11.png","element":"img","alt":" > T ′ε","inline":true},{"text":". 
Thus, its regret is ","element":"span"},{"style":{"height":17.6},"width":153.2,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-12.png","element":"img","alt":" > 2T ′ε/","inline":true},{"text":"3. Comparing this with the assumed regret upper bound of ","element":"span"},{"style":{"height":17.6},"width":481.14,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-13.png","element":"img","alt":" RT ′ ≤ h1(ε, T ′) + h2(ε)T ′ ","inline":true,"padRight":true},{"text":"and rearranging, we see that ","element":"span"},{"style":{"height":27.47},"width":507.96,"height":68.67,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-14.png","element":"img","alt":" T ′ ≤ h1(ε, T ′) / ((2ε/3) − h2(ε)). Thus, by","inline":true,"padRight":true},{"text":"the definition of ","element":"span"},{"style":{"height":12.33},"width":48.56,"height":30.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-15.png","element":"img","alt":" T ∗","inline":true},{"text":", we conclude ","element":"span"},{"style":{"height":14.73},"width":229.17,"height":36.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-16.png","element":"img","alt":" T ′ ≤ T ∗. ■","inline":true}],[{"id":"id-101","style":{"fontWeight":"bold"},"text":"2.3 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"The multiplicative weights framework and corresponding guarantees","element":"span"}],[{"text":"In this section, we provide a brief overview of the multiplicative weights framework and refer to Ref. [","element":"span"},{"href":"#id-84","referenceIndex":84,"text":"84","element":"a"},{"text":"] for a more comprehensive review. 
This toolkit has a variety of applications in areas such as game theory and economics [","element":"span"},{"href":"#id-129","referenceIndex":108,"text":"108","element":"a"},{"text":"], machine learning [","element":"span"},{"href":"#id-20","referenceIndex":37,"text":"37","element":"a"},{"text":", ","element":"span"},{"href":"#id-130","referenceIndex":109,"text":"109","element":"a"},{"text":"], and semidefinite programming [","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":", ","element":"span"},{"href":"#id-132","referenceIndex":111,"text":"111","element":"a"},{"text":"]. In quantum computing, the matrix multiplicative weights algorithm was used to prove that the complexity classes QIP (problems with a quantum interactive proof system) and PSPACE (problems solvable in a polynomial amount of space) coincide [","element":"span"},{"href":"#id-133","referenceIndex":112,"text":"112","element":"a"},{"text":"].","element":"span"}],[{"text":"Consider an interactive game in which a learner is tasked with picking the best option from a set of ","element":"span"},{"style":{"height":12.8},"width":109.62,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-17.png","element":"img","alt":" d ∈ N","inline":true,"padRight":true},{"text":"decisions over ","element":"span"},{"style":{"height":12.8},"width":118.48,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-18.png","element":"img","alt":" T ∈ N","inline":true,"padRight":true},{"text":"rounds of interaction. 
In each round 1 ","element":"span"},{"style":{"height":14.4},"width":153.38,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-19.png","element":"img","alt":" ≤ t ≤ T","inline":true},{"text":", every decision ","element":"span"},{"style":{"height":17.6},"width":294.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-20.png","element":"img","alt":"i ∈ {1, 2, . . . , d}","inline":true,"padRight":true},{"text":"is associated with a cost ","element":"span"},{"style":{"height":23.77},"width":212.21,"height":59.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-21.png","element":"img","alt":" m(t)i ∈ [−1,","inline":true,"padRight":true},{"text":"1]. Upon making their decision, the learner is informed of the associated cost, which they can use to make their decisions in subsequent rounds. The learner’s goal is to output a sequence of decisions over the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds such that their total accumulated cost after ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds is minimized.","element":"span"}],[{"text":"In the multiplicative weights framework, the decision in round 1 ","element":"span"},{"style":{"height":14.4},"width":151,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-22.png","element":"img","alt":" ≤ t ≤ T","inline":true,"padRight":true},{"text":"is made according to a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d","element":"span"},{"text":"-dimensional probability vector ","element":"span"},{"style":{"height":24.19},"width":449.72,"height":60.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-23.png","element":"img","alt":" p(t) = (p(t)1 , p(t)2 , . . . , p(t)d","inline":true,"padRight":true},{"text":"). 
Explicitly, ","element":"span"},{"style":{"height":23.77},"width":60.36,"height":59.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-24.png","element":"img","alt":" p(t)i","inline":true,"padRight":true},{"text":"is the probability of making the decision ","element":"span"},{"style":{"height":17.6},"width":295.37,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-25.png","element":"img","alt":" i ∈ {1, 2, . . . , d}","inline":true},{"text":". The expected cost of the distribution ","element":"span"},{"style":{"height":19.93},"width":64.63,"height":49.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-26.png","element":"img","alt":" p(t)","inline":true,"padRight":true},{"text":"is then ","element":"span"},{"style":{"height":19.93},"width":181.74,"height":49.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-27.png","element":"img","alt":" m(t) · p(t)","inline":true},{"text":", where ","element":"span"},{"style":{"height":24.19},"width":412.77,"height":60.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-28.png","element":"img","alt":" m(t) = (m(t)1 , . . . , m(t)d ","inline":true,"padRight":true},{"text":") is the vector of costs. The expected accumulated cost after ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds ","element":"span"},{"text":"is thus given by ","element":"span"},{"style":{"height":20.49},"width":396.09,"height":51.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-29.png","element":"img","alt":"∑Tt=1 m(t) · p(t). 
The","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"multiplicative weights update ","element":"span"},{"text":"(MWU) algorithm, presented in ","element":"span"},{"text":"Algorithm ","element":"span"},{"href":"#id-134","text":"1","element":"a"},{"text":", is a method for obtaining a sequence of probability distributions over decisions based on the costs incurred in the previous rounds.","element":"span"}],[{"text":"Arora, Hazan and Kale [","element":"span"},{"href":"#id-84","referenceIndex":84,"text":"84","element":"a"},{"text":"] have shown that if a learner makes their decisions according to the MWU algorithm, then their expected accumulated cost only grows logarithmically in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d","element":"span"},{"text":", the number of possible decisions.","element":"span"}],[{"id":"id-154","style":{"fontWeight":"bold"},"text":"Theorem 13 ","element":"span"},{"text":"([","element":"span"},{"href":"#id-84","referenceIndex":84,"text":"84","element":"a"},{"text":", Theorem 2.1 & Corollary 2.2])","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Consider the setting of an interactive game, as described above, with cost vectors ","element":"span"},{"style":{"height":16.33},"width":83.46,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-30.png","element":"img","alt":" m(t)","inline":true,"padRight":true},{"text":"satisfying ","element":"span"},{"style":{"height":23.77},"width":212.61,"height":59.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-31.png","element":"img","alt":" m(t)i ∈ [−1,","inline":true,"padRight":true},{"text":"1] for all ","element":"span"},{"style":{"height":17.6},"width":295.26,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-32.png","element":"img","alt":" i ∈ {1, 2, . . . 
, d}","inline":true,"padRight":true},{"text":"and for all ","element":"span"},{"style":{"height":17.6},"width":502.53,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-33.png","element":"img","alt":"t ∈ {1, 2, . . . , T}, and let q","inline":true,"padRight":true},{"text":"be an arbitrary probability distribution over the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d ","element":"span"},{"text":"decisions. Using the MWU algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"1","element":"a"},{"text":"), the expected accumulated cost over ","element":"span"},{"style":{"height":12.8},"width":116.97,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/24-34.png","element":"img","alt":" T ∈ N","inline":true,"padRight":true},{"text":"rounds is bounded from above as","element":"span"}],[{"style":{"width":"87%"},"width":1641,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-1.png","element":"img"}],[{"id":"id-134","style":{"width":"99%"},"width":1873,"height":982,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-0.png","element":"img"}],[{"text":"The ","element":"span"},{"style":{"fontStyle":"italic"},"text":"matrix multiplicative weights ","element":"span"},{"text":"(MMW) algorithm [","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":", ","element":"span"},{"href":"#id-135","referenceIndex":113,"text":"113","element":"a"},{"text":"] is a generalization of the MWU algorithm to costs specified by ","element":"span"},{"style":{"height":12},"width":94.18,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-2.png","element":"img","alt":" d × d","inline":true,"padRight":true},{"text":"Hermitian matrices 
","element":"span"},{"style":{"height":15.94},"width":68.1,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-3.png","element":"img","alt":" L(t) ","inline":true,"padRight":true},{"text":"that satisfy ","element":"span"},{"style":{"height":20.3},"width":310.6,"height":50.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-4.png","element":"img","alt":" −1d ≤ L(t) ≤ 1d","inline":true},{"text":"; equivalently, ","element":"span"},{"style":{"height":20.33},"width":195.89,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-5.png","element":"img","alt":"∥L(t)∥∞ ≤","inline":true,"padRight":true},{"text":"1. Here, the decisions are described by density operators ","element":"span"},{"style":{"height":15.93},"width":67.14,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-6.png","element":"img","alt":" ω(t)","inline":true},{"text":", and the expected cost in round ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t ","element":"span"},{"text":"is equal to Tr[","element":"span"},{"style":{"height":15.93},"width":137.4,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-7.png","element":"img","alt":"L(t)ω(t)","inline":true},{"text":"]. The MMW algorithm, presented in Algorithm ","element":"span"},{"href":"#id-134","text":"2","element":"a"},{"text":", is a method for obtaining a sequence of density operators ","element":"span"},{"style":{"height":19.14},"width":343.7,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-8.png","element":"img","alt":" ω(1), ω(2), . . . , ω(T)","inline":true},{"text":", based on the costs incurred in the previous rounds.","element":"span"}],[{"text":"A bound on the expected accumulated cost for the MMW algorithm has been shown in Ref. 
[","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":", Theorem 3.1]. By modifying the arguments in Ref. [","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":"] via use of the relative entropy, we provide a bound on the expected accumulated cost for the MMW algorithm that can in general be tighter than the one obtained in Ref. [","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":", Theorem 3.1].","element":"span"}],[{"id":"id-139","style":{"fontWeight":"bold"},"text":"Proposition 14 ","element":"span"},{"text":"(Bound on the expected accumulated cost for the MMW algorithm)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-9.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"be an arbitrary density operator. Let ","element":"span"},{"style":{"height":12.8},"width":116.9,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-10.png","element":"img","alt":" T ∈ N","inline":true,"padRight":true},{"text":"be the number of rounds of interaction, and consider a sequence ","element":"span"},{"style":{"height":19.13},"width":346.6,"height":47.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-11.png","element":"img","alt":" L(1), L(2), . . . , L(T)","inline":true,"padRight":true},{"text":"of cost matrices in dimension ","element":"span"},{"style":{"height":17.6},"width":265.74,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-12.png","element":"img","alt":" d ∈ {2, 3, . . . 
}","inline":true,"padRight":true},{"text":"along with the updates ","element":"span"},{"style":{"height":15.93},"width":67.13,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-13.png","element":"img","alt":"ω(t)","inline":true,"padRight":true},{"text":"provided by the MMW algorithm in Algorithm ","element":"span"},{"href":"#id-134","text":"2","element":"a"},{"text":". Then, the expected accumulated cost over the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds is bounded from above as","element":"span"}],[{"id":"id-136","style":{"width":"86%"},"width":1622,"height":123,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-14.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.6},"width":371.39,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-15.png","element":"img","alt":" H(ρ) := −Tr[ρ log ρ","inline":true},{"text":"] is the von Neumann entropy of ","element":"span"},{"style":{"height":11.2},"width":34.56,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/25-16.png","element":"img","alt":" ρ.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"See Appendix ","element":"span"},{"text":"D","element":"span"},{"text":", where we also describe how the bound in Ref. [","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":", Theorem 3.1] arises as a special case of our bound in Equation (","element":"span"},{"href":"#id-136","text":"2.44","element":"a"},{"text":"). ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-0.png","element":"img","alt":"■","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Remark 15 ","element":"span"},{"text":"(The Hedge algorithm)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"If the cost matrices ","element":"span"},{"style":{"height":15.93},"width":68.1,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-1.png","element":"img","alt":" L(t)","inline":true,"padRight":true},{"text":"in the MMW algorithm are all diagonal in the same basis, then Algorithm ","element":"span"},{"href":"#id-134","text":"2 ","element":"a"},{"text":"reduces to the so-called ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Hedge ","element":"span"},{"text":"algorithm, introduced by Freund and Schapire [","element":"span"},{"href":"#id-137","referenceIndex":114,"text":"114","element":"a"},{"text":"]. We state this algorithm in Appendix ","element":"span"},{"text":"D","element":"span"},{"text":", and in Corollary ","element":"span"},{"href":"#id-138","text":"61 ","element":"a"},{"text":"we state the Hedge algorithm counterpart to Proposition ","element":"span"},{"href":"#id-139","text":"14 ","element":"a"},{"text":"above. 
","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-2.png","element":"img","alt":" ◀","inline":true}],[{"id":"id-27","style":{"fontWeight":"bold"},"text":"2.4 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Problem statement: Online learning classes of quantum channels","element":"span"}],[{"text":"Our task is to online learn a class ","element":"span"},{"style":{"height":15.02},"width":305.08,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-3.png","element":"img","alt":" C ⊆ CPTPn of n","inline":true},{"text":"-qubit quantum channels, in the sense of predicting the quantities Tr[","element":"span"},{"style":{"height":26.85},"width":330.53,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-4.png","element":"img","alt":"M(t)R,BNA→B(ρ(t)R,A","inline":true},{"text":")], with the state-measurement pairs (","element":"span"},{"style":{"height":26.85},"width":207.88,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-5.png","element":"img","alt":"ρ(t)R,A, M(t)R,B","inline":true},{"text":") provided by ","element":"span"},{"text":"an adversary, where ","element":"span"},{"style":{"fontStyle":"italic"},"text":"R ","element":"span"},{"text":"is an arbitrary finite-dimensional reference system; see Figure ","element":"span"},{"href":"#id-15","text":"1","element":"a"},{"text":". Let us now cast this problem in terms of the general framework of online learning laid out in Section ","element":"span"},{"href":"#id-100","text":"2.2","element":"a"},{"text":".","element":"span"}],[{"text":"We fix finite-dimensional Hilbert spaces ","element":"span"},{"style":{"height":14.7},"width":226.17,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-6.png","element":"img","alt":" HA and HB","inline":true},{"text":". 
Then, the input set/domain ","element":"span"},{"text":"X ","element":"span"},{"text":"comprises state-measurement pairs (","element":"span"},{"style":{"height":18.3},"width":410.12,"height":45.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-7.png","element":"img","alt":"ρR,A, MR,B), where R","inline":true,"padRight":true},{"text":"is an arbitrary finite-dimensional reference system. Equivalently, due to Equation (","element":"span"},{"href":"#id-140","text":"2.19","element":"a"},{"text":"), ","element":"span"},{"text":"X ","element":"span"},{"text":"comprises channel test operators ","element":"span"},{"style":{"height":17.5},"width":93,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-8.png","element":"img","alt":" EA,B","inline":true},{"text":". Precisely,","element":"span"}],[{"style":{"width":"93%"},"width":1753,"height":167,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-9.png","element":"img"}],[{"text":"where in the second equality we have used Equation (","element":"span"},{"href":"#id-121","text":"2.26","element":"a"},{"text":"). 
The output set is ","element":"span"},{"text":"Y ","element":"span"},{"text":"= [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1], and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"is defined by the subclass ","element":"span"},{"style":{"height":15.02},"width":220.56,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-10.png","element":"img","alt":" C ⊆ CPTPn","inline":true,"padRight":true},{"text":"of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels (in which case ","element":"span"},{"style":{"height":19.54},"width":378.48,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-11.png","element":"img","alt":" HA ∼= HB ∼= (C2)⊗n","inline":true},{"text":"), such that for every ","element":"span"},{"style":{"height":15.2},"width":127.73,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-12.png","element":"img","alt":" N ∈ C","inline":true,"padRight":true},{"text":"we define the function ","element":"span"},{"style":{"height":15.9},"width":55.36,"height":39.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-13.png","element":"img","alt":" fN","inline":true,"padRight":true},{"text":"as ","element":"span"},{"style":{"height":18.4},"width":378.09,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-14.png","element":"img","alt":" fN (E) = Tr[EC(N","inline":true},{"text":")] for all ","element":"span"},{"style":{"height":13.2},"width":121.22,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-15.png","element":"img","alt":" E ∈ X","inline":true},{"text":". 
In other words,","element":"span"}],[{"style":{"width":"81%"},"width":1529,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-16.png","element":"img"}],[{"text":"Notably, due to Equation (","element":"span"},{"href":"#id-140","text":"2.19","element":"a"},{"text":"), we can see that our online learning task is equivalently formulated as the task of learning a class ","element":"span"},{"style":{"height":12.4},"width":91.42,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-17.png","element":"img","alt":" C′ of","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"Choi matrices ","element":"span"},{"text":"defined by the channels in ","element":"span"},{"text":"C","element":"span"},{"text":", where","element":"span"}],[{"style":{"width":"71%"},"width":1339,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-18.png","element":"img"}],[{"text":"We therefore view the adversary as providing channel test operators ","element":"span"},{"style":{"height":26.85},"width":92.99,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-19.png","element":"img","alt":" E(t)A,B","inline":true,"padRight":true},{"text":"and having the learner ","element":"span"},{"text":"predict the quantities Tr[","element":"span"},{"style":{"height":26.85},"width":188.08,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-20.png","element":"img","alt":"E(t)A,BCNA,B","inline":true},{"text":"] based on hypotheses for the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Choi matrix ","element":"span"},{"text":"of the unknown ","element":"span"},{"text":"channel. 
We denote the learner’s hypothesis Choi matrices by ","element":"span"},{"style":{"height":26.85},"width":496.57,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-21.png","element":"img","alt":" N(t)A,B for t ∈ {1, 2, . . . , T}.","inline":true}],[{"text":"As outlined in Section ","element":"span"},{"href":"#id-100","text":"2.2","element":"a"},{"text":", we will evaluate the performance of an online learner in terms of either the regret or the number of mistakes. That is, on the one hand, the learner aims to achieve a small regret","element":"span"}],[{"id":"id-142","style":{"width":"78%"},"width":1467,"height":121,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-22.png","element":"img"}],[{"text":"where the losses ","element":"span"},{"style":{"height":8},"width":30.18,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-23.png","element":"img","alt":" ℓt","inline":true,"padRight":true},{"text":"are revealed by the adversary. More precisely, we will aim for regret bounds scaling as ","element":"span"},{"style":{"height":19.21},"width":298.37,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/26-24.png","element":"img","alt":" O(√T · poly(n)","inline":true},{"text":"). 
On the other hand, in the realizable scenario, where there exists a channel ","element":"span"},{"style":{"height":16.7},"width":205.5,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-0.png","element":"img","alt":" NA→B ∈ C","inline":true,"padRight":true},{"text":"(unknown to the learner) such that all losses take the form","element":"span"}],[{"id":"id-141","style":{"width":"59%"},"width":1107,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-1.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-2.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":"is some fixed function and each ","element":"span"},{"style":{"height":14.62},"width":30.73,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-3.png","element":"img","alt":" bt","inline":true,"padRight":true},{"text":"satisfies ","element":"span"},{"style":{"height":26.85},"width":472.04,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-4.png","element":"img","alt":" |bt − Tr[E(t)A,BCNA,B]| ≤ ε/","inline":true},{"text":"3, the learner aims to ","element":"span"},{"text":"achieve a small number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-5.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes. Here, our goal will be to guarantee mistake bounds scaling as ","element":"span"},{"style":{"height":19.13},"width":261.5,"height":47.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-6.png","element":"img","alt":"O(poly(n, ε−1","inline":true},{"text":")). 
We can summarize the formulation of our channel online learning task as follows:","element":"span"}],[{"id":"id-143","style":{"fontWeight":"bold"},"text":"Problem 1 ","element":"span"},{"text":"(Online learning classes of quantum channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Consider a subset ","element":"span"},{"style":{"height":15.02},"width":235.09,"height":37.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-7.png","element":"img","alt":" C ⊆ CPTPn","inline":true,"padRight":true},{"text":"of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit quantum channels, and let ","element":"span"},{"style":{"height":16.7},"width":205.53,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-8.png","element":"img","alt":" NA→B ∈ C","inline":true},{"text":", with Choi representation ","element":"span"},{"style":{"height":24.06},"width":185.86,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-9.png","element":"img","alt":" CNA,B ∈ C′","inline":true},{"text":", be unknown. 
","element":"span"},{"text":"Given a sequence of ","element":"span"},{"style":{"height":12.8},"width":136.78,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-10.png","element":"img","alt":" T ∈ N","inline":true,"padRight":true},{"text":"interactive rounds, in which two-outcome channel test operators ","element":"span"},{"style":{"height":26.85},"width":494.29,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-11.png","element":"img","alt":"E(1)A,B, E(2)A,B, · · · , E(T)A,B ∈ X","inline":true,"padRight":true},{"text":"are presented sequentially by an adversary, the problem is to output a ","element":"span"},{"text":"sequence of Choi matrices ","element":"span"},{"style":{"height":26.85},"width":502.88,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-12.png","element":"img","alt":" N(1)A,B, N(2)A,B, . . . , N(T)A,B ∈ C′","inline":true},{"text":", such that for losses ","element":"span"},{"style":{"height":8},"width":30.18,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-13.png","element":"img","alt":" ℓt","inline":true,"padRight":true},{"text":"as defined in (","element":"span"},{"href":"#id-141","text":"2.49","element":"a"},{"text":"), the ","element":"span"},{"text":"regret in (","element":"span"},{"href":"#id-142","text":"2.48","element":"a"},{"text":") scales as ","element":"span"},{"style":{"height":19.21},"width":298.1,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-14.png","element":"img","alt":" O(√T · poly(n)","inline":true},{"text":") and the number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-15.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes scales as 
","element":"span"},{"style":{"height":19.13},"width":354.38,"height":47.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-16.png","element":"img","alt":" O(poly(n, ε−1)). ◀","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Remark 16. ","element":"span"},{"text":"In this work, we primarily consider the case that the input system dimension ","element":"span"},{"style":{"height":14.7},"width":47.71,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-17.png","element":"img","alt":" dA","inline":true,"padRight":true},{"text":"and the output system dimension ","element":"span"},{"style":{"height":14.7},"width":48.71,"height":36.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-18.png","element":"img","alt":" dB","inline":true,"padRight":true},{"text":"are equal and satisfy ","element":"span"},{"style":{"height":14.7},"width":341.98,"height":36.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-19.png","element":"img","alt":" dA = dB = d = 2n","inline":true},{"text":", although Problem ","element":"span"},{"href":"#id-143","text":"1 ","element":"a"},{"text":"applies also to quantum channels with different input and output system dimensions. In particular, if ","element":"span"},{"style":{"height":14.7},"width":489.22,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-20.png","element":"img","alt":" dA = 1 and dB = d = 2n","inline":true},{"text":", then every channel is a state-preparation channel for some ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit quantum state, and Problem ","element":"span"},{"href":"#id-143","text":"1 ","element":"a"},{"text":"reduces to online learning of quantum states, as considered in Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"]. 
","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-21.png","element":"img","alt":" ◀","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Remark 17. ","element":"span"},{"text":"Successfully solving Problem ","element":"span"},{"href":"#id-143","text":"1 ","element":"a"},{"text":"does not imply learning of the unknown channel with respect to the diamond norm, i.e., the learner’s hypotheses ","element":"span"},{"style":{"height":26.85},"width":95.84,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-22.png","element":"img","alt":" N(t)A,B","inline":true,"padRight":true},{"text":"could be very far from the true ","element":"span"},{"text":"Choi matrix ","element":"span"},{"style":{"height":24.07},"width":91.97,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-23.png","element":"img","alt":" CNA,B ","inline":true,"padRight":true},{"text":"with respect to the strategy 1-norm in (","element":"span"},{"href":"#id-123","text":"2.25","element":"a"},{"text":"). In our scenario, the learner’s only ","element":"span"},{"text":"goal is to ensure that their hypotheses are such that Tr[","element":"span"},{"style":{"height":26.85},"width":191.95,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-24.png","element":"img","alt":"E(t)A,BN(t)A,B","inline":true},{"text":"] well approximates Tr[","element":"span"},{"style":{"height":26.85},"width":203.19,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-25.png","element":"img","alt":"E(t)A,BCNA,B]","inline":true,"padRight":true},{"text":"in most rounds, and this can be achieved by hypotheses that are not necessarily close to the true Choi matrix with respect to the strategy 1-norm. 
","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-26.png","element":"img","alt":" ◀","inline":true}],[{"id":"id-63","style":{"fontWeight":"bold"},"text":"2.5 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Obstacles to online learning via the Choi state","element":"span"}],[{"text":"As pointed out above, the problem of online learning classes of quantum channels (Problem ","element":"span"},{"href":"#id-143","text":"1","element":"a"},{"text":") is equivalent to the problem of online learning classes of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Choi matrices","element":"span"},{"text":". A natural first strategy for solving this problem might then be to simply online learn the Choi ","element":"span"},{"style":{"fontStyle":"italic"},"text":"state ","element":"span"},{"text":"of the unknown channel using the protocols presented in Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] for online learning of quantum states, such as the MMW algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"2","element":"a"},{"text":"). 
However, we immediately encounter two issues.","element":"span"}],[{"text":"First, the MMW algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"2","element":"a"},{"text":") cannot be applied out of the box: while Choi states Φ","element":"span"},{"style":{"height":24.06},"width":60.78,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-27.png","element":"img","alt":"NA,B","inline":true,"padRight":true},{"text":"of quantum channels ","element":"span"},{"style":{"height":16.7},"width":121.05,"height":41.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-28.png","element":"img","alt":" NA→B","inline":true,"padRight":true},{"text":"have unit trace, they also have to satisfy Tr","element":"span"},{"style":{"height":25.18},"width":335.71,"height":62.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-29.png","element":"img","alt":"B[ΦNA,B] = 1A/dA","inline":true},{"text":", ","element":"span"},{"text":"which the iterates of Algorithm ","element":"span"},{"href":"#id-134","text":"2 ","element":"a"},{"text":"will generally not guarantee. Furthermore, the proof of Ref. [","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":", Theorem 3.1], as well as the proof of Proposition ","element":"span"},{"href":"#id-139","text":"14 ","element":"a"},{"text":"above, relies crucially on the fact that the iterates of the MMW algorithm have unit trace. 
Any modification of the update rule would therefore have to ensure the unit-trace and partial-trace conditions simultaneously.","element":"span"}],[{"text":"We can modify the MMW algorithm by adding a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"projection step","element":"span"},{"text":": for every iterate ","element":"span"},{"style":{"height":26.85},"width":229.08,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-30.png","element":"img","alt":" W (t)A,B of the","inline":true,"padRight":true},{"text":"MMW algorithm, we find the closest Choi state ","element":"span"},{"style":{"height":26.85},"width":83.34,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/27-31.png","element":"img","alt":" ρ(t)A,B ","inline":true,"padRight":true},{"text":"with respect to relative entropy and use that as our hypothesis for the unknown Choi state. In other words, we let","element":"span"}],[{"style":{"width":"81%"},"width":1523,"height":102,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-0.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":26.85},"width":455.3,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-1.png","element":"img","alt":" ω(t)A,B = W (t)A,B/Tr[W (t)A,B","inline":true},{"text":"] and the relative entropy ","element":"span"},{"style":{"height":17.6},"width":145.54,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-2.png","element":"img","alt":" D(P∥Q","inline":true},{"text":") of two positive semi-definite ","element":"span"},{"text":"operators ","element":"span"},{"style":{"fontStyle":"italic"},"text":"P ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Q ","element":"span"},{"text":"is defined as 
[","element":"span"},{"href":"#id-110","referenceIndex":101,"text":"101","element":"a"},{"text":"]","element":"span"}],[{"id":"id-230","style":{"width":"81%"},"width":1522,"height":121,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-3.png","element":"img"}],[{"text":"We present the modified MMW algorithm in Algorithm ","element":"span"},{"href":"#id-144","text":"3","element":"a"},{"text":". Then, because the relative entropy is a Bregman divergence [","element":"span"},{"href":"#id-145","referenceIndex":115,"text":"115","element":"a"},{"text":"], we can regard Algorithm ","element":"span"},{"href":"#id-144","text":"3 ","element":"a"},{"text":"as a particular instance of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"online mirror descent ","element":"span"},{"text":"algorithm [","element":"span"},{"href":"#id-28","referenceIndex":46,"text":"46","element":"a"},{"text":", Section 5.3]. Consequently, we obtain a regret bound for Choi states that is similar to the regret bound in Proposition ","element":"span"},{"href":"#id-139","text":"14","element":"a"},{"text":", thus similar to the MMW regret bound in Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] for online learning of quantum states. 
We provide the details in Appendix ","element":"span"},{"href":"#id-109","text":"D.2","element":"a"},{"text":".","element":"span"}],[{"id":"id-144","style":{"width":"99%"},"width":1873,"height":524,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-4.png","element":"img"}],[{"text":"The second issue is that the above strategy of learning the Choi state of the unknown channel leads to favorable regret and mistake bounds, but only as long as we modify Problem ","element":"span"},{"href":"#id-143","text":"1 ","element":"a"},{"text":"as follows: instead of predicting expectation values of the form Tr[","element":"span"},{"style":{"height":24.07},"width":442.56,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-5.png","element":"img","alt":"EA,BCNA,B], where CNA,B ","inline":true,"padRight":true},{"text":"is the Choi matrix of ","element":"span"},{"text":"the unknown channel ","element":"span"},{"style":{"height":16.7},"width":121.04,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-6.png","element":"img","alt":" NA→B","inline":true},{"text":", we aim to predict values of the form Tr[","element":"span"},{"style":{"height":24.06},"width":562.71,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-7.png","element":"img","alt":"EA,BΦNA,B], where ΦNA,B is the","inline":true,"padRight":true},{"text":"Choi state of ","element":"span"},{"style":{"height":16.7},"width":121.05,"height":41.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-8.png","element":"img","alt":" NA→B","inline":true},{"text":". 
Now, because of Equation (","element":"span"},{"href":"#id-116","text":"1.6","element":"a"},{"text":"), we see that Tr[","element":"span"},{"style":{"height":24.07},"width":582.48,"height":60.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-9.png","element":"img","alt":"EA,BCNA,B] = dATr[EA,BΦNA,B].","inline":true,"padRight":true},{"text":"Thus, even though we show that the ","element":"span"},{"style":{"height":19.94},"width":176.19,"height":49.85,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-10.png","element":"img","alt":" O(L√Tn","inline":true},{"text":") regret bound of [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] carries over to (properly) online learning the Choi state, this implies only a regret bound of ","element":"span"},{"style":{"height":19.94},"width":504.98,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-11.png","element":"img","alt":" O(dL√Tn) = O(2nL√Tn","inline":true},{"text":") when it comes to our actual task of learning the Choi matrix (and thus the channel), which has a favorable square root scaling in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"but an unfavorable exponential scaling in the number of qubits","element":"span"},{"style":{"height":14.73},"width":30.93,"height":36.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/28-12.png","element":"img","alt":"5.","inline":true}]]},{"heading":"3 Online learning upper bounds","paragraphs":[[{"id":"id-102","style":{"fontWeight":"bold"},"text":"3.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Regret bound for channels of bounded gate complexity","element":"span"}],[{"text":"In this section, we show online learnability for quantum channels of bounded gate 
complexity","element":"span"},{"style":{"height":14.73},"width":109.96,"height":36.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-0.png","element":"img","alt":"6. We","inline":true,"padRight":true},{"text":"say that a quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"has (exact) gate complexity (at most) ","element":"span"},{"style":{"height":12.8},"width":125.76,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-1.png","element":"img","alt":" G ∈ N","inline":true,"padRight":true},{"text":"[","element":"span"},{"href":"#id-146","referenceIndex":116,"text":"116","element":"a"},{"text":", ","element":"span"},{"href":"#id-147","referenceIndex":117,"text":"117","element":"a"},{"text":"] if there exists a (not necessarily geometrically) two-local","element":"span"},{"style":{"height":8.8},"width":17,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-2.png","element":"img","alt":"7","inline":true,"padRight":true},{"text":"quantum circuit with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"two-qubit channels as gates that implements ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N","element":"span"},{"text":". Such quantum channels include, for example, noisy quantum circuits modeled in terms of perfect two-qubit unitary gates followed by single-qubit noise channels. The set of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels of gate complexity (at most) ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"is denoted by ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-3.png","element":"img","alt":" CPTPn,G","inline":true},{"text":". 
We abuse notation and also use ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-4.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"to denote the class of [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]-valued functions on channel test operators that arises from ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-gate channels as described in Section ","element":"span"},{"href":"#id-27","text":"2.4","element":"a"},{"text":". We first bound the sequential covering numbers for (the [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]-valued function classes associated to) such channels. Theorem ","element":"span"},{"href":"#id-80","text":"11 ","element":"a"},{"text":"then gives us a regret bound, from which we can derive a mistake bound via Lemma ","element":"span"},{"href":"#id-83","text":"12","element":"a"},{"text":".","element":"span"}],[{"text":"The (interior) covering number of the set ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-5.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"of quantum channels with gate complexity at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"is defined to be","element":"span"}],[{"style":{"width":"98%"},"width":1838,"height":48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-6.png","element":"img"}],[{"text":"The fact that quantum channels of bounded gate complexity also have bounded complexity in the sense of covering numbers has previously been observed in Refs. 
[","element":"span"},{"href":"#id-13","referenceIndex":25,"text":"25","element":"a"},{"text":", ","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":", ","element":"span"},{"href":"#id-148","referenceIndex":67,"text":"67","element":"a"},{"text":", ","element":"span"},{"href":"#id-81","referenceIndex":68,"text":"68","element":"a"},{"text":"]. We recall this insight and its proof in the following lemma.","element":"span"}],[{"id":"id-149","style":{"fontWeight":"bold"},"text":"Lemma 18 ","element":"span"},{"text":"(Covering number bounds from gate complexity (see Refs. [","element":"span"},{"href":"#id-81","referenceIndex":68,"text":"68","element":"a"},{"text":", Theorem C.2] and [","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":", Theorem 8]))","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":10.4},"width":61.47,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-7.png","element":"img","alt":" ε ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1). 
The (interior) ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-8.png","element":"img","alt":" ε","inline":true},{"text":"-covering number of ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-9.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"with respect to ","element":"span"},{"style":{"height":17.6},"width":64.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-10.png","element":"img","alt":" ∥·∥⋄","inline":true,"padRight":true},{"text":"is bounded as","element":"span"}],[{"style":{"width":"71%"},"width":1332,"height":116,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-11.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":18.89},"width":124.46,"height":47.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-12.png","element":"img","alt":" ε′ = ε/G","inline":true},{"text":". From Ref. 
[","element":"span"},{"href":"#id-81","referenceIndex":68,"text":"68","element":"a"},{"text":", Lemma C.2], we have that the covering number for the set of ","element":"span"},{"text":"two-qubit channels is","element":"span"}],[{"style":{"width":"71%"},"width":1344,"height":100,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-13.png","element":"img"}],[{"text":"If we consider the set of all possible channels that act on two out of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits, which we denote by ","element":"span"},{"style":{"height":23.56},"width":160.06,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-14.png","element":"img","alt":"CPTP(n)2 ","inline":true,"padRight":true},{"text":", then the covering number of this set is given by","element":"span"}],[{"style":{"width":"69%"},"width":1302,"height":121,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-15.png","element":"img"}],[{"text":"with the binomial factor ","element":"span"},{"style":{"height":20.11},"width":56.82,"height":50.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/29-16.png","element":"img","alt":"(n choose 2)","inline":true,"padRight":true},{"text":"coming from the fact that we allow the two-qubit channels to act on any pair of qubits.","element":"span"}],[{"text":"Let us now consider an arbitrary ","element":"span"},{"style":{"height":19.5},"width":275.39,"height":48.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-0.png","element":"img","alt":" N ∈ CPTPn,G","inline":true},{"text":". 
By definition, every such channel has the form ","element":"span"},{"style":{"height":16.7},"width":517.25,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-1.png","element":"img","alt":" N = NG ◦ NG−1 ◦ · · · ◦ N1","inline":true},{"text":", where ","element":"span"},{"style":{"height":23.56},"width":270.52,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-2.png","element":"img","alt":" Ni ∈ CPTP(n)2","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":315.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-3.png","element":"img","alt":" i ∈ {1, 2, . . . , G}","inline":true},{"text":". By definition of ","element":"span"},{"style":{"height":23.56},"width":354.42,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-4.png","element":"img","alt":"N(CPTP(n)2 , ε′, ∥·∥⋄","inline":true},{"text":"), for every channel ","element":"span"},{"style":{"height":16.62},"width":47.8,"height":41.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-5.png","element":"img","alt":" Ni","inline":true},{"text":", we can find a corresponding ","element":"span"},{"style":{"height":16.62},"width":47.8,"height":41.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-6.png","element":"img","alt":" Ñi","inline":true,"padRight":true},{"text":"in an ","element":"span"},{"style":{"height":8},"width":29.35,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-7.png","element":"img","alt":" ε′","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":23.56},"width":160.06,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-8.png","element":"img","alt":" CPTP(n)2","inline":true,"padRight":true},{"text":"such that 
","element":"span"},{"style":{"height":18.4},"width":302.09,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-9.png","element":"img","alt":" ∥Ni − Ñi∥⋄ ≤ ε′","inline":true},{"text":". Then, the channel ","element":"span"},{"style":{"height":19.5},"width":734.46,"height":48.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-10.png","element":"img","alt":"Ñ := ÑG ◦ ÑG−1 ◦ · · · ◦ Ñ1 ∈ CPTPn,G","inline":true,"padRight":true},{"text":"satisfies","element":"span"}],[{"style":{"width":"69%"},"width":1304,"height":113,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-11.png","element":"img"}],[{"text":"where we made use of the subadditivity-under-composition property of the diamond norm; see, e.g., Ref. [","element":"span"},{"href":"#id-110","referenceIndex":101,"text":"101","element":"a"},{"text":", Proposition 3.48]. Therefore, if we let ","element":"span"},{"style":{"height":23.56},"width":443.74,"height":58.9,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-12.png","element":"img","alt":"Ñε′ ⊆ CPTP(n)2 be an ε′","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":23.56},"width":174.23,"height":58.9,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-13.png","element":"img","alt":" CPTP(n)2 ,","inline":true,"padRight":true},{"text":"then the set","element":"span"}],[{"style":{"width":"70%"},"width":1317,"height":63,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-14.png","element":"img"}],[{"text":"is an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-15.png","element":"img","alt":" ε","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-16.png","element":"img","alt":" 
CPTPn,G","inline":true},{"text":". Noting that ","element":"span"},{"style":{"height":19.53},"width":254.74,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-17.png","element":"img","alt":" |Nε| = |Ñε′|^G","inline":true},{"text":", we obtain ","element":"span"},{"style":{"height":18.7},"width":432.05,"height":46.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-18.png","element":"img","alt":" N(CPTPn,G, ε, ∥·∥⋄) ≤","inline":true},{"style":{"height":19.53},"width":242.76,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-19.png","element":"img","alt":"|Nε| ≤ |Ñε′|^G","inline":true},{"text":", for every ","element":"span"},{"style":{"height":8},"width":29.35,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-20.png","element":"img","alt":" ε′","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":23.56},"width":160.06,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-21.png","element":"img","alt":" CPTP(n)2","inline":true,"padRight":true},{"text":". 
As the covering number ","element":"span"},{"style":{"height":23.56},"width":354.43,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-22.png","element":"img","alt":" N(CPTP(n)2 , ε′, ∥·∥⋄","inline":true},{"text":") ","element":"span"},{"text":"is, by definition, the size of the smallest ","element":"span"},{"style":{"height":8},"width":29.35,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-23.png","element":"img","alt":" ε′","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":23.56},"width":160.06,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-24.png","element":"img","alt":" CPTP(n)2 ","inline":true,"padRight":true},{"text":", we can conclude that","element":"span"}],[{"style":{"width":"83%"},"width":1562,"height":130,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-25.png","element":"img"}],[{"text":"as required. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-26.png","element":"img","alt":"■","inline":true}],[{"text":"We now observe that Lemma ","element":"span"},{"href":"#id-149","text":"18 ","element":"a"},{"text":"immediately implies similar sequential covering number bounds. In fact, this can be seen by a reasoning analogous to how Ref. [","element":"span"},{"href":"#id-81","referenceIndex":68,"text":"68","element":"a"},{"text":"] went from covering w.r.t. a norm on the level of the channel to empirical covering.","element":"span"}],[{"id":"id-82","style":{"fontWeight":"bold"},"text":"Corollary 19 ","element":"span"},{"text":"(Sequential covering number bounds from gate complexity)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.02},"width":159.76,"height":42.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-27.png","element":"img","alt":" T ∈ N≥1","inline":true,"padRight":true},{"text":"and let ","element":"span"},{"style":{"height":17.6},"width":442.22,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-28.png","element":"img","alt":"ε ∈ (0, 1), p ≥ 1. Then,","inline":true}],[{"style":{"width":"70%"},"width":1317,"height":112,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-29.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Because of Lemma ","element":"span"},{"href":"#id-149","text":"18 ","element":"a"},{"text":"and the monotonicity of sequential covering numbers w.r.t. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"p","element":"span"},{"text":", it suffices to show that ","element":"span"},{"style":{"height":18.7},"width":773.71,"height":46.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-30.png","element":"img","alt":" Nz(CPTPn,G, ε, ∞) ≤ N(CPTPn,G, ε, ∥·∥⋄","inline":true},{"text":") holds for any complete rooted binary tree ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z ","element":"span"},{"text":"of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":". 
This follows immediately from the fact that, if ","element":"span"},{"style":{"height":15.2},"width":187.2,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-31.png","element":"img","alt":" N and Ñ","inline":true,"padRight":true},{"text":"are two quantum channels, then","element":"span"}],[{"style":{"width":"85%"},"width":1608,"height":217,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-32.png","element":"img"}],[{"text":"holds for any bipartite effect operator ","element":"span"},{"style":{"height":17.5},"width":103.43,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-33.png","element":"img","alt":" MR,B","inline":true,"padRight":true},{"text":"and for any bipartite state ","element":"span"},{"style":{"height":13.1},"width":162.8,"height":32.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/30-34.png","element":"img","alt":" ρR,A. ■","inline":true}],[{"text":"We can now plug this sequential covering number bound into Theorem ","element":"span"},{"href":"#id-80","text":"11","element":"a"},{"text":". This leads to the ","element":"span"},{"id":"id-150","text":"following regret bound for online learning channels of bounded gate complexity:","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Theorem 20 ","element":"span"},{"text":"(Regret bound for online learning channels of bounded complexity)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.2},"width":240.32,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-0.png","element":"img","alt":" ℓ : [0, 1] → R","inline":true,"padRight":true},{"text":"be convex and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. 
There exists an online learning strategy that, when presented sequentially with channel test operators ","element":"span"},{"style":{"height":26.85},"width":426.89,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-1.png","element":"img","alt":" E(t)A,B, t ∈ {1, 2, . . . , T}","inline":true},{"text":", and associated loss functions ","element":"span"},{"style":{"height":17.6},"width":332.23,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-2.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt),","inline":true,"padRight":true},{"text":"outputs a sequence of hypothesis Choi matrices ","element":"span"},{"style":{"height":26.85},"width":322.59,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-3.png","element":"img","alt":" N(t)A,B ∈ CPTP′n,G ","inline":true,"padRight":true},{"text":"whose regret is bounded as","element":"span"}],[{"style":{"width":"93%"},"width":1748,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-4.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"Combining Theorem ","element":"span"},{"href":"#id-80","text":"11 ","element":"a"},{"text":"and Corollary ","element":"span"},{"href":"#id-82","text":"19","element":"a"},{"text":", we obtain the regret bound for channels of bounded gate complexity as","element":"span"}],[{"style":{"width":"82%"},"width":1539,"height":814,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-5.png","element":"img"}],[{"text":"where we have used the inequalities ","element":"span"},{"style":{"height":20.04},"width":610.96,"height":50.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-6.png","element":"img","alt":"√(a + b) ≤ √a + √b ≤ √(2(a + b))","inline":true,"padRight":true},{"text":"for ","element":"span"},{"style":{"height":15.2},"width":108.1,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-7.png","element":"img","alt":" a, b ≥","inline":true,"padRight":true},{"text":"0 and the integral identity ","element":"span"},{"style":{"height":19.59},"width":1070.75,"height":48.97,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-8.png","element":"img","alt":"∫ √(log(1/x)) dx = x√(log(1/x)) − (√π/2) · erf(√(log(1/x))","inline":true},{"text":"), with the error function given as erf(","element":"span"},{"style":{"height":25.81},"width":522.18,"height":64.53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-9.png","element":"img","alt":"x) = (2/√π) ∫_0^x exp(−t^2) dt. 
■","inline":true}],[{"text":"Finally, to conclude our discussion of online learning channels with gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":", we can combine Theorem ","element":"span"},{"href":"#id-150","text":"20 ","element":"a"},{"text":"with Lemma ","element":"span"},{"href":"#id-83","text":"12 ","element":"a"},{"text":"to obtain the following mistake bound.","element":"span"}],[{"id":"id-151","style":{"fontWeight":"bold"},"text":"Corollary 21 ","element":"span"},{"text":"(Mistake bound for online learning channels of bounded complexity)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.2},"width":182.23,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-10.png","element":"img","alt":" ε ∈ (0, 1).","inline":true,"padRight":true},{"text":"There exists an online learning strategy that, in a realizable setting for ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-11.png","element":"img","alt":" CPTPn,G","inline":true},{"text":", when presented sequentially with channel test operators ","element":"span"},{"style":{"height":26.85},"width":450.93,"height":67.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-12.png","element":"img","alt":" E(t)A,B, t ∈ {1, 2, . . . 
, T}","inline":true},{"text":", and associated loss functions ","element":"span"},{"style":{"height":17.2},"width":300.83,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-13.png","element":"img","alt":"ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), for some ","element":"span"},{"style":{"height":15.2},"width":241.01,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-14.png","element":"img","alt":" L-Lipschitz ℓ","inline":true},{"text":", outputs a sequence ","element":"span"},{"style":{"height":26.85},"width":322.59,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-15.png","element":"img","alt":" N(t)A,B ∈ CPTP′n,G","inline":true},{"text":" of hypothesis Choi ","element":"span"},{"text":"matrices that makes at most ","element":"span"},{"style":{"height":29.33},"width":632.2,"height":73.34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-16.png","element":"img","alt":" O(L^2 G log(Gn)/ε^2) many ε-mistakes.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Thanks to Theorem ","element":"span"},{"href":"#id-150","text":"20","element":"a"},{"text":", we can apply Lemma ","element":"span"},{"href":"#id-83","text":"12 ","element":"a"},{"text":"with ","element":"span"},{"style":{"height":19.21},"width":544.22,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-17.png","element":"img","alt":" h1(ε, T) = CL√(TG log(Gn))","inline":true,"padRight":true},{"text":"for a suitable constant ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C > ","element":"span"},{"text":"0 and with ","element":"span"},{"style":{"height":17.6},"width":578.98,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-18.png","element":"img","alt":" h2(ε) = 0. 
With these choices,","inline":true}],[{"style":{"width":"83%"},"width":1558,"height":129,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/31-19.png","element":"img"}],[{"text":"So, the mistake bound obtained from Lemma ","element":"span"},{"href":"#id-83","text":"12 ","element":"a"},{"text":"in this case is exactly as claimed in the statement of the corollary. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-0.png","element":"img","alt":"■","inline":true}],[{"text":"Together, Theorem ","element":"span"},{"href":"#id-150","text":"20 ","element":"a"},{"text":"and Corollary ","element":"span"},{"href":"#id-151","text":"21 ","element":"a"},{"text":"establish Theorem ","element":"span"},{"href":"#id-33","text":"1","element":"a"},{"text":". In particular, for the physically relevant class of channels implementable with polynomial-size circuits, the online learning task can be solved with only polynomially many mistakes.","element":"span"}],[{"id":"id-103","style":{"fontWeight":"bold"},"text":"3.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Regret bound for mixtures of known channels","element":"span"}],[{"text":"In the previous section, we showed that online learning of quantum channels of bounded gate complexity is possible with favorable regret and mistake bounds. Here we show that, even if the channel involves an unbounded number of gates (exponentially many in the number of qubits) acting on the input state, regret- and mistake-bounded online learning is still possible if we know the gates but not the probabilities with which they act. Even more generally, we show that any channel that is a mixture of known channels is efficiently online learnable, even if the mixture is over exponentially many known channels, which could be arbitrary quantum channels. 
Notable examples of such channels are mixed-unitary channels (with known unitaries), such as Pauli channels. Since the Pauli channel framework is well known and better understood, we first prove regret upper bounds in this setting for clearer exposition. We then generalize the result to mixtures of general known channels.","element":"span"}],[{"id":"id-155","style":{"fontWeight":"bold"},"text":"Theorem 22 ","element":"span"},{"text":"(Regret bound for online learning Pauli channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-1.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":": [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] ","element":"span"},{"style":{"height":12},"width":87.75,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-2.png","element":"img","alt":" → R","inline":true,"padRight":true},{"text":"be convex and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. There exists an online learning strategy that, when presented sequentially with channel test operators ","element":"span"},{"style":{"height":26.85},"width":425.33,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-3.png","element":"img","alt":" E(t)A,B, t ∈ {1, 2, . . . 
, T}","inline":true},{"text":", and associated losses ","element":"span"},{"style":{"height":17.2},"width":295.57,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-4.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), outputs Pauli channel ","element":"span"},{"text":"Choi matrix hypotheses ","element":"span"},{"style":{"height":26.85},"width":291.16,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-5.png","element":"img","alt":" N(t)A,B ∈ PAULI′n ","inline":true,"padRight":true},{"text":"whose regret is bounded as","element":"span"}],[{"style":{"width":"79%"},"width":1488,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-6.png","element":"img"}],[{"text":"for every Pauli channel ","element":"span"},{"style":{"height":15.1},"width":324.47,"height":37.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-7.png","element":"img","alt":" PA→B ∈ PAULIn.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"The result follows by applying Lemma ","element":"span"},{"href":"#id-152","text":"9 ","element":"a"},{"text":"and applying the multiplicative weights update (MWU) algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"1","element":"a"},{"text":") with a particular choice of the loss vectors ","element":"span"},{"style":{"height":16.33},"width":83.46,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-8.png","element":"img","alt":" m(t)","inline":true},{"text":". 
Specifically, we let","element":"span"}],[{"style":{"height":23.92},"width":1514.33,"height":59.8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-9.png","element":"img","alt":"m^(t) := (m^(t)_{z,x})_{z,x∈{0,1}^n}, (3.14)","inline":true},{"style":{"height":35.38},"width":1518.48,"height":88.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-10.png","element":"img","alt":"m^(t)_{z,x} := (1/L) ℓ′_t(Tr[E^(t)_{A,B} N^(t)_{A,B}]) Tr[E^(t)_{A,B} Γ^{z,x}_{A,B}], ∀ z, x ∈ {0, 1}^n, (3.15)","inline":true}],[{"text":"where","element":"span"}],[{"style":{"width":"63%"},"width":1196,"height":98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-11.png","element":"img"}],[{"text":"with the probability vectors ","element":"span"},{"style":{"height":25.55},"width":409.48,"height":63.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-12.png","element":"img","alt":" p^(t) := (p^(t)_{z,x})_{z,x∈{0,1}^n}","inline":true,"padRight":true},{"text":"defined according to the MWU algorithm. Let ","element":"span"},{"text":"us verify that ","element":"span"},{"style":{"height":23.8},"width":223.21,"height":59.51,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-13.png","element":"img","alt":" m^(t)_{z,x} ∈ [−1,","inline":true,"padRight":true},{"text":"1] for all ","element":"span"},{"style":{"height":17.6},"width":255.22,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/32-14.png","element":"img","alt":" z, x ∈ {0, 1}^n","inline":true},{"text":", as required by the MWU algorithm. 
Indeed, we ","element":"span"},{"text":"readily have that ","element":"span"},{"style":{"height":26.85},"width":491.26,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-0.png","element":"img","alt":"(1/L) ℓ′_t(Tr[E^(t)_{A,B} N^(t)_{A,B}]) ∈ [−1,","inline":true,"padRight":true},{"text":"1] because ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-1.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":"is assumed to be ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. In addition, ","element":"span"},{"text":"we have that","element":"span"}],[{"style":{"width":"87%"},"width":1646,"height":68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-2.png","element":"img"}],[{"text":"where we have used the fact that, because ","element":"span"},{"style":{"height":26.85},"width":92.99,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-3.png","element":"img","alt":" E^(t)_{A,B}","inline":true,"padRight":true},{"text":"is a channel test operator, there exists a density ","element":"span"},{"text":"operator ","element":"span"},{"style":{"height":26.85},"width":572.04,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-4.png","element":"img","alt":" σ_A such that E^(t)_{A,B} ≤ σ_A ⊗ 1_B","inline":true},{"text":". 
Therefore, ","element":"span"},{"style":{"height":23.8},"width":223.2,"height":59.51,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-5.png","element":"img","alt":" m(t)z,x ∈ [−1,","inline":true,"padRight":true},{"text":"1] for all ","element":"span"},{"style":{"height":17.6},"width":268.75,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-6.png","element":"img","alt":" z, x ∈ {0, 1}n.","inline":true}],[{"text":"Now, for every Pauli channel ","element":"span"},{"style":{"height":14.7},"width":115.6,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-7.png","element":"img","alt":" PA→B","inline":true,"padRight":true},{"text":"with associated error-rate vector ","element":"span"},{"style":{"height":19.95},"width":359.1,"height":49.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-8.png","element":"img","alt":" q = (qz,x)z,x∈{0,1}n","inline":true},{"text":", we can use Equation (","element":"span"},{"href":"#id-153","text":"2.12","element":"a"},{"text":") to see that","element":"span"}],[{"style":{"width":"94%"},"width":1765,"height":311,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-9.png","element":"img"}],[{"text":"and similarly, we find that","element":"span"}],[{"style":{"width":"73%"},"width":1377,"height":89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-10.png","element":"img"}],[{"text":"Therefore, it follows from the known regret bound in Theorem ","element":"span"},{"href":"#id-154","text":"13 ","element":"a"},{"text":"on the MWU algorithm that","element":"span"}],[{"style":{"width":"88%"},"width":1656,"height":385,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-11.png","element":"img"}],[{"text":"Finally, combining this inequality with the result of Lemma ","element":"span"},{"href":"#id-152","text":"9","element":"a"},{"text":", we 
obtain","element":"span"}],[{"style":{"width":"81%"},"width":1529,"height":385,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-12.png","element":"img"}],[{"text":"for every Pauli channel ","element":"span"},{"style":{"height":15.1},"width":310.92,"height":37.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-13.png","element":"img","alt":" PA→B ∈ PAULIn","inline":true},{"text":". The claimed bound then follows by setting ","element":"span"},{"style":{"height":19.21},"width":277.68,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-14.png","element":"img","alt":" η = √(n/T). ■","inline":true}],[{"text":"While Theorem ","element":"span"},{"href":"#id-155","text":"22 ","element":"a"},{"text":"focused solely on Pauli channels, as we show below, it readily translates to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"any ","element":"span"},{"text":"convex combination of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"known ","element":"span"},{"text":"channels, even exponentially many known channels that may have arbitrarily large gate complexity. This generalization captures a wide class of channels of interest, such as mixed unitary channels, in which each known channel is a unitary. Pauli channels are themselves a special case of such channels, because they are convex combinations of the 4","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/33-15.png","element":"img","alt":"n","inline":true,"padRight":true},{"id":"id-166","text":"unitary channels defined by the Pauli strings.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Corollary 23 ","element":"span"},{"text":"(Regret bound for convex combinations of known channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":15.42},"width":165.96,"height":38.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-0.png","element":"img","alt":" K ∈ Z>0","inline":true},{"text":", and let ","element":"span"},{"style":{"height":22.58},"width":155.84,"height":56.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-1.png","element":"img","alt":"{Nj}Kj=1","inline":true,"padRight":true},{"text":"be a set of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"known ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels ","element":"span"},{"style":{"height":19.42},"width":248.84,"height":48.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-2.png","element":"img","alt":" Nj ∈ CPTPn","inline":true},{"text":". Let ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-3.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":": [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] ","element":"span"},{"style":{"height":12},"width":91.68,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-4.png","element":"img","alt":" → R","inline":true,"padRight":true},{"text":"be convex and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. There exists an online learning strategy that, when presented sequentially with channel test operators ","element":"span"},{"style":{"height":26.85},"width":426.75,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-5.png","element":"img","alt":" E(t)A,B, t ∈ {1, 2, . . . 
, T}","inline":true},{"text":", and associated losses ","element":"span"},{"style":{"height":17.2},"width":300.87,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-6.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), outputs Choi matrix ","element":"span"},{"text":"hypotheses ","element":"span"},{"style":{"height":26.85},"width":95.84,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-7.png","element":"img","alt":" N(t)A,B ","inline":true,"padRight":true},{"text":"of the form","element":"span"}],[{"style":{"width":"100%"},"width":1874,"height":388,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-8.png","element":"img"}],[{"text":"for every ","element":"span"},{"style":{"height":22.58},"width":386.56,"height":56.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-9.png","element":"img","alt":" N ∈ conv({Nj}Kj=1).","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"The proof is analogous to that of Theorem ","element":"span"},{"href":"#id-155","text":"22","element":"a"},{"text":". In particular, we combine Lemma ","element":"span"},{"href":"#id-152","text":"9 ","element":"a"},{"text":"with the MWU algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"1","element":"a"},{"text":") applied to a particular choice of loss vectors ","element":"span"},{"style":{"height":16.34},"width":240.86,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-10.png","element":"img","alt":" m(t). 
We let","inline":true}],[{"style":{"height":26.57},"width":1539.1,"height":66.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-11.png","element":"img","alt":"m^(t) := (m^(t)_j)_{j∈{1,2,...,K}}, (3.26)","inline":true},{"style":{"height":35.38},"width":1532.36,"height":88.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-12.png","element":"img","alt":"m^(t)_j := (1/L) ℓ′_t(Tr[E^(t)_{A,B} N^(t)_{A,B}]) Tr[E^(t)_{A,B} C^{N_j}_{A,B}], ∀ j ∈ {1, 2, . . . , K}, (3.27)","inline":true}],[{"text":"where","element":"span"}],[{"style":{"width":"60%"},"width":1137,"height":120,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-13.png","element":"img"}],[{"text":"with the probability vectors ","element":"span"},{"style":{"height":26.57},"width":422.5,"height":66.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-14.png","element":"img","alt":" p^(t) := (p^(t)_j)_{j∈{1,2,...,K}} ","inline":true,"padRight":true},{"text":"defined according to the MWU algorithm. As in ","element":"span"},{"text":"the proof of Theorem ","element":"span"},{"href":"#id-155","text":"22","element":"a"},{"text":", it is straightforward to verify that ","element":"span"},{"style":{"height":26.57},"width":212.18,"height":66.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-15.png","element":"img","alt":" m^(t)_j ∈ [−1,","inline":true,"padRight":true},{"text":"1] for all ","element":"span"},{"style":{"height":17.6},"width":329.45,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-16.png","element":"img","alt":" j ∈ {1, 2, . . . 
, K},","inline":true,"padRight":true},{"text":"as required by the MWU algorithm.","element":"span"}],[{"text":"Now, consider an arbitrary channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"in the convex hull of the channels ","element":"span"},{"style":{"height":19.42},"width":50.8,"height":48.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-17.png","element":"img","alt":" Nj","inline":true},{"text":", specified as ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"= ","element":"span"},{"style":{"height":23.29},"width":197.96,"height":58.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-18.png","element":"img","alt":"∑_{j=1}^{K} q_j N_j","inline":true,"padRight":true},{"text":"for some probability vector ","element":"span"},{"style":{"height":17.2},"width":342.56,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-19.png","element":"img","alt":" q = (q1, q2, . . . , qK","inline":true},{"text":"). 
By linearity of the Choi representation, ","element":"span"},{"text":"and using Theorem ","element":"span"},{"href":"#id-154","text":"13","element":"a"},{"text":", we obtain","element":"span"}],[{"style":{"width":"88%"},"width":1656,"height":385,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-20.png","element":"img"}],[{"text":"Therefore, by Lemma ","element":"span"},{"href":"#id-152","text":"9","element":"a"},{"text":", we obtain","element":"span"}],[{"style":{"width":"81%"},"width":1529,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/34-21.png","element":"img"}],[{"style":{"width":"58%"},"width":1092,"height":240,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-0.png","element":"img"}],[{"text":"Setting ","element":"span"},{"style":{"height":19.21},"width":275.41,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-1.png","element":"img","alt":" η = √(log K/T)","inline":true},{"text":", we obtain the desired regret bound. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-2.png","element":"img","alt":"■","inline":true}],[{"id":"id-171","style":{"fontWeight":"bold"},"text":"Corollary 24 ","element":"span"},{"text":"(Mistake bounds for Pauli channels and convex combinations of known channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":10.4},"width":61.49,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-3.png","element":"img","alt":" ε ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1). 
Let ","element":"span"},{"style":{"height":15.42},"width":165.99,"height":38.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-4.png","element":"img","alt":" K ∈ Z>0","inline":true},{"text":", and let ","element":"span"},{"style":{"height":22.58},"width":155.84,"height":56.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-5.png","element":"img","alt":" {Nj}Kj=1","inline":true,"padRight":true},{"text":"be a set of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"known ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels ","element":"span"},{"style":{"height":19.42},"width":241.05,"height":48.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-6.png","element":"img","alt":" Nj ∈ CPTPn","inline":true},{"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":22.58},"width":445.17,"height":56.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-7.png","element":"img","alt":" NA→B ∈ conv({Nj}Kj=1","inline":true},{"text":") be a fixed channel unknown to the learner. There exists an online ","element":"span"},{"text":"learning strategy that, in a realizable setting for conv(","element":"span"},{"style":{"height":22.58},"width":155.84,"height":56.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-8.png","element":"img","alt":"{Nj}Kj=1","inline":true},{"text":"), when presented sequentially with ","element":"span"},{"text":"two-outcome channel test operators ","element":"span"},{"style":{"height":26.85},"width":426.55,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-9.png","element":"img","alt":" E(t)A,B, t ∈ {1, 2, . . . 
, T}","inline":true},{"text":", and associated losses ","element":"span"},{"style":{"height":17.2},"width":331.35,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-10.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt),","inline":true,"padRight":true},{"text":"for some ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-11.png","element":"img","alt":" ℓ","inline":true},{"text":", outputs a sequence ","element":"span"},{"style":{"height":27.87},"width":454.95,"height":69.67,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-12.png","element":"img","alt":" N^(t)_{A,B} ∈ conv({C^{N_j}_{A,B}}_{j=1}^{K}","inline":true},{"text":") of hypothesis Choi matrices ","element":"span"},{"text":"that makes at most ","element":"span"},{"style":{"height":25.8},"width":568.75,"height":64.49,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-13.png","element":"img","alt":" O(L² log(K)/ε²) many ε-mistakes.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"Mimicking the proof of Corollary ","element":"span"},{"href":"#id-151","text":"21 ","element":"a"},{"text":"by invoking Lemma ","element":"span"},{"href":"#id-83","text":"12","element":"a"},{"text":", but now for the choice ","element":"span"},{"style":{"height":17.2},"width":193.74,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-14.png","element":"img","alt":" h_1(ε, T) =","inline":true},{"style":{"height":17.89},"width":242.79,"height":44.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-15.png","element":"img","alt":"CL√(T log K)","inline":true,"padRight":true},{"text":"for a suitable ","element":"span"},{"style":{"height":28.8},"width":940.75,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-16.png","element":"img","alt":" C > 0 and h_2(ε) = 0, we get that T∗ = (3CL√(log K)/(2ε))²","inline":true}],[{"id":"id-60","style":{"fontWeight":"bold"},"text":"3.3 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Regret bounds for multi-time processes","element":"span"}],[{"text":"In this section, we generalize the developments of Sections ","element":"span"},{"href":"#id-102","text":"3.1 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-103","text":"3.2 ","element":"a"},{"text":"to multi-time processes. We start by presenting the problem of online learning classes of multi-time processes, by casting it in terms of the general framework of online learning laid out in Section ","element":"span"},{"href":"#id-100","text":"2.2","element":"a"},{"text":". With that, we formally state the problem in Problem ","element":"span"},{"href":"#id-156","text":"2 ","element":"a"},{"text":"below.","element":"span"}],[{"text":"Let ","element":"span"},{"style":{"height":12.8},"width":106.23,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-17.png","element":"img","alt":" r ∈ N","inline":true},{"text":". 
We fix finite-dimensional Hilbert spaces ","element":"span"},{"style":{"height":16.47},"width":454.82,"height":41.18,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-18.png","element":"img","alt":" HA1, HB1, . . . , HAr, HBr","inline":true},{"text":", and we let ","element":"span"},{"style":{"height":26.85},"width":146.87,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-19.png","element":"img","alt":" H(r)A,B ≡","inline":true},{"style":{"height":16.47},"width":851.54,"height":41.18,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-20.png","element":"img","alt":"HA1 ⊗ HB1 ⊗ HA2 ⊗ HB2 ⊗ · · · ⊗ HAr ⊗ HBr","inline":true},{"text":". Then, the input set/domain ","element":"span"},{"text":"X ","element":"span"},{"text":"comprises multi-time test operators corresponding to two-outcome multi-time tests","element":"span"}],[{"style":{"width":"70%"},"width":1329,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-21.png","element":"img"}],[{"text":"The output set is ","element":"span"},{"text":"Y ","element":"span"},{"text":"= [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1], and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"is defined via a subclass ","element":"span"},{"style":{"height":15.02},"width":228.6,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-22.png","element":"img","alt":" C ⊆ COMBr","inline":true,"padRight":true},{"text":"of interest, such that for every ","element":"span"},{"style":{"height":13.2},"width":121.15,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-23.png","element":"img","alt":" N ∈ C","inline":true,"padRight":true},{"text":"we define the function 
","element":"span"},{"style":{"height":17.6},"width":868.68,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-24.png","element":"img","alt":" fN : X → Y as fN(E) = Tr[EN] for all E ∈ X","inline":true},{"text":". In other words,","element":"span"}],[{"style":{"width":"79%"},"width":1490,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-25.png","element":"img"}],[{"text":"As in the case of channels, online learning proceeds as an interactive procedure with an adversary, who provides test operators ","element":"span"},{"style":{"height":26.85},"width":270.06,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-26.png","element":"img","alt":" E(t) ∈ L(H(r)A,B","inline":true},{"text":") for two-outcome multi-time tests to the learner, and the ","element":"span"},{"text":"learner outputs hypotheses ","element":"span"},{"style":{"height":16.73},"width":161.72,"height":41.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-27.png","element":"img","alt":" N(t) ∈ C","inline":true},{"text":". The goal of the learner is to achieve a small regret,","element":"span"}],[{"id":"id-157","style":{"width":"75%"},"width":1422,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/35-28.png","element":"img"}],[{"id":"id-160","style":{"width":"84%"},"width":1592,"height":459,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-0.png","element":"img"}],[{"text":"Figure 3: ","element":"figcaption","subtype":"caption"},{"style":{"fontWeight":"bold"},"text":"Multi-time processes with bounded complexity. ","element":"figcaption","subtype":"caption"},{"text":"(a) The basic unit of our multi-time processes with bounded complexity is a process consisting of two two-qubit channels connected by an inaccessible memory system. 
(b) By collapsing the causal structure of the inputs and outputs of the process in (a), we obtain a three-qubit channel belonging to the set ","element":"figcaption","subtype":"caption"},{"style":{"height":15.99},"width":144.85,"height":39.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-1.png","element":"img","alt":"CPTP3,2","inline":true},{"text":". (c) An example of a multi-time process obtained by composing (ten of) the basic elements in (a) in a circuit.","element":"figcaption","subtype":"caption"}],[{"text":"where the losses ","element":"span"},{"style":{"height":8},"width":30.19,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-2.png","element":"img","alt":" ℓt","inline":true,"padRight":true},{"text":"are revealed by the adversary. We aim for regret bounds scaling as ","element":"span"},{"style":{"height":19.21},"width":326.7,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-3.png","element":"img","alt":" O(√T · poly(n)).","inline":true,"padRight":true},{"text":"On the other hand, in the realizable scenario, where there exists a comb operator ","element":"span"},{"style":{"height":13.2},"width":158.14,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-4.png","element":"img","alt":" N∗ ∈ C","inline":true,"padRight":true},{"text":"(unknown to the learner) such that all losses take the form ","element":"span"},{"style":{"height":17.6},"width":329.59,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-5.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), where ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-6.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":"is some fixed function and each 
","element":"span"},{"style":{"height":20.33},"width":619.39,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-7.png","element":"img","alt":" bt satisfies |bt − Tr[E(t)N∗]| ≤ ε/","inline":true},{"text":"3, the learner aims to achieve a small number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-8.png","element":"img","alt":"ε","inline":true},{"text":"-mistakes. Here, our goal is to guarantee mistake bounds scaling as ","element":"span"},{"style":{"height":19.14},"width":260.92,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-9.png","element":"img","alt":" O(poly(n, ε−1","inline":true},{"text":")). We summarize the formulation of our multi-time process online learning problem as follows.","element":"span"}],[{"id":"id-156","style":{"fontWeight":"bold"},"text":"Problem 2 ","element":"span"},{"text":"(Online learning of multi-time quantum processes)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Consider a subset ","element":"span"},{"style":{"height":15.02},"width":230.96,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-10.png","element":"img","alt":" C ⊆ COMBr","inline":true},{"text":", for ","element":"span"},{"style":{"height":12.8},"width":109.43,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-11.png","element":"img","alt":" r ∈ N","inline":true},{"text":", and let ","element":"span"},{"style":{"height":13.2},"width":143.28,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-12.png","element":"img","alt":" N∗ ∈ C","inline":true,"padRight":true},{"text":"be unknown. 
Given a sequence of ","element":"span"},{"style":{"height":12.8},"width":120.09,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-13.png","element":"img","alt":" T ∈ N","inline":true,"padRight":true},{"text":"interactive rounds, in which test operators ","element":"span"},{"style":{"height":19.14},"width":446.19,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-14.png","element":"img","alt":" E(1), E(2), . . . , E(T) ∈ X","inline":true,"padRight":true},{"text":"are presented sequentially by an adversary, the problem is to output a sequence ","element":"span"},{"style":{"height":19.14},"width":463.5,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-15.png","element":"img","alt":" N(1), N(2), . . . , N(T) ∈ C","inline":true,"padRight":true},{"text":"of comb operators such that for losses ","element":"span"},{"style":{"height":8},"width":30.18,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-16.png","element":"img","alt":" ℓt","inline":true,"padRight":true},{"text":"of the form ","element":"span"},{"style":{"height":17.6},"width":304.3,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-17.png","element":"img","alt":"ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), where ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-18.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":"is some fixed function and each ","element":"span"},{"style":{"height":14.62},"width":30.73,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-19.png","element":"img","alt":" bt","inline":true,"padRight":true},{"text":"satisfies 
","element":"span"},{"style":{"height":20.33},"width":415.01,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-20.png","element":"img","alt":" |bt − Tr[E(t)N∗]| ≤ ε/","inline":true},{"text":"3, the regret in (","element":"span"},{"href":"#id-157","text":"3.33","element":"a"},{"text":") scales as ","element":"span"},{"style":{"height":19.21},"width":281.24,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-21.png","element":"img","alt":" O(√T · poly(n","inline":true},{"text":")) and the number of ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-22.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes scales as ","element":"span"},{"style":{"height":19.13},"width":309.46,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-23.png","element":"img","alt":" O(poly(n, ε−1)).","inline":true}],[{"text":"We note that for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"= 1, Problem ","element":"span"},{"href":"#id-156","text":"2 ","element":"a"},{"text":"is equivalent to Problem ","element":"span"},{"href":"#id-143","text":"1","element":"a"},{"text":". 
We also note that the projected MMW algorithm for Choi states of quantum channels (Algorithm ","element":"span"},{"href":"#id-144","text":"3","element":"a"},{"text":") generalizes straightforwardly to Choi states of multi-time processes; see Algorithm ","element":"span"},{"href":"#id-158","text":"4","element":"a"},{"text":".","element":"span"}],[{"id":"id-158","style":{"width":"99%"},"width":1873,"height":472,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/36-24.png","element":"img"}],[{"text":"We provide an analysis of Algorithm ","element":"span"},{"href":"#id-158","text":"4 ","element":"a"},{"text":"in Appendix ","element":"span"},{"href":"#id-109","text":"D.2","element":"a"},{"text":"; see Remark ","element":"span"},{"href":"#id-159","text":"64 ","element":"a"},{"text":"therein.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Multi-time processes with bounded gate complexity. ","element":"span"},{"text":"We can extend the results above to the case of multi-time processes with bounded complexity. The generalized form of multi-time processes allows for many possibilities for how to define multi-time processes with bounded complexity. Just as we defined ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit quantum channels with gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"as being composed of at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"two-qubit quantum channels, here we consider multi-time processes composed of the basic “unit” shown in Figure ","element":"span"},{"href":"#id-160","text":"3","element":"a"},{"text":"(a). 
Specifically, we define ","element":"span"},{"style":{"height":17.9},"width":184.24,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-0.png","element":"img","alt":" COMBn,G","inline":true,"padRight":true},{"text":"to be the set of all comb operators corresponding to multi-time processes on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits that can be implemented by the composition of at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"of the basic units in Figure ","element":"span"},{"href":"#id-160","text":"3","element":"a"},{"text":"(a), in the manner of a circuit as shown in Figure ","element":"span"},{"href":"#id-160","text":"3","element":"a"},{"text":"(c). The number of time steps, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":", in the multi-time process depends on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":".","element":"span"}],[{"text":"First, we prove an analogue of Lemma ","element":"span"},{"href":"#id-149","text":"18 ","element":"a"},{"text":"for multi-time processes with gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":". While the diamond norm was the natural distance measure in the case of channels, here we use strategy norms instead (see Appendix ","element":"span"},{"text":"B ","element":"span"},{"text":"for a definition and a discussion of their properties).","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Lemma 25 ","element":"span"},{"text":"(Covering number bounds from gate complexity for multi-time processes)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.2},"width":181.1,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-1.png","element":"img","alt":" ε ∈ (0, 1).","inline":true,"padRight":true},{"text":"The (interior) ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-2.png","element":"img","alt":" ε","inline":true},{"text":"-covering number of ","element":"span"},{"style":{"height":17.9},"width":184.24,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-3.png","element":"img","alt":" COMBn,G","inline":true,"padRight":true},{"text":"with respect to ","element":"span"},{"style":{"height":17.6},"width":87.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-4.png","element":"img","alt":" ∥·∥⋄r","inline":true,"padRight":true},{"text":"is bounded as","element":"span"}],[{"style":{"width":"72%"},"width":1356,"height":130,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-5.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"The proof again follows similar ideas as employed in the covering number bounds of [","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":", ","element":"span"},{"href":"#id-81","referenceIndex":68,"text":"68","element":"a"},{"text":"]. First, we recall (compare, e.g., Ref. 
[","element":"span"},{"href":"#id-161","referenceIndex":118,"text":"118","element":"a"},{"text":"]) a well-known fact about covering numbers of norm balls in ","element":"span"},{"style":{"height":24.45},"width":724.57,"height":61.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-6.png","element":"img","alt":" RK: If R > 0, ε ∈ (0, R], and if B∥·∥R (x","inline":true},{"text":") denotes the ","element":"span"},{"style":{"height":17.6},"width":55.95,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-7.png","element":"img","alt":" ∥·∥","inline":true},{"text":"-ball of radius ","element":"span"},{"style":{"height":15.94},"width":507.96,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-8.png","element":"img","alt":" R around x ∈ RK for some","inline":true,"padRight":true},{"text":"norm ","element":"span"},{"style":{"height":17.6},"width":167.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-9.png","element":"img","alt":" ∥·∥, then","inline":true}],[{"style":{"width":"72%"},"width":1365,"height":89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-10.png","element":"img"}],[{"text":"To apply this in our scenario, notice that any basic unit as in Figure ","element":"span"},{"href":"#id-160","text":"3","element":"a"},{"text":"(a) lives in the ","element":"span"},{"style":{"height":17.6},"width":89.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-11.png","element":"img","alt":" ∥·∥⋄2","inline":true},{"text":"-unitball in a ((2 ","element":"span"},{"style":{"height":5.6},"width":12,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-12.png","element":"img","alt":" 
·","inline":true,"padRight":true},{"text":"(2","element":"span"},{"style":{"height":19.13},"width":1569.99,"height":47.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-13.png","element":"img","alt":"⁴ × 2⁴)) = 512)-dimensional complex space, where the ambient dimension is that","inline":true,"padRight":true},{"text":"of two 2-qubit channels. Consequently, via the approximate monotonicity of covering numbers w.r.t. inclusion of sets, the above standard bound implies for our case that ","element":"span"},{"style":{"height":15.02},"width":144.28,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-14.png","element":"img","alt":" COMB2","inline":true},{"text":", the class of basic units, admits","element":"span"}],[{"style":{"width":"79%"},"width":1481,"height":105,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-15.png","element":"img"}],[{"text":"as a covering number bound. If we let ","element":"span"},{"style":{"height":23.56},"width":174,"height":58.9,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-16.png","element":"img","alt":" COMB(n)2","inline":true,"padRight":true},{"text":"be the set of all basic units in Figure ","element":"span"},{"href":"#id-160","text":"3","element":"a"},{"text":"(a) that act on two out of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"qubits, then the covering number of this set is bounded from above as follows:","element":"span"}],[{"style":{"width":"69%"},"width":1303,"height":120,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-17.png","element":"img"}],[{"text":"with the binomial factor","element":"span"},{"style":{"height":20.11},"width":56.82,"height":50.27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-18.png","element":"img","alt":"(n choose 2)","inline":true,"padRight":true},{"text":"coming from the fact that we allow the basic units to act 
on any pair of qubits.","element":"span"}],[{"text":"Now, consider an arbitrary ","element":"span"},{"style":{"height":17.9},"width":277.39,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-19.png","element":"img","alt":" N ∈ COMBn,G","inline":true},{"text":". By definition, every such comb operator has the form ","element":"span"},{"style":{"height":14.7},"width":445.9,"height":36.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-20.png","element":"img","alt":"N = N1 ⋆ N2 ⋆ · · · ⋆ NG","inline":true},{"text":", where ","element":"span"},{"style":{"height":23.55},"width":277.12,"height":58.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-21.png","element":"img","alt":" Ni ∈ COMB(n)2","inline":true,"padRight":true},{"text":". Let ","element":"span"},{"style":{"height":18.89},"width":123.37,"height":47.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/37-22.png","element":"img","alt":" ε′ = ε/G","inline":true},{"text":". 
By definition of the covering number ","element":"span"},{"style":{"height":23.56},"width":393.3,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-0.png","element":"img","alt":"N(COMB(n)2 , ε′, ∥·∥⋄2","inline":true},{"text":"), for every ","element":"span"},{"style":{"height":23.56},"width":276.38,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-1.png","element":"img","alt":" Ni ∈ COMB(n)2","inline":true,"padRight":true},{"text":", we can find a corresponding ","element":"span"},{"style":{"height":14.62},"width":47.06,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-2.png","element":"img","alt":" Ñi","inline":true,"padRight":true},{"text":"in an ","element":"span"},{"style":{"height":8},"width":29.35,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-3.png","element":"img","alt":" ε′","inline":true},{"text":"-covering ","element":"span"},{"text":"net for ","element":"span"},{"style":{"height":23.56},"width":686.31,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-4.png","element":"img","alt":" COMB(n)2 such that ∥Ni − Ñi∥⋄2 ≤ ε′","inline":true},{"text":". 
Then, if we let ","element":"span"},{"style":{"height":14.7},"width":431.23,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-5.png","element":"img","alt":"Ñ := Ñ1 ⋆ Ñ2 ⋆ · · · ⋆ ÑG","inline":true},{"text":" and make use of the subadditivity-under-composition property of the strategy norm, as shown in Corollary ","element":"span"},{"href":"#id-162","text":"57","element":"a"},{"text":", we obtain","element":"span"}],[{"style":{"width":"70%"},"width":1318,"height":113,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-6.png","element":"img"}],[{"text":"Therefore, if we let ","element":"span"},{"style":{"height":23.56},"width":457.52,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-7.png","element":"img","alt":"Ñε′ ⊆ COMB(n)2 be an ε′","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":23.56},"width":174,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-8.png","element":"img","alt":" COMB(n)2 ","inline":true,"padRight":true},{"text":", then the set","element":"span"}],[{"style":{"width":"68%"},"width":1290,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-9.png","element":"img"}],[{"text":"is an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-10.png","element":"img","alt":" ε","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":17.9},"width":184.24,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-11.png","element":"img","alt":" COMBn,G","inline":true},{"text":". 
Noting that ","element":"span"},{"style":{"height":19.53},"width":246.69,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-12.png","element":"img","alt":" |Nε| = |Ñε′|^G","inline":true},{"text":", we obtain ","element":"span"},{"style":{"height":18.7},"width":458.18,"height":46.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-13.png","element":"img","alt":" N(COMBn,G, ε, ∥·∥⋄r) ≤","inline":true},{"style":{"height":19.53},"width":466.71,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-14.png","element":"img","alt":"|Nε| ≤ |Ñε′|^G for every ε′","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":23.56},"width":174,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-15.png","element":"img","alt":" COMB(n)2 ","inline":true,"padRight":true},{"text":". As the covering number ","element":"span"},{"style":{"height":23.56},"width":411.71,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-16.png","element":"img","alt":" N(COMB(n)2 , ε′, ∥·∥⋄2)","inline":true,"padRight":true},{"text":"is, by definition, the size of the smallest ","element":"span"},{"style":{"height":8},"width":29.35,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-17.png","element":"img","alt":" ε′","inline":true},{"text":"-covering net for ","element":"span"},{"style":{"height":23.56},"width":174,"height":58.89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-18.png","element":"img","alt":" COMB(n)2 ","inline":true,"padRight":true},{"text":", we can conclude that","element":"span"}],[{"style":{"width":"85%"},"width":1601,"height":130,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-19.png","element":"img"}],[{"text":"as required. 
","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-20.png","element":"img","alt":"■","inline":true}],[{"id":"id-165","style":{"fontWeight":"bold"},"text":"Theorem 26 ","element":"span"},{"text":"(Regret bound for online learning multi-time processes of bounded complexity)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-21.png","element":"img","alt":"ℓ","inline":true,"padRight":true},{"text":": [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] ","element":"span"},{"style":{"height":12},"width":93.75,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-22.png","element":"img","alt":" → R","inline":true,"padRight":true},{"text":"be convex and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. There exists an online learning strategy that, when presented sequentially with multi-time test operators ","element":"span"},{"style":{"height":15.94},"width":73.13,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-23.png","element":"img","alt":" E(t)","inline":true,"padRight":true},{"text":"for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"time steps, ","element":"span"},{"style":{"height":17.6},"width":306.3,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-24.png","element":"img","alt":" t ∈ {1, 2, . . . 
, T}","inline":true},{"text":", and associated loss functions ","element":"span"},{"style":{"height":17.6},"width":329.59,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-25.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), outputs a sequence of hypothesis comb operators ","element":"span"},{"style":{"height":26.85},"width":336.58,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-26.png","element":"img","alt":"N(t)A,B ∈ COMB′n,G","inline":true},{"text":", corresponding to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time quantum processes, whose regret is bounded ","element":"span"},{"text":"as","element":"span"}],[{"style":{"width":"87%"},"width":1640,"height":113,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-27.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"The proof is analogous to the proof of Theorem ","element":"span"},{"href":"#id-150","text":"20","element":"a"},{"text":". First, it holds that","element":"span"}],[{"id":"id-164","style":{"width":"70%"},"width":1328,"height":130,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-28.png","element":"img"}],[{"text":"for ","element":"span"},{"style":{"height":17.02},"width":292.78,"height":42.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-29.png","element":"img","alt":" T ∈ N≥1, ε ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1), and ","element":"span"},{"style":{"height":14.8},"width":79.47,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-30.png","element":"img","alt":" p ≥","inline":true,"padRight":true},{"text":"1. 
","element":"span"},{"text":"This holds because ","element":"span"},{"style":{"height":18.7},"width":413.26,"height":46.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-31.png","element":"img","alt":" Nz(COMBn,G, ε, p) ≤","inline":true},{"style":{"height":18.7},"width":825.1,"height":46.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-32.png","element":"img","alt":"Nz(COMBn,G, ε, ∞) ≤ N(COMBn,G, ε, ∥·∥⋄r","inline":true},{"text":"), where ","element":"span"},{"style":{"fontWeight":"bold"},"text":"z ","element":"span"},{"text":"is an arbitrary complete rooted binary tree of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":". The first of these inequalities is due to monotonicity of the sequential covering numbers with respect to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"p","element":"span"},{"text":", and the second follows from the fact that","element":"span"}],[{"style":{"width":"77%"},"width":1448,"height":75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-33.png","element":"img"}],[{"text":"for arbitrary multi-time test operators ","element":"span"},{"style":{"fontStyle":"italic"},"text":"E ","element":"span"},{"text":"and arbitrary comb operators ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"and ","element":"span"},{"style":{"height":12},"width":39,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-34.png","element":"img","alt":"Ñ","inline":true},{"text":", which holds because of the Hölder inequality for strategy norms in (","element":"span"},{"href":"#id-163","text":"2.32","element":"a"},{"text":") and the fact that ","element":"span"},{"style":{"height":17.6},"width":165.58,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-35.png","element":"img","alt":" ∥E∥∗⋄r 
≤","inline":true,"padRight":true},{"text":"1, by ","element":"span"},{"text":"definition of a multi-time test operator. We have thus established (","element":"span"},{"href":"#id-164","text":"3.42","element":"a"},{"text":"). From here, an application of Theorem ","element":"span"},{"href":"#id-80","text":"11","element":"a"},{"text":", along with the reasoning in the proof of Theorem ","element":"span"},{"href":"#id-150","text":"20","element":"a"},{"text":", gives us the desired result. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/38-36.png","element":"img","alt":"■","inline":true,"padRight":true},{"style":{"fontWeight":"bold"},"text":"Corollary 27 ","element":"span"},{"text":"(Mistake bound for online learning multi-time processes of bounded complexity)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":10.4},"width":61.7,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-0.png","element":"img","alt":" ε ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1). There exists an online learning strategy that, in a realizable setting for ","element":"span"},{"style":{"height":17.9},"width":184.24,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-1.png","element":"img","alt":" COMBn,G","inline":true},{"text":", when presented sequentially with multi-time test operator ","element":"span"},{"style":{"height":20.33},"width":411.52,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-2.png","element":"img","alt":" E(t), t ∈ {1, 2, . . . 
, T}","inline":true},{"text":", and associated loss functions ","element":"span"},{"style":{"height":17.6},"width":312.21,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-3.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), for some ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-4.png","element":"img","alt":" ℓ","inline":true},{"text":", outputs a sequence ","element":"span"},{"style":{"height":21.44},"width":323.62,"height":53.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-5.png","element":"img","alt":" N(t) ∈ COMBn,G","inline":true,"padRight":true},{"text":"of hypothesis comb operators that makes at most ","element":"span"},{"style":{"height":29.34},"width":624.93,"height":73.34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-6.png","element":"img","alt":" O(L²G log(Gn)/ε²) many ε-mistakes.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"The proof follows arguments analogous to those in the proof of Corollary ","element":"span"},{"href":"#id-151","text":"21","element":"a"},{"text":", now making use of the regret bound from Theorem ","element":"span"},{"href":"#id-165","text":"26","element":"a"},{"text":". ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-7.png","element":"img","alt":"■","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Convex mixtures of known multi-time processes. 
","element":"span"},{"text":"We now show that Theorem ","element":"span"},{"href":"#id-155","text":"22 ","element":"a"},{"text":"and Corollary ","element":"span"},{"href":"#id-166","text":"23 ","element":"a"},{"text":"generalize straightforwardly to convex mixtures of arbitrary, known multi-time processes.","element":"span"}],[{"id":"id-170","style":{"fontWeight":"bold"},"text":"Theorem 28 ","element":"span"},{"text":"(Regret bound for online learning convex mixtures of multi-time processes)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":15.42},"width":176.71,"height":38.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-8.png","element":"img","alt":"K ∈ Z>0","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":12.8},"width":116.98,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-9.png","element":"img","alt":" r ∈ N","inline":true},{"text":". Consider a convex combination of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"known ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time processes, with Choi representations ","element":"span"},{"style":{"height":18.62},"width":598.45,"height":46.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-10.png","element":"img","alt":" Nj ∈ COMBr, j ∈ {1, 2, . . . 
, K}","inline":true},{"text":", such that ","element":"span"},{"style":{"height":23.29},"width":315.44,"height":58.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-11.png","element":"img","alt":" N∗ = ∑_{j=1}^{K} qjNj","inline":true},{"text":", where the ","element":"span"},{"text":"unknown probability distribution is given by ","element":"span"},{"style":{"height":17.6},"width":354.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-12.png","element":"img","alt":" q = (q1, q2, . . . , qK","inline":true},{"text":"). Let ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-13.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":": [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] ","element":"span"},{"style":{"height":12},"width":92.82,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-14.png","element":"img","alt":" → R","inline":true,"padRight":true},{"text":"be convex and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. There exists an online learning strategy that, when presented sequentially with multi-time test operators ","element":"span"},{"style":{"height":20.34},"width":411.95,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-15.png","element":"img","alt":" E(t), t ∈ {1, 2, . . . 
, T}","inline":true},{"text":", and associated losses ","element":"span"},{"style":{"height":17.6},"width":308.79,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-16.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), outputs hypotheses ","element":"span"},{"style":{"height":18.55},"width":276,"height":46.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-17.png","element":"img","alt":" N(t) ∈ COMBr","inline":true,"padRight":true},{"text":"of the form","element":"span"}],[{"style":{"width":"100%"},"width":1874,"height":395,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-18.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"We combine Lemma ","element":"span"},{"href":"#id-152","text":"9 ","element":"a"},{"text":"with the MWU algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"1","element":"a"},{"text":") applied to a particular choice of loss vectors ","element":"span"},{"style":{"height":16.33},"width":240.86,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-19.png","element":"img","alt":" m(t). 
We let","inline":true}],[{"style":{"width":"79%"},"width":1483,"height":169,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-20.png","element":"img"}],[{"text":"where","element":"span"}],[{"style":{"width":"100%"},"width":1874,"height":214,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-21.png","element":"img"}],[{"text":"Let us verify that ","element":"span"},{"style":{"height":26.57},"width":214.84,"height":66.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-22.png","element":"img","alt":" m(t)j ∈ [−1,","inline":true,"padRight":true},{"text":"1] for all ","element":"span"},{"style":{"height":17.6},"width":320.52,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-23.png","element":"img","alt":" j ∈ {1, 2, . . . , K}","inline":true},{"text":", as required by the MWU algorithm. First, we readily have that ","element":"span"},{"style":{"height":21.29},"width":58,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-24.png","element":"img","alt":"1Lℓ′t","inline":true},{"text":"(Tr[","element":"span"},{"style":{"height":20.33},"width":318.78,"height":50.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-25.png","element":"img","alt":"E(t)N(t)]) ∈ [−1,","inline":true,"padRight":true},{"text":"1], because ","element":"span"},{"style":{"height":0},"width":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-26.png","element":"img","alt":" ℓ","inline":true,"padRight":true},{"text":"is assumed to be ","element":"span"},{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"-Lipschitz. 
In ","element":"span"},{"text":"addition, we have that","element":"span"}],[{"style":{"width":"70%"},"width":1318,"height":56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/39-27.png","element":"img"}],[{"style":{"width":"29%"},"width":549,"height":600,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-0.png","element":"img"}],[{"text":"where the inequality is due to the fact that ","element":"span"},{"style":{"height":20.33},"width":210.31,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-1.png","element":"img","alt":" ∥E(t)∥∗⋄r ≤","inline":true,"padRight":true},{"text":"1, which means by (","element":"span"},{"href":"#id-167","text":"2.31","element":"a"},{"text":") that there ","element":"span"},{"text":"exists ","element":"span"},{"style":{"height":17.39},"width":238.78,"height":43.47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-2.png","element":"img","alt":" S ∈ COMB∗r","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":21.75},"width":298.59,"height":54.37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-3.png","element":"img","alt":" E(t) ≤ S ⊗ 1Br","inline":true},{"text":"; the chain of equalities holds because of the fact ","element":"span"},{"text":"that ","element":"span"},{"style":{"height":17.82},"width":257.44,"height":44.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-4.png","element":"img","alt":" Nj ∈ COMBr","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.39},"width":236.12,"height":43.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-5.png","element":"img","alt":" S ∈ COMB∗r","inline":true},{"text":", so that there exist 
","element":"span"},{"style":{"height":26.57},"width":412.05,"height":66.42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-6.png","element":"img","alt":" N(1)j , N(2)j , . . . , N(r−1)j","inline":true,"padRight":true},{"text":"satisfying the constraints in (","element":"span"},{"href":"#id-168","text":"2.28","element":"a"},{"text":") and there exist ","element":"span"},{"style":{"height":19.13},"width":286.47,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-7.png","element":"img","alt":" S(1), . . . , S(r−1) ","inline":true,"padRight":true},{"text":"satisfying the constraints in (","element":"span"},{"href":"#id-169","text":"2.29","element":"a"},{"text":").","element":"span"}],[{"text":"Now, let ","element":"span"},{"style":{"height":23.29},"width":314.69,"height":58.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-8.png","element":"img","alt":" N∗ = ∑_{j=1}^{K} qjNj","inline":true,"padRight":true},{"text":"for some probability vector ","element":"span"},{"style":{"height":17.6},"width":343.9,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-9.png","element":"img","alt":" q = (q1, q2, . . . , qK","inline":true},{"text":"). 
Using Theorem ","element":"span"},{"href":"#id-154","text":"13","element":"a"},{"text":", ","element":"span"},{"text":"we obtain","element":"span"}],[{"style":{"width":"83%"},"width":1569,"height":385,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-10.png","element":"img"}],[{"text":"Therefore, by Lemma ","element":"span"},{"href":"#id-152","text":"9","element":"a"},{"text":", we obtain","element":"span"}],[{"style":{"width":"77%"},"width":1461,"height":385,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-11.png","element":"img"}],[{"text":"Setting ","element":"span"},{"style":{"height":19.21},"width":275.41,"height":48.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-12.png","element":"img","alt":" η = √(log K / T)","inline":true},{"text":", we obtain the desired regret bound. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-13.png","element":"img","alt":"■","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Corollary 29 ","element":"span"},{"text":"(Mistake bound for convex mixtures of multi-time processes)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":10.4},"width":62.69,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-14.png","element":"img","alt":" ε ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1). 
Let ","element":"span"},{"style":{"height":15.42},"width":176.71,"height":38.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-15.png","element":"img","alt":"K ∈ Z>0","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":12.8},"width":116.98,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-16.png","element":"img","alt":" r ∈ N","inline":true},{"text":". Consider a convex combination of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"known ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time processes, with Choi representations ","element":"span"},{"style":{"height":18.62},"width":598.45,"height":46.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-17.png","element":"img","alt":" Nj ∈ COMBr, j ∈ {1, 2, . . . , K}","inline":true},{"text":", such that ","element":"span"},{"style":{"height":23.29},"width":315.44,"height":58.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-18.png","element":"img","alt":" N∗ = ∑_{j=1}^{K} qjNj","inline":true},{"text":", where the ","element":"span"},{"text":"unknown probability distribution is given by ","element":"span"},{"style":{"height":17.6},"width":344.01,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-19.png","element":"img","alt":" q = (q1, q2, . . . , qK","inline":true},{"text":"). 
There exists an online learning strategy that, in a realizable setting for conv(","element":"span"},{"style":{"height":22.58},"width":155.1,"height":56.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-20.png","element":"img","alt":"{Nj}Kj=1","inline":true},{"text":"), when presented sequentially with multi-time ","element":"span"},{"text":"test operators ","element":"span"},{"style":{"height":20.33},"width":427.84,"height":50.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-21.png","element":"img","alt":" E(t), t ∈ {1, 2, . . . , T}","inline":true},{"text":", and associated loss functions ","element":"span"},{"style":{"height":17.6},"width":324.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/40-22.png","element":"img","alt":" ℓt(·) = ℓ((·) − bt","inline":true},{"text":"), for some","element":"span"}],[{"style":{"height":15.2},"width":242.54,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-0.png","element":"img","alt":"L-Lipschitz ℓ","inline":true},{"text":", outputs a sequence ","element":"span"},{"style":{"height":21.44},"width":317.96,"height":53.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-1.png","element":"img","alt":" N(t) ∈ COMBn,G","inline":true,"padRight":true},{"text":"of hypothesis comb operators of the form","element":"span"}],[{"style":{"width":"58%"},"width":1102,"height":129,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-2.png","element":"img"}],[{"text":"with ","element":"span"},{"style":{"height":24.05},"width":591.76,"height":60.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-3.png","element":"img","alt":" p(t) := (p(t)1 , p(t)2 , . . . 
, p(t)K ) ∈ ∆K","inline":true},{"text":", that makes at most ","element":"span"},{"style":{"height":25.79},"width":568.75,"height":64.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-4.png","element":"img","alt":" O(L² log(K)/ε²) many ε-mistakes.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Using Lemma ","element":"span"},{"href":"#id-83","text":"12","element":"a"},{"text":", we can derive this mistake bound from the regret bound of Theorem ","element":"span"},{"href":"#id-170","text":"28","element":"a"},{"text":", analogously to how Corollary ","element":"span"},{"href":"#id-171","text":"24 ","element":"a"},{"text":"followed from Corollary ","element":"span"},{"href":"#id-166","text":"23","element":"a"},{"text":". ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-5.png","element":"img","alt":"■","inline":true}],[{"id":"id-39","style":{"fontWeight":"bold"},"text":"3.4 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Learning-theoretic implications","element":"span"}],[{"text":"To conclude our discussion of regret and mistake upper bounds for online learning certain classes of quantum channels, we highlight some learning-theoretic implications of our bounds. On the one hand, we note that our regret bounds immediately give rise to bounds on a complexity measure called ","element":"span"},{"style":{"fontStyle":"italic"},"text":"sequential fat-shattering dimension ","element":"span"},{"text":"[","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":"] of the respective channel classes. On the other hand, our mistake-bounded online learner can be used to construct sample compression schemes. For simplicity of presentation, we focus on Pauli channels in this discussion. 
However, these implications immediately extend to all the classes of channels and the multi-time quantum processes that we established regret and mistake bounds for. That is, also for these classes we obtain complexity bounds and (approximate) compression schemes.","element":"span"}],[{"text":"For the first implication, we rely on known results ([","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":", Proposition 9] and [","element":"span"},{"href":"#id-79","referenceIndex":83,"text":"83","element":"a"},{"text":", Lemma 2]) to derive a sequential fat-shattering dimension bound from the regret bound established in Theorem ","element":"span"},{"href":"#id-155","text":"22","element":"a"},{"text":":","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Corollary 30 ","element":"span"},{"text":"(Sequential fat-shattering dimension bound)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":15.02},"width":138.88,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-6.png","element":"img","alt":" PAULIn","inline":true,"padRight":true},{"text":"be the class of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channels, and let ","element":"span"},{"style":{"height":17.6},"width":124.47,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-7.png","element":"img","alt":" ε ∈ (0,","inline":true,"padRight":true},{"text":"1). Then, sfat","element":"span"},{"style":{"height":28.8},"width":387.54,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-8.png","element":"img","alt":"ε(PAULIn) ≤ O(n/ε²).","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"We start from Theorem ","element":"span"},{"href":"#id-155","text":"22 ","element":"a"},{"text":"for the special case of ","element":"span"},{"style":{"height":17.6},"width":177.1,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-9.png","element":"img","alt":" ℓ(·) = | · |","inline":true},{"text":". This gives us a regret bound of ","element":"span"},{"style":{"height":19.94},"width":176.03,"height":49.85,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-10.png","element":"img","alt":"O(L√nT","inline":true},{"text":"). Combining this with the first inequality in Ref. [","element":"span"},{"href":"#id-78","referenceIndex":82,"text":"82","element":"a"},{"text":", Proposition 9] or with [","element":"span"},{"href":"#id-79","referenceIndex":83,"text":"83","element":"a"},{"text":", Lemma 2], we get","element":"span"}],[{"style":{"width":"74%"},"width":1395,"height":142,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-11.png","element":"img"}],[{"text":"We can now rearrange and, after plugging in ","element":"span"},{"style":{"height":17.6},"width":877.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-12.png","element":"img","alt":" T = 1 and using that clearly sfatε(PAULIn) ≥","inline":true,"padRight":true},{"text":"1, we get the claimed bound of sfat","element":"span"},{"style":{"height":28.8},"width":460.57,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-13.png","element":"img","alt":"ε(PAULIn) ≤ O(n/ε²). ■","inline":true}],[{"text":"Via Ref. [","element":"span"},{"href":"#id-79","referenceIndex":83,"text":"83","element":"a"},{"text":", Corollary 1], this sequential fat-shattering dimension bound also implies sequential covering number bounds. 
","element":"span"},{"text":"While obtaining these complexity bounds from our regret bounds (Theorem ","element":"span"},{"href":"#id-155","text":"22","element":"a"},{"text":") is standard given prior work, we highlight that the obtained bounds are exponentially better than what would arise from naive parameter counting. Namely, while a Pauli channel is specified by a 4","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/41-14.png","element":"img","alt":"n","inline":true},{"text":"-dimensional probability vector, these complexity measure bounds show that the “effective” dimension relevant for online learning is only linear in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":". Finally, we note that the sequential fat-shattering dimension and covering numbers are upper bounds on their non-sequential ","element":"span"},{"text":"counterparts. Thus, the above upper bounds immediately carry over to the complexity measures relevant for (agnostic) ","element":"span"},{"style":{"fontStyle":"italic"},"text":"probably approximately correct ","element":"span"},{"text":"(PAC) learning and lead to corresponding generalization bounds for PAC learning Pauli channels. In particular, this implies that Pauli channels are a restricted class of operations that allow for “pretty good process tomography”, as asked for in Ref. [","element":"span"},{"href":"#id-6","referenceIndex":15,"text":"15","element":"a"},{"text":", Section 4]. We note that Refs. [","element":"span"},{"href":"#id-6","referenceIndex":15,"text":"15","element":"a"},{"text":", ","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] obtained (sequential) fat-shattering bounds for quantum states from bounds on quantum random access coding. 
It would be interesting to see whether this implication can be reversed in our setting: Do our (sequential) fat-shattering bounds imply limitations for encoding classical information into Pauli channels in a random access coding fashion?","element":"span"}],[{"text":"Next, we turn our attention to implications for compression. For the case of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":"-valued functions, the connection between mistake-bounded online learning and sample compression via the so-called one-pass compression scheme has already been observed in Ref. [","element":"span"},{"href":"#id-172","referenceIndex":119,"text":"119","element":"a"},{"text":", Section 4]. We notice that, with minor adaptations, this reasoning also applies to [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]-valued function classes when considering ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-0.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes and uniformly ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-1.png","element":"img","alt":" ε","inline":true},{"text":"-approximate compression schemes (defined in Ref. [","element":"span"},{"href":"#id-173","referenceIndex":120,"text":"120","element":"a"},{"text":"]):","element":"span"}],[{"id":"id-174","style":{"fontWeight":"bold"},"text":"Lemma 31 ","element":"span"},{"text":"(Compression from mistake-driven online learning)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":14.4},"width":84.14,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-2.png","element":"img","alt":" F ⊆","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]","element":"span"},{"style":{"height":8.8},"width":23,"height":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-3.png","element":"img","alt":"X","inline":true},{"text":". Let ","element":"span"},{"style":{"height":10.4},"width":63.76,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-4.png","element":"img","alt":" ε ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1). Suppose ","element":"span"},{"text":"X ","element":"span"},{"text":"admits some total order and suppose ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"admits a mistake-driven online learner ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"that makes at most ","element":"span"},{"style":{"height":17.2},"width":378.6,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-5.png","element":"img","alt":" M = MF(ε) many ε","inline":true},{"text":"-mistakes when sequentially presented with challenges ","element":"span"},{"style":{"height":10.8},"width":195.78,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-6.png","element":"img","alt":" x1, . . . , xm","inline":true,"padRight":true},{"text":"and the corresponding values ","element":"span"},{"style":{"height":17.6},"width":328.29,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-7.png","element":"img","alt":" f∗(x1), . . . 
, f∗(xm","inline":true},{"text":") for some unknown ","element":"span"},{"style":{"height":15.6},"width":140.53,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-8.png","element":"img","alt":" f∗ ∈ F","inline":true},{"text":". Then, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"admits a uniformly ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-9.png","element":"img","alt":" ε","inline":true},{"text":"-approximate sample compression scheme.","element":"span"}],[{"text":"The assumption that ","element":"span"},{"text":"X ","element":"span"},{"text":"admits a total order is trivially satisfied whenever ","element":"span"},{"text":"X ","element":"span"},{"text":"is finite, which is typically the case in computational learning theory. If we assume the ordering principle (i.e., that every set can be totally ordered), then we can apply Lemma ","element":"span"},{"href":"#id-174","text":"31 ","element":"a"},{"text":"for a general instance space ","element":"span"},{"text":"X","element":"span"},{"text":". We note that the ordering principle is strictly weaker than the well-ordering theorem [","element":"span"},{"href":"#id-175","referenceIndex":121,"text":"121","element":"a"},{"text":"–","element":"span"},{"href":"#id-176","referenceIndex":123,"text":"123","element":"a"},{"text":"] (see also [","element":"span"},{"href":"#id-177","referenceIndex":124,"text":"124","element":"a"},{"text":", Section 4.4]), which in turn is well known to be equivalent to the axiom of choice.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof of Lemma ","element":"span"},{"href":"#id-174","style":{"fontStyle":"italic"},"text":"31","element":"a"},{"style":{"fontStyle":"italic"},"text":". 
","element":"span"},{"text":"We first describe the compression and reconstruction maps, then we prove that they have the desired compression scheme property. The compression map ","element":"span"},{"style":{"height":19.6},"width":458.96,"height":49,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-10.png","element":"img","alt":" κ : ⋃m≥1(X × [0, 1])m →","inline":true},{"style":{"height":20},"width":261.51,"height":50.01,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-11.png","element":"img","alt":"⋃1≤m≤M(X ×","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1])","element":"span"},{"style":{"height":5.6},"width":30,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-12.png","element":"img","alt":"m","inline":true,"padRight":true},{"text":"is defined as follows: Given a dataset ","element":"span"},{"style":{"height":14.8},"width":161.13,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-13.png","element":"img","alt":" S ⊆ X ×","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1], reorder ","element":"span"},{"style":{"fontStyle":"italic"},"text":"S ","element":"span"},{"text":"according to ","element":"span"},{"text":"the total order on ","element":"span"},{"text":"X","element":"span"},{"text":", run the online learner ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"with an adversary that sequentially presents the learner with the reordered elements of ","element":"span"},{"style":{"height":17.2},"width":272.85,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-14.png","element":"img","alt":" S, and let κ(S","inline":true},{"text":") be the set of (labeled) examples on which 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"made an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-15.png","element":"img","alt":" ε","inline":true},{"text":"-mistake. Next, we define the reconstruction map ","element":"span"},{"style":{"height":21.94},"width":736.22,"height":54.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-16.png","element":"img","alt":" ρ : ⋃1≤m≤M(X × [0, 1])m → [0, 1]X. Let","inline":true},{"style":{"height":13.2},"width":163.5,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-17.png","element":"img","alt":"S ⊂ X ×","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] and ","element":"span"},{"style":{"height":13.2},"width":109.03,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-18.png","element":"img","alt":" x ∈ X","inline":true},{"text":". If ","element":"span"},{"style":{"height":16},"width":89.2,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-19.png","element":"img","alt":" ∃y ∈","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] such that (","element":"span"},{"style":{"height":17.6},"width":168.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-20.png","element":"img","alt":"x, y) ∈ S","inline":true},{"text":", then we set ","element":"span"},{"style":{"height":17.6},"width":228.63,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-21.png","element":"img","alt":" ρ(S)(x) = y","inline":true},{"text":". 
Otherwise, we reorder ","element":"span"},{"style":{"fontStyle":"italic"},"text":"S ","element":"span"},{"text":"according to the total order on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"X","element":"span"},{"text":", run the online learner ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"with an adversary that sequentially presents the learner with the reordered elements of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"S ","element":"span"},{"text":"that precede ","element":"span"},{"style":{"fontStyle":"italic"},"text":"x ","element":"span"},{"text":"in the total order on ","element":"span"},{"style":{"height":17.6},"width":331.95,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-22.png","element":"img","alt":" X, and let ρ(S)(x","inline":true},{"text":") be the value predicted by ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"x","element":"span"},{"text":".","element":"span"}],[{"text":"Now, we prove that ","element":"span"},{"style":{"height":15.2},"width":137.17,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-23.png","element":"img","alt":" κ and ρ","inline":true,"padRight":true},{"text":"as defined above indeed form a uniformly ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-24.png","element":"img","alt":" ε","inline":true},{"text":"-approximate sample compression scheme for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F","element":"span"},{"text":". 
That is, we show that, for all ","element":"span"},{"style":{"height":15.6},"width":116.42,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-25.png","element":"img","alt":" f ∈ F","inline":true,"padRight":true},{"text":"and for all ","element":"span"},{"style":{"height":17.85},"width":424.8,"height":44.62,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-26.png","element":"img","alt":" S = {(xi, f(xi))}mi=1 ⊂","inline":true},{"style":{"height":12.4},"width":74.77,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-27.png","element":"img","alt":"X ×","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1], the function ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":17.6},"width":200.36,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-28.png","element":"img","alt":"f = ρ(κ(S","inline":true},{"text":")) satisfies max","element":"span"},{"style":{"height":21.63},"width":473.94,"height":54.06,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-29.png","element":"img","alt":"1≤i≤m| ˆf(xi) − f(xi)| ≤ ε","inline":true},{"text":". So, let ","element":"span"},{"style":{"height":15.6},"width":120.53,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-30.png","element":"img","alt":" f ∈ F","inline":true,"padRight":true},{"text":"and let ","element":"span"},{"style":{"height":17.85},"width":524.27,"height":44.62,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-31.png","element":"img","alt":"S = {(xi, f(xi))}mi=1 ⊂ X ×","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]. 
If 1 ","element":"span"},{"style":{"height":14},"width":166.42,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-32.png","element":"img","alt":" ≤ i ≤ m","inline":true,"padRight":true},{"text":"is such that ","element":"span"},{"style":{"height":16},"width":103,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-33.png","element":"img","alt":" ∃yi ∈","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] : (","element":"span"},{"style":{"height":17.6},"width":240.3,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-34.png","element":"img","alt":"xi, yi) ∈ κ(S","inline":true},{"text":"), then by ","element":"span"},{"text":"definition of ","element":"span"},{"style":{"height":21},"width":905.32,"height":52.51,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-35.png","element":"img","alt":" κ and ρ we have ˆf(xi) = yi = f(xi). 
If 1 ≤ i ≤ m","inline":true,"padRight":true},{"text":"is such that ","element":"span"},{"style":{"height":19.6},"width":513.8,"height":49,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-36.png","element":"img","alt":" ∄yi ∈ [0, 1] : (xi, yi) ∈ κ(S),","inline":true,"padRight":true},{"text":"then by definition of ","element":"span"},{"style":{"height":8},"width":25,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-37.png","element":"img","alt":" κ","inline":true},{"text":", this means that the online learner ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"does not make an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-38.png","element":"img","alt":" ε","inline":true},{"text":"-mistake on ","element":"span"},{"style":{"height":10.22},"width":36.94,"height":25.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/42-39.png","element":"img","alt":" xi","inline":true,"padRight":true},{"text":"when presented sequentially with the elements of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"S ","element":"span"},{"text":"reordered according to the total order on ","element":"span"},{"text":"X","element":"span"},{"text":". 
","element":"span"},{"text":"As ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"is mistake-driven, this implies that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"also does not make an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-0.png","element":"img","alt":" ε","inline":true},{"text":"-mistake on ","element":"span"},{"style":{"height":10.22},"width":36.94,"height":25.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-1.png","element":"img","alt":" xi","inline":true,"padRight":true},{"text":"when presented sequentially only with the (reordered) elements of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"S ","element":"span"},{"text":"that precede ","element":"span"},{"style":{"height":10.22},"width":36.94,"height":25.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-2.png","element":"img","alt":" xi","inline":true,"padRight":true},{"text":"in the total order on ","element":"span"},{"text":"X ","element":"span"},{"text":"and on which ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"made an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-3.png","element":"img","alt":" ε","inline":true},{"text":"-mistake. 
This is exactly the sequence of examples that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"is run on when determining the value that ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":17.6},"width":194.89,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-4.png","element":"img","alt":"f = ρ(κ(S","inline":true},{"text":")) assigns to ","element":"span"},{"style":{"height":21},"width":591.93,"height":52.51,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-5.png","element":"img","alt":" xi, thus | ˆf(xi) − f(xi)| ≤ ε. ■","inline":true}],[{"id":"id-178","style":{"fontWeight":"bold"},"text":"Remark 32. ","element":"span"},{"text":"Let us comment on a natural variant of Lemma ","element":"span"},{"href":"#id-174","text":"31","element":"a"},{"text":": If the online learner ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"makes at most ","element":"span"},{"style":{"height":17.2},"width":378.64,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-6.png","element":"img","alt":" M = MF(ε) many ε","inline":true},{"text":"-mistakes even when presented only with (","element":"span"},{"style":{"height":17.6},"width":42.34,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-7.png","element":"img","alt":"ε/","inline":true},{"text":"3)-accurate approximations to the true function values – as is for instance the case for our MWU-based Pauli channel online learner –, this translates over to the resulting compression scheme. 
That is, even when sequentially presented with a training data set ","element":"span"},{"style":{"height":17.85},"width":326.36,"height":44.62,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-8.png","element":"img","alt":" S = {(xi, yi)}mi=1","inline":true,"padRight":true},{"text":"with ","element":"span"},{"style":{"height":17.6},"width":329.42,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-9.png","element":"img","alt":" |yi − f(xi)| ≤ ε/","inline":true},{"text":"3 for all 1 ","element":"span"},{"style":{"height":14},"width":176.78,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-10.png","element":"img","alt":" ≤ i ≤ m","inline":true,"padRight":true},{"text":"(instead of a “perfect” data set with ","element":"span"},{"style":{"height":17.6},"width":174.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-11.png","element":"img","alt":" yi = f(xi","inline":true},{"text":") for all 1 ","element":"span"},{"style":{"height":14},"width":158.35,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-12.png","element":"img","alt":" ≤ i ≤ m","inline":true},{"text":"), the function ","element":"span"},{"text":"ˆ","element":"span"},{"style":{"height":17.6},"width":196.22,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-13.png","element":"img","alt":"f = ρ(κ(S","inline":true},{"text":")) after compression still satisfies ","element":"span"},{"style":{"height":21.01},"width":365.84,"height":52.52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-14.png","element":"img","alt":" | ˆf(xi) − f(xi)| ≤ ε","inline":true,"padRight":true},{"text":"for all 1 ","element":"span"},{"style":{"height":14},"width":169.84,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-15.png","element":"img","alt":" ≤ i ≤ m","inline":true},{"text":". 
Thus, this sample compression scheme is successful also if applied to data that has been affected by (possibly adversarial) label noise of strength ","element":"span"},{"style":{"height":17.6},"width":124.65,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-16.png","element":"img","alt":" ε/3. ◀","inline":true}],[{"text":"We can combine this with our mistake bound for Pauli channel learning to get a compression scheme for Pauli channels:","element":"span"}],[{"id":"id-179","style":{"fontWeight":"bold"},"text":"Corollary 33 ","element":"span"},{"text":"(Compression scheme for Pauli channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"The set ","element":"span"},{"style":{"height":15.02},"width":230.05,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-17.png","element":"img","alt":" PAULIn of n","inline":true},{"text":"-qubit Pauli channels admits a uniformly ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-18.png","element":"img","alt":" ε","inline":true},{"text":"-approximate sample compression scheme of size ","element":"span"},{"style":{"height":28.8},"width":132.84,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-19.png","element":"img","alt":" O�nε2�","inline":true},{"text":". This sample compression scheme even succeeds on training data whose labels have been corrupted by adversarial label noise of strength ","element":"span"},{"style":{"height":17.6},"width":75.98,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-20.png","element":"img","alt":" ε/3.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"Given our mistake bound for online learning Pauli channels (Corollary ","element":"span"},{"href":"#id-171","text":"24","element":"a"},{"text":") and Lemma ","element":"span"},{"href":"#id-174","text":"31 ","element":"a"},{"text":"(together with Remark ","element":"span"},{"href":"#id-178","text":"32","element":"a"},{"text":"), we only have to show that our instance space, which is the space of channel test operators, admits a total order. This can be seen as follows: Similar to Section ","element":"span"},{"href":"#id-103","text":"3.2","element":"a"},{"text":", we can associate to any channel test operator ","element":"span"},{"style":{"height":17.5},"width":93,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-21.png","element":"img","alt":" EA,B","inline":true,"padRight":true},{"text":"the vector (","element":"span"},{"style":{"height":23.29},"width":763.96,"height":58.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-22.png","element":"img","alt":"ez,x = Tr[EA,BΓz,xA,B])z,x∈{0,1}n ∈ [0, 1]4n.","inline":true,"padRight":true},{"text":"The mapping ","element":"span"},{"style":{"height":19.95},"width":437.41,"height":49.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-23.png","element":"img","alt":" EA,B ↦ (ez,x)z,x∈{0,1}n","inline":true,"padRight":true},{"text":"is injective, because the operators Γ","element":"span"},{"style":{"height":22.14},"width":60.78,"height":55.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-24.png","element":"img","alt":"z,xA,B","inline":true,"padRight":true},{"text":"form an orthogonal ","element":"span"},{"text":"basis. 
Thus, any total order on [0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]","element":"span"},{"style":{"height":8.9},"width":34.94,"height":22.25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-25.png","element":"img","alt":"4n","inline":true},{"text":", for instance the lexicographic order, induces a total order on the set of channel test operators. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-26.png","element":"img","alt":"■","inline":true}],[{"text":"Corollary ","element":"span"},{"href":"#id-179","text":"33 ","element":"a"},{"text":"implies that, if we care about the statistics of quantum experiments, then Pauli channels admit a significantly more parsimonious representation than via an exponentially long vector of Pauli error rates. This can be illustrated as follows: Suppose ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A","element":"span"},{"text":"(lice) and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B","element":"span"},{"text":"(ob) want to understand how an unknown Pauli channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A","element":"span"},{"text":"’s lab acts on the channel test operators ","element":"span"},{"style":{"height":26.85},"width":93,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-27.png","element":"img","alt":" E(i)A,B","inline":true},{"text":", ","element":"span"},{"text":"1 ","element":"span"},{"style":{"height":14},"width":158.28,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-28.png","element":"img","alt":" ≤ i ≤ m","inline":true},{"text":". 
To do so, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"performs experiments and collects data ","element":"span"},{"style":{"height":26.85},"width":371.2,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-29.png","element":"img","alt":" S = {(E(i)A,B, yi)}mi=1","inline":true},{"text":", where the ","element":"span"},{"style":{"height":11.6},"width":33.4,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-30.png","element":"img","alt":" yi","inline":true,"padRight":true},{"text":"are (","element":"span"},{"style":{"height":17.6},"width":42.34,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-31.png","element":"img","alt":"ε/","inline":true},{"text":"3)-approximations of the corresponding expectation values Tr[","element":"span"},{"style":{"height":26.85},"width":188.08,"height":67.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-32.png","element":"img","alt":"E(i)A,BCNA,B","inline":true},{"text":"]. She now wants ","element":"span"},{"text":"to communicate her findings to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B","element":"span"},{"text":". 
Then, no matter how large ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m ","element":"span"},{"text":"is, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"can compress ","element":"span"},{"style":{"fontStyle":"italic"},"text":"S ","element":"span"},{"text":"to a set of size at most ","element":"span"},{"style":{"height":28.8},"width":132.83,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-33.png","element":"img","alt":" O(n/ε²)","inline":true},{"text":"data points, send those to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B","element":"span"},{"text":", and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B ","element":"span"},{"text":"can ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/43-34.png","element":"img","alt":" ε","inline":true},{"text":"-approximately reconstruct the expectation values for all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m ","element":"span"},{"text":"test operators without doing any further experiments.","element":"span"}]]},{"heading":"4 Mistake lower bounds","paragraphs":[[{"text":"In the previous section, we provided regret and mistake upper bounds for online learning certain subclasses of quantum channels and multi-time processes. In this section, we prove complementary mistake lower bounds. While these, in principle, lead to regret lower bounds via (the contrapositive of) Lemma ","element":"span"},{"href":"#id-83","text":"12","element":"a"},{"text":", we only discuss mistake lower bounds here. 
Throughout this section, our focus is on ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-0.png","element":"img","alt":" ε","inline":true},{"text":"-mistake lower bounds for a constant ","element":"span"},{"style":{"height":17.6},"width":122.33,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-1.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2. Thus, our lower bounds do not scale with ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-2.png","element":"img","alt":" ε","inline":true},{"text":". We conjecture that the (1","element":"span"},{"style":{"height":19.53},"width":59.17,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-3.png","element":"img","alt":"/ε2","inline":true},{"text":")-scaling achieved in Sections ","element":"span"},{"href":"#id-102","text":"3.1 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-103","text":"3.2 ","element":"a"},{"text":"is optimal, but leave the proof to future work.","element":"span"}],[{"id":"id-29","style":{"fontWeight":"bold"},"text":"4.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Mistake lower bounds for general unitaries and channels","element":"span"}],[{"text":"We first recall the folklore result that the class of arbitrary Boolean functions on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"bits cannot be online learned with subexponentially many mistakes.","element":"span"}],[{"id":"id-181","style":{"fontWeight":"bold"},"text":"Lemma 34 ","element":"span"},{"text":"(Arbitrary Boolean functions cannot be online learned with subexponentially many mistakes)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":20.34},"width":296.08,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-4.png","element":"img","alt":" F = {0, 1}{0,1}n","inline":true,"padRight":true},{"text":"be the class of all Boolean functions on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"bits. Any online learner for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"makes at least 2","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-5.png","element":"img","alt":"n","inline":true,"padRight":true},{"text":"mistakes against a worst-case adversary. This remains true even if the adversary is forced to decide on a labeling function before the interaction with the learner.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"The class ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":"-valued functions on ","element":"span"},{"style":{"height":17.6},"width":126.79,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-6.png","element":"img","alt":" {0, 1}n ","inline":true,"padRight":true},{"text":"has VC-dimension [","element":"span"},{"href":"#id-180","referenceIndex":125,"text":"125","element":"a"},{"text":"] VCdim(","element":"span"},{"style":{"height":17.2},"width":152.22,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-7.png","element":"img","alt":"F) = 2n","inline":true,"padRight":true},{"text":"and thus 
Littlestone dimension [","element":"span"},{"href":"#id-20","referenceIndex":37,"text":"37","element":"a"},{"text":"] Ldim(","element":"span"},{"style":{"height":17.6},"width":422.06,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-8.png","element":"img","alt":"F) ≥ VCdim(F) = 2n","inline":true},{"text":". Essentially by definition of the Littlestone dimension, any online learner for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"makes at least Ldim(","element":"span"},{"style":{"height":17.2},"width":153.04,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-9.png","element":"img","alt":"F) ≥ 2n ","inline":true,"padRight":true},{"text":"mistakes. This proves the first part of the statement.","element":"span"}],[{"text":"Now for the second part of the statement. Fix an arbitrary learning algorithm. Consider an adversary that initially chooses a function uniformly at random from ","element":"span"},{"style":{"height":20.34},"width":202.69,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-10.png","element":"img","alt":" {0, 1}{0,1}n","inline":true,"padRight":true},{"text":"and, in round 1 ","element":"span"},{"style":{"height":14.33},"width":175.18,"height":35.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-11.png","element":"img","alt":" ≤ t ≤ 2n","inline":true},{"text":", asks for the label of ","element":"span"},{"style":{"height":10.62},"width":40.75,"height":26.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-12.png","element":"img","alt":" xt","inline":true},{"text":", where ","element":"span"},{"style":{"height":19.67},"width":149.05,"height":49.17,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-13.png","element":"img","alt":" {xs}2ns=1","inline":true,"padRight":true},{"text":"is some (fixed) enumeration of 
","element":"span"},{"style":{"height":17.6},"width":128.54,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-14.png","element":"img","alt":" {0, 1}n","inline":true},{"text":". Let ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"denote the function-valued random variable describing the function chosen by the adversary, let ","element":"span"},{"style":{"height":18.77},"width":423.95,"height":46.92,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-15.png","element":"img","alt":"St = {(xτ, F(xτ))}tτ=1 ","inline":true,"padRight":true},{"text":"denote the instance-label pairs that the online learner has seen in the first ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t ","element":"span"},{"text":"rounds. Moreover, let ","element":"span"},{"style":{"height":15.82},"width":80.9,"height":39.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-16.png","element":"img","alt":" Yt+1","inline":true,"padRight":true},{"text":"be the label predicted by the online learner in round ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t ","element":"span"},{"text":"+ 1. Note that the random variable ","element":"span"},{"style":{"height":15.82},"width":80.9,"height":39.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-17.png","element":"img","alt":" Yt+1","inline":true,"padRight":true},{"text":"depends only on ","element":"span"},{"style":{"height":14.62},"width":38.76,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-18.png","element":"img","alt":" St","inline":true,"padRight":true},{"text":"(and on the internal randomness of the online learner). 
Thus, as ","element":"span"},{"style":{"height":17.6},"width":135.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-19.png","element":"img","alt":" F(xt+1","inline":true},{"text":") is uniformly random and independent of ","element":"span"},{"style":{"height":14.62},"width":38.76,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-20.png","element":"img","alt":" St","inline":true,"padRight":true},{"text":"(as well as of the online learning algorithm), also ","element":"span"},{"style":{"height":17.6},"width":317.65,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-21.png","element":"img","alt":" Yt+1 and F(xt+1","inline":true},{"text":") are independent. Therefore,","element":"span"}],[{"style":{"width":"76%"},"width":1437,"height":227,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-22.png","element":"img"}],[{"text":"Hence, in each round, the online learner makes a mistake with probability 1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"2. As mistakes occur independently in each round, the probability that the online learner makes a mistake in every round is strictly greater than zero. Thus, there exists a function ","element":"span"},{"style":{"height":17.6},"width":369.37,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-23.png","element":"img","alt":" f : {0, 1}n → {0, 1}","inline":true,"padRight":true},{"text":"that, when chosen initially by the adversary, forces the online learner to make 2","element":"span"},{"style":{"height":12},"width":277.76,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/44-24.png","element":"img","alt":"n mistakes. 
■","inline":true}],[{"text":"We can now embed the problem of online learning an arbitrary classical Boolean function into that of learning an arbitrary quantum channel and therefore inherit similar mistake lower bounds. In the following, we describe two different ways of achieving such lower bounds.","element":"span"}],[{"id":"id-188","style":{"fontWeight":"bold"},"text":"Corollary 35. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":14.62},"width":52.52,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-0.png","element":"img","alt":" Un","inline":true,"padRight":true},{"text":"be the class of all unitary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels, let ","element":"span"},{"style":{"height":17.6},"width":122.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-1.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2. Any online learner for ","element":"span"},{"style":{"height":17.2},"width":461.32,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-2.png","element":"img","alt":" Un makes Ω(2n) many ε","inline":true},{"text":"-mistakes against a worst-case adversary. This remains true even if the adversary is forced to decide on a unitary before the interaction with the learner.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Our proof is via reduction to Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":". 
To do so, we associate to every ","element":"span"},{"style":{"height":19.13},"width":407.65,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-3.png","element":"img","alt":" f : {0, 1}n−1 → {0, 1}","inline":true,"padRight":true},{"text":"the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit unitary ","element":"span"},{"style":{"height":17.64},"width":48.8,"height":44.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-4.png","element":"img","alt":" Uf","inline":true,"padRight":true},{"text":"defined via ","element":"span"},{"style":{"height":18.84},"width":442.74,"height":47.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-5.png","element":"img","alt":" Uf|x, b⟩ = |x, b ⊕ f(x)⟩","inline":true,"padRight":true},{"text":"for ","element":"span"},{"style":{"height":19.13},"width":256.16,"height":47.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-6.png","element":"img","alt":" x ∈ {0, 1}n−1","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.6},"width":186.22,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-7.png","element":"img","alt":" b ∈ {0, 1}","inline":true},{"text":". We denote the corresponding unitary channel by ","element":"span"},{"style":{"height":20.38},"width":66.1,"height":50.95,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-8.png","element":"img","alt":" Uf","inline":true},{"text":". 
Now, if we consider a channel test operator ","element":"span"},{"style":{"height":20.24},"width":1878.18,"height":50.6,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-9.png","element":"img","alt":"EA,B(x) = ρA(x)T ⊗MB(x) with ρA(x) = |x, 0⟩⟨x, 0| and MB(x) = |x, 1⟩⟨x, 1| for some x ∈ {0, 1}n−1,","inline":true,"padRight":true},{"text":"then","element":"span"}],[{"style":{"width":"81%"},"width":1529,"height":58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-10.png","element":"img"}],[{"text":"Thus, if ","element":"span"},{"style":{"height":17.6},"width":122.5,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-11.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2, then any online learner for ","element":"span"},{"style":{"height":14.62},"width":52.51,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-12.png","element":"img","alt":" Un","inline":true,"padRight":true},{"text":"that makes at most ","element":"span"},{"style":{"height":12},"width":75.37,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-13.png","element":"img","alt":" N ε","inline":true},{"text":"-mistakes gives rise to an online learner for ","element":"span"},{"style":{"height":21.64},"width":239.38,"height":54.09,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-14.png","element":"img","alt":" {0, 1}{0,1}n−1 ","inline":true,"padRight":true},{"text":"that makes at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"mistakes, by rounding the produced estimates to obtain a label in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":". 
Hence, by Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":", we conclude that ","element":"span"},{"style":{"height":19.13},"width":431.4,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-15.png","element":"img","alt":" N ≥ 2n−1 ≥ Ω(2n). ■","inline":true}],[{"id":"id-182","style":{"fontWeight":"bold"},"text":"Corollary 36. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":15.02},"width":134.34,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-16.png","element":"img","alt":" CPTPn","inline":true,"padRight":true},{"text":"be the class of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit quantum channels, let ","element":"span"},{"style":{"height":17.6},"width":122.59,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-17.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2. Any online learner for ","element":"span"},{"style":{"height":17.2},"width":540.39,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-18.png","element":"img","alt":" CPTPn makes Ω(2n) many ε","inline":true},{"text":"-mistakes against a worst-case adversary. This remains true even if the adversary is forced to decide on a channel before the interaction with the learner.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Our proof is via reduction to Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":". 
To do so, we associate to every ","element":"span"},{"style":{"height":17.6},"width":364.37,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-19.png","element":"img","alt":" f : {0, 1}n → {0, 1}","inline":true,"padRight":true},{"text":"the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channel ","element":"span"},{"style":{"height":19.64},"width":54.8,"height":49.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-20.png","element":"img","alt":" Nf","inline":true,"padRight":true},{"text":"defined via","element":"span"}],[{"style":{"width":"76%"},"width":1437,"height":100,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-21.png","element":"img"}],[{"text":"Now, if we consider a channel test operator ","element":"span"},{"style":{"height":18.7},"width":514.48,"height":46.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-22.png","element":"img","alt":" EA,B(x) = ρA(x)T ⊗ MB(x","inline":true},{"text":") with ","element":"span"},{"style":{"height":17.6},"width":271.9,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-23.png","element":"img","alt":" ρA(x) = |x⟩⟨x|","inline":true,"padRight":true},{"text":"and","element":"span"}],[{"style":{"width":"99%"},"width":1872,"height":134,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-24.png","element":"img"}],[{"text":"Thus, if ","element":"span"},{"style":{"height":17.6},"width":131,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-25.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2, then any online learner for ","element":"span"},{"style":{"height":22.12},"width":509.88,"height":55.31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-26.png","element":"img","alt":" CPTPn ⊃ {Nf}f∈{0,1}{0,1}n","inline":true,"padRight":true},{"text":"that makes at most 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-27.png","element":"img","alt":"ε","inline":true},{"text":"-mistakes gives rise to an online learner for ","element":"span"},{"style":{"height":20.33},"width":200.95,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-28.png","element":"img","alt":" {0, 1}{0,1}n ","inline":true,"padRight":true},{"text":"that makes at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"mistakes, by rounding the produced estimates to obtain a label in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":". Hence, by Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":", we conclude ","element":"span"},{"style":{"height":14.4},"width":219.88,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/45-29.png","element":"img","alt":" N ≥ 2n. ■","inline":true}],[{"text":"These lower bounds demonstrate that, unsurprisingly, arbitrary quantum channels cannot be online learned with a number of mistakes that scales efficiently with the system size. This should be contrasted with the case of states: Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"] proved that we can online learn the class of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit states with a number of mistakes that grows only linearly in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":". 
Moreover, the exponential mistake lower bounds above motivate the focus on restricted subclasses of channels, such as channels of bounded gate complexity or mixtures of known channels, which we considered in Sections ","element":"span"},{"href":"#id-102","text":"3.1 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-103","text":"3.2","element":"a"},{"text":". In the next two subsections, we prove mistake lower bounds to be juxtaposed with the upper bounds established in Sections ","element":"span"},{"href":"#id-102","text":"3.1 ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-103","text":"3.2","element":"a"},{"text":".","element":"span"}],[{"id":"id-104","style":{"fontWeight":"bold"},"text":"4.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Mistake lower bounds for channels of bounded complexity","element":"span"}],[{"text":"Recall that Theorem ","element":"span"},{"href":"#id-150","text":"20 ","element":"a"},{"text":"and Corollary ","element":"span"},{"href":"#id-151","text":"21 ","element":"a"},{"text":"established regret and mistake upper bounds for online learning channels of gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"that scaled effectively linearly in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":". Our next result, which follows by combining Corollary ","element":"span"},{"href":"#id-182","text":"36 ","element":"a"},{"text":"with a “zooming in” on a suitable subset of qubits (as previously employed in Ref. [","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":"]), shows that this scaling is essentially optimal.","element":"span"}],[{"id":"id-35","style":{"fontWeight":"bold"},"text":"Corollary 37 ","element":"span"},{"text":"(Essentially optimal scaling)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-0.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"be the class of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-gate channels, i.e., channels of gate complexity (at most) ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":", and let ","element":"span"},{"style":{"height":17.6},"width":128.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-1.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2. Any online learner for ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-2.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"makes Ω(min","element":"span"},{"style":{"height":17.6},"width":142.28,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-3.png","element":"img","alt":"{2n, G}","inline":true},{"text":") many ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-4.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes against a worst-case adversary. This remains true even if the adversary is forced to decide on a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-gate channel before the interaction with the learner.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"As we explain in Remark ","element":"span"},{"href":"#id-183","text":"40 ","element":"a"},{"text":"below, for ","element":"span"},{"style":{"height":14.4},"width":123.5,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-5.png","element":"img","alt":" G ≤ n","inline":true,"padRight":true},{"text":"the claimed lower bound follows from our analysis for Pauli channels. Therefore, for the rest of this proof, we consider only ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G > n","element":"span"},{"text":".","element":"span"}],[{"text":"Recall that there is a universal constant ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C > ","element":"span"},{"text":"0 such that every Boolean function ","element":"span"},{"style":{"height":19.53},"width":245.98,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-6.png","element":"img","alt":" f : {0, 1}k →","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"} ","element":"span"},{"text":"can be implemented with a de Morgan circuit of size at most ","element":"span"},{"href":"#id-184","referenceIndex":126,"style":{"height":19.54},"width":234.18,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-7.png","element":"img","alt":" C ·2k/k [126","inline":true},{"text":"]. Here, a de Morgan circuit is a circuit consisting of AND, OR, and NOT gates, where the AND and OR gates have fan-in two. 
Hence, as any classical two-bit gate can be implemented by a two-qubit quantum channel gate, and as a computational basis measurement on a single qubit corresponds to one single-qubit channel gate, we see that every ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k","element":"span"},{"text":"-qubit channel ","element":"span"},{"style":{"height":19.64},"width":54.8,"height":49.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-8.png","element":"img","alt":" Nf","inline":true,"padRight":true},{"text":"as in the proof of Corollary ","element":"span"},{"href":"#id-182","text":"36","element":"a"},{"text":", with ","element":"span"},{"style":{"height":19.53},"width":374.12,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-9.png","element":"img","alt":" f : {0, 1}k → {0, 1},","inline":true,"padRight":true},{"text":"can be implemented with ","element":"span"},{"style":{"height":19.54},"width":406.47,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-10.png","element":"img","alt":" C · 2k/k + k ≤ C2k+1 ","inline":true,"padRight":true},{"text":"many two-qubit channel gates. 
Therefore, if we set ","element":"span"},{"style":{"height":20.75},"width":1184.8,"height":51.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-11.png","element":"img","alt":"k = ⌊log2(G/2C)⌋, then CPTPn,G ⊇ {Nf ⊗ idn−q}f:{0,1}q→{0,1}","inline":true},{"text":", where we consider the channels ","element":"span"},{"style":{"height":19.64},"width":54.8,"height":49.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-12.png","element":"img","alt":" Nf","inline":true,"padRight":true},{"text":"from the proof of Corollary ","element":"span"},{"href":"#id-182","text":"36 ","element":"a"},{"text":"and where we set ","element":"span"},{"style":{"fontStyle":"italic"},"text":"q ","element":"span"},{"text":"= min","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"style":{"fontStyle":"italic"},"text":"n, k","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":". We can now straightforwardly modify the states and effect operators used in the proof of Corollary ","element":"span"},{"href":"#id-182","text":"36 ","element":"a"},{"text":"(by attaching, say, the all-zero state on the last ","element":"span"},{"style":{"height":10.8},"width":99.55,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-13.png","element":"img","alt":" n − q","inline":true,"padRight":true},{"text":"qubits) to show that the ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-14.png","element":"img","alt":" ε","inline":true},{"text":"-mistake bound from Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":", with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"replaced by ","element":"span"},{"style":{"fontStyle":"italic"},"text":"q","element":"span"},{"text":", applies to 
","element":"span"},{"style":{"height":20.75},"width":483.74,"height":51.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-15.png","element":"img","alt":" {Nf ⊗ idn−q}f:{0,1}q→{0,1}","inline":true},{"text":". Because of the previously observed inclusion, we conclude that also ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-16.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"comes with an ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-17.png","element":"img","alt":" ε","inline":true},{"text":"-mistake lower bound of 2","element":"span"},{"style":{"height":17.6},"width":784.74,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-18.png","element":"img","alt":"q ≥ min{2n, G/4C} = Ω(min{2n, G}). ■","inline":true}],[{"text":"Corollary ","element":"span"},{"href":"#id-35","text":"37 ","element":"a"},{"text":"establishes a mistake lower bound for online learning the class ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-19.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"whose ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-dependence matches our upper bound up to logarithmic factors. However, as the construction in the proof above uses a measurement followed by an in general non-reversible classical circuit, it is far from unitary. 
Therefore, we next give an alternative proof for Corollary ","element":"span"},{"href":"#id-35","text":"37 ","element":"a"},{"text":"that, while still using non-unitary building blocks, is motivated by reversible computation and may therefore serve as a stepping stone towards an analogue of Corollary ","element":"span"},{"href":"#id-35","text":"37 ","element":"a"},{"text":"for unitary channels.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Alternative proof of Corollary ","element":"span"},{"href":"#id-35","style":{"fontStyle":"italic"},"text":"37","element":"a"},{"style":{"fontStyle":"italic"},"text":". ","element":"span"},{"text":"As before, Ref. [","element":"span"},{"href":"#id-184","referenceIndex":126,"text":"126","element":"a"},{"text":"] tells us that every Boolean function ","element":"span"},{"style":{"fontStyle":"italic"},"text":"f ","element":"span"},{"text":": ","element":"span"},{"style":{"height":19.53},"width":308.72,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-20.png","element":"img","alt":"{0, 1}k → {0, 1}","inline":true,"padRight":true},{"text":"can be implemented with a de Morgan circuit of size at most ","element":"span"},{"style":{"height":19.53},"width":155.42,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-21.png","element":"img","alt":" C · 2k/k","inline":true},{"text":". As any OR gate can be rewritten in terms of three NOT gates and one AND gate, we can also achieve such implementations with circuit size ","element":"span"},{"style":{"height":19.53},"width":195.9,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/46-22.png","element":"img","alt":" C · 2k+2/k","inline":true,"padRight":true},{"text":"using only AND and NOT gates. The NOT gate can trivially be quantumly implemented by the Pauli-","element":"span"},{"style":{"fontStyle":"italic"},"text":"X ","element":"span"},{"text":"gate. Using what Ref. 
[","element":"span"},{"href":"#id-185","referenceIndex":127,"text":"127","element":"a"},{"text":"] called the AND/NAND gate, and which (with a different ordering convention for the inputs) is now known as the Toffoli gate, we can implement an AND with a reversible three-bit gate when the “source” is a suitable constant bit, thereby producing two garbage output bits in the “sink”. Therefore, we can quantumly implement any two-bit AND gate using one three-qubit unitary in conjunction with a ","element":"span"},{"text":"single-qubit channel gate that resets one of the “sink” qubits to the needed constant input, so that it can serve as a “source” for the next AND. The reset also allows us to use a single auxiliary qubit only throughout. Consequently, as any three-qubit unitary can be decomposed into a constant number of two-qubit unitaries, we can implement every function ","element":"span"},{"style":{"height":19.53},"width":366.18,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-0.png","element":"img","alt":" f : {0, 1}k → {0, 1}","inline":true,"padRight":true},{"text":"with a (","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"+ 1)-qubit quantum circuit of size ","element":"span"},{"style":{"height":19.53},"width":389.01,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-1.png","element":"img","alt":" C · 2k+3/k (where C","inline":true,"padRight":true},{"text":"is now a new constant).","element":"span"}],[{"text":"By this line of reasoning, if we set ","element":"span"},{"style":{"height":17.6},"width":344.54,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-2.png","element":"img","alt":" k = ⌊log2(G/8C)⌋","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.6},"width":341.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-3.png","element":"img","alt":" 
q = min{n − 1, k}","inline":true},{"text":", then ","element":"span"},{"style":{"height":17.9},"width":217.81,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-4.png","element":"img","alt":" CPTPn,G ⊇","inline":true},{"style":{"height":20.75},"width":716.92,"height":51.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-5.png","element":"img","alt":"{Nf ⊗idn−(q+1) | f : {0, 1}q → {0, 1}}","inline":true},{"text":", where we abused notation—by not writing out the restriction to computational basis inputs and measurements on the first ","element":"span"},{"style":{"fontStyle":"italic"},"text":"q","element":"span"},{"text":"+1 qubits, and by ignoring the “source” and the “sink” subsystems at the output, which we can achieve by having identity tensor factors on the corresponding subsystems of the output effect operator. At this point, we again inherit a mistake lower bound from Corollary ","element":"span"},{"href":"#id-182","text":"36","element":"a"},{"text":", which here becomes 2","element":"span"},{"style":{"height":19.14},"width":846.79,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-6.png","element":"img","alt":"q = min{2n−1, G/16C} ≥ Ω(min{2n, G}). 
■","inline":true}],[{"text":"This second proof already hints at a challenge in establishing the same Ω(min","element":"span"},{"style":{"height":17.6},"width":319.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-7.png","element":"img","alt":"{2n, G}) mistake","inline":true,"padRight":true},{"text":"lower bound for ","element":"span"},{"style":{"height":17.5},"width":88.48,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-8.png","element":"img","alt":" Un,G","inline":true},{"text":", the class of all unitary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels of gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":". Namely, when aiming to implement a unitary ","element":"span"},{"style":{"height":18.84},"width":493.97,"height":47.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-9.png","element":"img","alt":" Uf for f : {0, 1}q → {0, 1}","inline":true},{"text":", a natural approach is to take a classical circuit implementation for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"f ","element":"span"},{"text":"and make it reversible. Above, we relied on Toffoli’s construction to achieve this with a small overhead in gate complexity. However, since that construction needs a specific constant input in the “source”, this required us to reset (some of) our “sink” qubits. Such a reset is a non-reversible operation. Without the ability to reset, making the implementation reversible naively requires adding one auxiliary qubit per gate, thus exceeding the number of available qubits if ","element":"span"},{"style":{"height":14.4},"width":118.5,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-10.png","element":"img","alt":" G ≥ n","inline":true},{"text":". 
While we do not yet know how to overcome this obstacle when only using unitary gates, the following result at least demonstrates a lower bound for the unitary case that deviates from the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-dependence in the upper bound by only a square root.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Corollary 38 ","element":"span"},{"text":"(Lower bound for the unitary case)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.5},"width":88.48,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-11.png","element":"img","alt":" Un,G","inline":true,"padRight":true},{"text":"be the class of all unitary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels of gate complexity (at most) ","element":"span"},{"style":{"height":17.6},"width":240.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-12.png","element":"img","alt":" G, let ε < 1/","inline":true},{"text":"2. Any online learner for ","element":"span"},{"style":{"height":21.04},"width":543.95,"height":52.61,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-13.png","element":"img","alt":" Un,G makes Ω(min{2n,√G})","inline":true,"padRight":true},{"text":"many ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-14.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes against a worst-case adversary. This remains true even if the adversary is forced to decide on a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":"-gate channel before the interaction with the learner.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"We first recall from Refs. [","element":"span"},{"href":"#id-186","referenceIndex":128,"text":"128","element":"a"},{"text":"–","element":"span"},{"href":"#id-187","referenceIndex":130,"text":"130","element":"a"},{"text":"]: There is a universal constant ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C > ","element":"span"},{"text":"0 such that any ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k","element":"span"},{"text":"-qubit unitary can be implemented with ","element":"span"},{"style":{"height":15.13},"width":73.69,"height":37.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-15.png","element":"img","alt":" C4k ","inline":true,"padRight":true},{"text":"many two-qubit gates. Thus, if we set ","element":"span"},{"style":{"height":17.6},"width":470.27,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-16.png","element":"img","alt":" k = ⌊log4(G/C)⌋, then G","inline":true,"padRight":true},{"text":"many two-qubit gates suffice to implement arbitrary unitaries on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"qubits. In particular, this implies that ","element":"span"},{"style":{"height":17.5},"width":350.46,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-17.png","element":"img","alt":" Un,G ⊃ Uq ⊗ idn−q","inline":true},{"text":", where ","element":"span"},{"style":{"fontStyle":"italic"},"text":"q ","element":"span"},{"text":"= min","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"style":{"fontStyle":"italic"},"text":"n, k","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":". 
Therefore, a quantum circuit with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"many unitary two-qubit gates can implement all the unitaries ","element":"span"},{"style":{"height":17.64},"width":48.79,"height":44.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-18.png","element":"img","alt":" Uf","inline":true,"padRight":true},{"text":"for functions ","element":"span"},{"style":{"height":19.14},"width":403.37,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-19.png","element":"img","alt":" f : {0, 1}q−1 → {0, 1}","inline":true,"padRight":true},{"text":"on the first ","element":"span"},{"style":{"fontStyle":"italic"},"text":"q ","element":"span"},{"text":"qubits. Hence, with a straightforward modification of the reasoning used in proving Corollary ","element":"span"},{"href":"#id-188","text":"35","element":"a"},{"text":" (tensoring the input states and output effects used there with, say, the all-zero state on the remaining ","element":"span"},{"style":{"height":10.8},"width":99.53,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-20.png","element":"img","alt":"n − q","inline":true,"padRight":true},{"text":"qubits), we inherit the ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-21.png","element":"img","alt":" ε","inline":true},{"text":"-mistake bound from Lemma ","element":"span"},{"href":"#id-181","text":"34 ","element":"a"},{"text":"with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"replaced by ","element":"span"},{"style":{"fontStyle":"italic"},"text":"q","element":"span"},{"text":". 
That is, we obtain a ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-22.png","element":"img","alt":" ε","inline":true},{"text":"-mistake lower bound of ","element":"span"},{"style":{"height":19.94},"width":1038.1,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/47-23.png","element":"img","alt":" ≥ 2^(q−1) = min{2^n/2, √(G/C)/4} = Ω(min{2^n, √G}). ■","inline":true}],[{"id":"id-105","style":{"fontWeight":"bold"},"text":"4.3 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Mistake lower bounds for Pauli channels","element":"span"}],[{"text":"To complement the linear-in-","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"mistake upper bound from Section ","element":"span"},{"href":"#id-103","text":"3.2","element":"a"},{"text":", we now give a mistake lower ","element":"span"},{"id":"id-37","text":"bound for online learning Pauli channels. Again, we obtain this as a consequence of Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":".","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Corollary 39 ","element":"span"},{"text":"(Mistake lower bound for online learning Pauli channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":15.02},"width":138.88,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-0.png","element":"img","alt":" PAULIn","inline":true,"padRight":true},{"text":"be the class of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channels, let ","element":"span"},{"style":{"height":17.6},"width":121.92,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-1.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2. 
Any online learner for ","element":"span"},{"style":{"height":17.2},"width":699.75,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-2.png","element":"img","alt":" PAULIn makes Ω(n) many ε-mistakes","inline":true,"padRight":true},{"text":"against a worst-case adversary. This remains true even if the adversary is forced to decide on a Pauli channel before the interaction with the learner.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Our proof is via reduction to Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":". To do so, we associate to any ","element":"span"},{"style":{"height":17.6},"width":424.21,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-3.png","element":"img","alt":" f : {1, . . . , n} → {0, 1}","inline":true,"padRight":true},{"text":"the unitary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channel ","element":"span"},{"style":{"height":19.64},"width":54.8,"height":49.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-4.png","element":"img","alt":" Nf","inline":true,"padRight":true},{"text":"defined via ","element":"span"},{"style":{"height":28.8},"width":713.71,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-5.png","element":"img","alt":" N_f(ρ) = (⊗_{i=1}^n Z_i^{f(i)}) ρ (⊗_{i=1}^n Z_i^{f(i)})","inline":true},{"text":". 
Now, if we consider a channel test operator ","element":"span"},{"style":{"height":20.24},"width":1189.88,"height":50.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-6.png","element":"img","alt":" EA,B(t) = ρA(t)T ⊗ MB(t) with ρA(t) = |0t−1⟩⟨0t−1| ⊗ |+⟩⟨+| ⊗","inline":true}],[{"style":{"width":"99%"},"width":1868,"height":129,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-7.png","element":"img"}],[{"text":"Thus, if ","element":"span"},{"style":{"height":17.6},"width":121.92,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-8.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2, then any online learner for ","element":"span"},{"style":{"height":15.02},"width":138.88,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-9.png","element":"img","alt":" PAULIn","inline":true,"padRight":true},{"text":"that makes at most ","element":"span"},{"style":{"height":12},"width":72.62,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-10.png","element":"img","alt":" N ε","inline":true},{"text":"-mistakes gives rise to an online learner for ","element":"span"},{"style":{"height":20.33},"width":224.31,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-11.png","element":"img","alt":" {0, 1}{1,...,n} ","inline":true,"padRight":true},{"text":"that makes at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"mistakes, by rounding the produced estimates to obtain a label in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":". 
Hence, by Lemma ","element":"span"},{"href":"#id-181","text":"34","element":"a"},{"text":", we conclude ","element":"span"},{"style":{"height":20.33},"width":703.64,"height":50.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-12.png","element":"img","alt":" N ≥ 2⌊log(n)⌋ ≥ Ω(2log(n)) = Ω(n). ■","inline":true}],[{"text":"Corollary ","element":"span"},{"href":"#id-37","text":"39 ","element":"a"},{"text":"shows that the linear-in-","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"scaling achieved in Section ","element":"span"},{"href":"#id-103","text":"3.2 ","element":"a"},{"text":"is optimal for the special case of Pauli channel online learning. This also tells us that for learning a mixture of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"known channels, the log(","element":"span"},{"style":{"fontStyle":"italic"},"text":"K","element":"span"},{"text":")-dependence in the mistake bound can in general not be improved.","element":"span"}],[{"id":"id-183","style":{"fontWeight":"bold"},"text":"Remark 40. ","element":"span"},{"text":"The channels ","element":"span"},{"style":{"height":19.64},"width":54.8,"height":49.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-13.png","element":"img","alt":" Nf","inline":true,"padRight":true},{"text":"used in the proof of Corollary ","element":"span"},{"href":"#id-37","text":"39 ","element":"a"},{"text":"are unitary channels of gate complexity ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":". 
We can use essentially the same construction to show that for ","element":"span"},{"style":{"height":14.4},"width":122.49,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-14.png","element":"img","alt":" G ≤ n","inline":true},{"text":", any online learner for ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-15.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"makes at least Ω(","element":"span"},{"style":{"height":17.2},"width":204.33,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-16.png","element":"img","alt":"G) many ε","inline":true},{"text":"-mistakes if ","element":"span"},{"style":{"height":17.6},"width":122.21,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-17.png","element":"img","alt":" ε < 1/","inline":true},{"text":"2. To see this, consider unitary ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels of the form ","element":"span"},{"style":{"height":28.8},"width":1041.19,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-18.png","element":"img","alt":" ρ ↦ (⊗_{i=1}^G Z_i^{f(i)} ⊗ 1_2^{⊗(n−G)}) ρ (⊗_{i=1}^G Z_i^{f(i)} ⊗ 1_2^{⊗(n−G)})","inline":true},{"text":"for functions ","element":"span"},{"style":{"height":17.6},"width":433.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-19.png","element":"img","alt":"f : {1, . . . , G} → {0, 1}","inline":true},{"text":", and argue as in the proof of Corollary ","element":"span"},{"href":"#id-37","text":"39","element":"a"},{"text":". 
","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-20.png","element":"img","alt":" ◀","inline":true}]]},{"heading":"5 Computational complexity lower bounds","paragraphs":[[{"id":"id-91","style":{"fontWeight":"bold"},"text":"5.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Computational complexity lower bounds for Pauli channels","element":"span"}],[{"text":"In Section ","element":"span"},{"text":"4","element":"span"},{"text":", we established mistake lower bounds for channels. ","element":"span"},{"text":"In this section, we focus on computational complexity lower bounds for online learning classes of quantum channels. In particular, we show that while we achieve polynomial mistake upper bounds for learning classes of quantum channels, the exponential computational complexity of our learning algorithms cannot be avoided under standard cryptographic assumptions. But let us first start with an ","element":"span"},{"style":{"fontStyle":"italic"},"text":"unconditional ","element":"span"},{"text":"computational hardness lower bound for online learning Pauli channels. The heart of the proof strategy lies in exploiting the fact that any polynomial time learner is only ever able to access polynomially many entries of an exponentially sized input. By the adversarial nature of the game, the adversary can always ‘hide’ the information useful for answering a challenge in an entry that was never seen by the (polynomial-time) learner. We make this intuition formal in the following result.","element":"span"}],[{"id":"id-189","style":{"fontWeight":"bold"},"text":"Theorem 41. 
","element":"span"},{"text":"Consider any polynomial-time online learner of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channels that runs in time ","element":"span"},{"style":{"height":20.33},"width":104.93,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-21.png","element":"img","alt":" q(t)(n","inline":true},{"text":") at time step ","element":"span"},{"style":{"height":17.6},"width":328.19,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-22.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":", for any polynomials ","element":"span"},{"style":{"height":20.33},"width":487.75,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-23.png","element":"img","alt":" q(1)(n), q(2)(n), . . . , q(T)(n","inline":true},{"text":"). There exists an explicit adversarial strategy that forces the learner to make ","element":"span"},{"style":{"height":24.19},"width":80.68,"height":60.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-24.png","element":"img","alt":"4n−1Q","inline":true,"padRight":true},{"text":"mistakes, where ","element":"span"},{"style":{"height":20.33},"width":683.75,"height":50.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/48-25.png","element":"img","alt":"Q = min{q(t)(n) : t ∈ {1, 2, . . . , T}}.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"Recall that for any unknown Pauli channel with associated Choi representation ","element":"span"},{"style":{"height":17.5},"width":95.84,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-0.png","element":"img","alt":" NA,B","inline":true,"padRight":true},{"text":"predicting Tr[","element":"span"},{"style":{"height":26.85},"width":191.95,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-1.png","element":"img","alt":"E(t)A,BNA,B","inline":true},{"text":"] (the task of the online learner) for a given channel observable ","element":"span"},{"style":{"height":26.85},"width":93,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-2.png","element":"img","alt":" E(t)A,B","inline":true,"padRight":true},{"text":"is ","element":"span"},{"text":"exactly equivalent to predicting ","element":"span"},{"style":{"height":19.93},"width":123.04,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-3.png","element":"img","alt":" e(t) · p","inline":true},{"text":", where ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"p ","element":"span"},{"text":"is the (unknown) error-rate distribution and the vector ","element":"span"},{"style":{"height":25.55},"width":395.82,"height":63.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-4.png","element":"img","alt":" e(t) = (e(t)z,x)z,x∈{0,1}n","inline":true,"padRight":true},{"text":"has entries ","element":"span"},{"style":{"height":26.85},"width":806.57,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-5.png","element":"img","alt":" e(t)z,x := Tr[E(t)A,BΓz,xA,B] for all z, x ∈ {0, 1}n.","inline":true}],[{"text":"We work in a simplified setting of a 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":"-valued game, in which the learner does not have to evaluate entries ","element":"span"},{"style":{"height":26.85},"width":381.4,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-6.png","element":"img","alt":" e(t)z,x := Tr[E(t)A,BΓz,xA,B","inline":true},{"text":"] of the vector ","element":"span"},{"style":{"height":25.55},"width":396.53,"height":63.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-7.png","element":"img","alt":" e(t) = (e(t)z,x)z,x∈{0,1}n","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":255.68,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-8.png","element":"img","alt":" z, x ∈ {0, 1}n","inline":true},{"text":"; the ","element":"span"},{"text":"learner directly receives the vector ","element":"span"},{"style":{"height":16.33},"width":62.57,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-9.png","element":"img","alt":" e(t) ","inline":true,"padRight":true},{"text":"(which we shall refer to as the ‘challenge’ vector), instead of the test operators ","element":"span"},{"style":{"height":26.85},"width":93,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-10.png","element":"img","alt":" E(t)A,B","inline":true},{"text":". The learning task is still not obviously computationally tractable, because ","element":"span"},{"text":"the challenge vector is of size 4","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-11.png","element":"img","alt":"n","inline":true},{"text":". 
We prove that a polynomial-time learner, who by definition only looks at polynomially-many entries of ","element":"span"},{"style":{"height":16.33},"width":62.56,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-12.png","element":"img","alt":" e(t)","inline":true},{"text":", can be forced to make ","element":"span"},{"text":"˜","element":"span"},{"text":"Ω(4","element":"span"},{"style":{"height":5.6},"width":21,"height":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-13.png","element":"img","alt":"n","inline":true},{"text":") many mistakes by a simple adversarial strategy. The adversarial strategy is to always play ","element":"span"},{"style":{"height":20.33},"width":376.84,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-14.png","element":"img","alt":" e(t) = 0 = (0, 0, . . . ,","inline":true,"padRight":true},{"text":"0), i.e., the all-zeros vector, corresponding to ","element":"span"},{"style":{"height":26.85},"width":208.27,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-15.png","element":"img","alt":" E(t)A,B = 0.9","inline":true,"padRight":true},{"text":"We now show that by using the same all-zeros ","element":"span"},{"text":"challenge vector in every round, the adversary can always contradict the learner’s prediction, and thereby claim that they made a mistake. 
In any ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":"-round interaction with the adversary, let the (deterministic) learner predict ","element":"span"},{"style":{"height":17.6},"width":203.56,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-16.png","element":"img","alt":" yt ∈ {0, 1}","inline":true,"padRight":true},{"text":"in round ","element":"span"},{"style":{"height":17.6},"width":312.23,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-17.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":". In response, the adversary claims the correct answer to be ","element":"span"},{"style":{"height":15.6},"width":160.99,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-18.png","element":"img","alt":" bt = ¬yt","inline":true},{"text":". In other words, the adversary always contradicts the learner’s prediction, and thereby forces the learner to make a mistake. Note that this contradictory feedback indeed constitutes a mistake when we take the loss function to be ","element":"span"},{"style":{"height":17.6},"width":475.04,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-19.png","element":"img","alt":" ℓt(yt) := |yt −bt|, because","inline":true},{"style":{"height":17.6},"width":910.22,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-20.png","element":"img","alt":"ℓt(yt) = |yt − (¬yt)| = 1 for all t ∈ {1, 2, . . . , T}.","inline":true}],[{"text":"Let us now prove that after the end of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds, the adversary can claim to be “consistent”, despite their contradictory feedback, as long as it does not contradict the entries that the learner has seen. 
In other words, after the end of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T ","element":"span"},{"text":"rounds, the adversary can always exhibit a ","element":"span"},{"style":{"height":16.8},"width":170.22,"height":42,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-21.png","element":"img","alt":" p∗ ∈ ∆4n","inline":true,"padRight":true},{"text":"and challenge vectors ˜","element":"span"},{"style":{"height":16.34},"width":62.56,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-22.png","element":"img","alt":"e(t)","inline":true,"padRight":true},{"text":"such that, for all ","element":"span"},{"style":{"height":20.34},"width":593.8,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-23.png","element":"img","alt":" t ∈ {1, 2, . . . , T}, ¬yt = p∗ · ˜e(t)","inline":true,"padRight":true},{"text":"and such that ˜","element":"span"},{"style":{"height":16.34},"width":62.56,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-24.png","element":"img","alt":"e(t)","inline":true,"padRight":true},{"text":"has 0-entries at the positions of the challenge vector that the online learner accessed in round ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":". 
The key to proving this is the fact that the learner is computationally bounded, and therefore, they can only query polynomially-many entries of ","element":"span"},{"style":{"height":16.33},"width":62.57,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-25.png","element":"img","alt":" e(t) ","inline":true,"padRight":true},{"text":"in every round.","element":"span"}],[{"text":"In any given round, the adversarial feedback, ","element":"span"},{"style":{"height":11.6},"width":62.48,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-26.png","element":"img","alt":" ¬yt","inline":true},{"text":", is either 0 or 1. For ","element":"span"},{"style":{"height":15.6},"width":434.24,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-27.png","element":"img","alt":" ¬yt = 0, the adversary","inline":true,"padRight":true},{"text":"claims to actually have played the all-zero challenge vector. For the rounds where the adversary claimed ","element":"span"},{"style":{"height":15.6},"width":1712.94,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-28.png","element":"img","alt":" ¬yt = 1, the adversary needs to demonstrate that, while the learner only saw all zeroes in","inline":true,"padRight":true},{"text":"such rounds, there was in fact a non-zero entry in ","element":"span"},{"style":{"height":16.33},"width":62.57,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-29.png","element":"img","alt":" e(t)","inline":true,"padRight":true},{"text":"(that the learner did not look at) and that this entry leads to ","element":"span"},{"style":{"height":19.93},"width":1521.24,"height":49.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-30.png","element":"img","alt":" p∗ · e(t) = 1. 
In fact, as we argue below, it suffices for the adversary to only claim","inline":true,"padRight":true},{"text":"that ","element":"span"},{"style":{"height":16.33},"width":62.56,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-31.png","element":"img","alt":" e(t)","inline":true,"padRight":true},{"text":"had a single non-zero entry. Vectors ","element":"span"},{"style":{"height":16.33},"width":62.57,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-32.png","element":"img","alt":" e(t)","inline":true,"padRight":true},{"text":"with this structure are indeed realized in our learning scenario by choosing ","element":"span"},{"style":{"height":26.85},"width":320.37,"height":67.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-33.png","element":"img","alt":" E(t)A,B = Γz′,x′A,B /4n","inline":true,"padRight":true},{"text":"for some ","element":"span"},{"style":{"height":17.6},"width":278.23,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-34.png","element":"img","alt":" z′, x′ ∈ {0, 1}n","inline":true},{"text":", because ","element":"span"},{"style":{"height":22.14},"width":302.41,"height":55.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-35.png","element":"img","alt":" {Γz,xA,B}z,x∈{0,1}n","inline":true,"padRight":true},{"text":"forms an orthogonal basis and Tr[(Γ","element":"span"},{"style":{"height":23.19},"width":212.24,"height":57.97,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-36.png","element":"img","alt":"z,xA,B)2] = 4n","inline":true},{"text":". 
Thus, choosing ","element":"span"},{"style":{"height":26.85},"width":318.46,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-37.png","element":"img","alt":" E(t)A,B = Γz′,x′A,B /4n","inline":true,"padRight":true},{"text":"ensures that ","element":"span"},{"style":{"height":16.33},"width":62.56,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-38.png","element":"img","alt":" e(t)","inline":true,"padRight":true},{"text":"has 4","element":"span"},{"style":{"height":11.93},"width":66.24,"height":29.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-39.png","element":"img","alt":"n −","inline":true,"padRight":true},{"text":"1 entries equal to 0 and a single entry ","element":"span"},{"style":{"height":13.02},"width":69.77,"height":32.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-40.png","element":"img","alt":" ez,x","inline":true,"padRight":true},{"text":"equal to 1, when ","element":"span"},{"style":{"height":12.4},"width":363.34,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-41.png","element":"img","alt":" z = z′ and x = x′.","inline":true}],[{"text":"Let us partition the set ","element":"span"},{"style":{"fontStyle":"italic"},"text":"R ","element":"span"},{"text":":= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"2","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . 
, T","element":"span"},{"style":{"fontStyle":"italic"},"text":"} ","element":"span"},{"text":"into two disjoint subsets as ","element":"span"},{"style":{"height":14.62},"width":242.77,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-42.png","element":"img","alt":" R = R0 ∪ R1","inline":true},{"text":", where ","element":"span"},{"style":{"height":14.62},"width":50.14,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-43.png","element":"img","alt":" R0","inline":true,"padRight":true},{"text":"is the set of time steps in which the adversary claimed the correct answer to be 0 and ","element":"span"},{"style":{"height":14.62},"width":50.14,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-44.png","element":"img","alt":" R1","inline":true,"padRight":true},{"text":"is the set of time steps in which they claimed the correct answer to be 1. Let ","element":"span"},{"style":{"height":20.34},"width":104.81,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-45.png","element":"img","alt":" q(t)(n","inline":true},{"text":") be the polynomial number of entries of ","element":"span"},{"style":{"height":16.33},"width":62.56,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-46.png","element":"img","alt":" e(t) ","inline":true,"padRight":true},{"text":"accessed by the learner at time step ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":". 
Also, let ","element":"span"},{"style":{"height":17.6},"width":522.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/49-47.png","element":"img","alt":" It ⊆ {0, 1}n ×{0, 1}n be the","inline":true,"padRight":true},{"text":"indices corresponding to the entries of ","element":"span"},{"style":{"height":16.33},"width":62.56,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-0.png","element":"img","alt":" e(t) ","inline":true,"padRight":true},{"text":"accessed by the learner in round ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":". To be consistent, the adversary sets to 0 all entries of ","element":"span"},{"style":{"height":19.93},"width":206.98,"height":49.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-1.png","element":"img","alt":" e(t) and p∗ ","inline":true,"padRight":true},{"text":"corresponding to the indices in (","element":"span"},{"style":{"height":19.38},"width":416.72,"height":48.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-2.png","element":"img","alt":"⋃t∈R0 It) ∪ (⋃t∈R1 It),","inline":true,"padRight":true},{"text":"i.e., all entries that the learner saw. 
Then, retroactively, for the rounds in which they claimed 1 to be the correct answer, the adversary can always claim that the 1-entry in both ","element":"span"},{"style":{"height":16.33},"width":62.57,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-3.png","element":"img","alt":" e(t)","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":16.33},"width":43.22,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-4.png","element":"img","alt":" p∗","inline":true,"padRight":true},{"text":"was an entry that the learner never saw, i.e., an entry whose index is in","element":"span"},{"style":{"height":20.65},"width":582.51,"height":51.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-5.png","element":"img","alt":"((⋃t∈R0 It) ∪ (⋃t∈R1 It))c. It is","inline":true,"padRight":true},{"text":"sufficient that there exists at least one such index. Then, this adversarial strategy works as long as the number of entries seen by the learner does not exceed 4","element":"span"},{"style":{"height":11.93},"width":66.24,"height":29.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-6.png","element":"img","alt":"n −","inline":true,"padRight":true},{"text":"1 (in order to account for at least one entry that the learner has not seen). 
In other words,","element":"span"}],[{"style":{"width":"67%"},"width":1258,"height":229,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-7.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.6},"width":81.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-8.png","element":"img","alt":" q0(n","inline":true},{"text":") :","element":"span"},{"style":{"height":20.33},"width":248.84,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-9.png","element":"img","alt":"= min{q(t)(n","inline":true},{"text":") : ","element":"span"},{"style":{"height":17.6},"width":147.78,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-10.png","element":"img","alt":" t ∈ R0}","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.6},"width":81.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-11.png","element":"img","alt":" q1(n","inline":true},{"text":") :","element":"span"},{"style":{"height":20.33},"width":248.84,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-12.png","element":"img","alt":"= min{q(t)(n","inline":true},{"text":") : ","element":"span"},{"style":{"height":17.6},"width":147.78,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-13.png","element":"img","alt":" t ∈ R1}","inline":true},{"text":", and the final inequality holds because ","element":"span"},{"style":{"height":17.6},"width":455.39,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-14.png","element":"img","alt":" q0(n) ≥ Q, q1(n) ≥ Q","inline":true,"padRight":true},{"text":"(recall that ","element":"span"},{"style":{"height":20.34},"width":320.96,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-15.png","element":"img","alt":" Q := 
min{q(t)(n","inline":true},{"text":") : ","element":"span"},{"style":{"height":17.6},"width":345.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-16.png","element":"img","alt":" t ∈ {1, 2, . . . , T}}","inline":true},{"text":"), and ","element":"span"},{"style":{"height":17.6},"width":319.62,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-17.png","element":"img","alt":"|R0| + |R1| = T","inline":true},{"text":". Thus, the adversary can force the learner to make mistakes for ","element":"span"},{"style":{"height":24.19},"width":80.68,"height":60.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-18.png","element":"img","alt":"(4n − 1)/Q","inline":true,"padRight":true},{"text":"many rounds. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-19.png","element":"img","alt":"■","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Remark 42. ","element":"span"},{"text":"The computational complexity of MWU (Algorithm ","element":"span"},{"href":"#id-134","text":"1","element":"a"},{"text":") for online learning convex combinations of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"known channels scales polynomially with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K","element":"span"},{"text":", which in the worst case could be exponential in the number of qubits, as is the case for general Pauli channels. 
If, however, the learner is given challenge vectors ","element":"span"},{"style":{"height":16.33},"width":62.56,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-20.png","element":"img","alt":" e(t) ","inline":true,"padRight":true},{"text":"with entries ","element":"span"},{"style":{"height":26.85},"width":693.02,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-21.png","element":"img","alt":" e(t)z,x := Tr[E(t)A,BΓz,xA,B] that are poly(n","inline":true},{"text":")-sparse (with known ","element":"span"},{"text":"sparsity structure), then an online learner can learn such a channel computationally efficiently and also saturate optimal regret and mistake bounds using MWU (Algorithm ","element":"span"},{"href":"#id-134","text":"1","element":"a"},{"text":"). A relevant example (from quantum error correction) of a class of channels that can be written as convex combinations of polynomially many known channels is that of polynomially-sparse Pauli channels with a known sparsity structure. Hence, our results imply that this class of channels is computationally efficiently online learnable with regret and mistake bounds that scale with log(","element":"span"},{"style":{"height":17.6},"width":103.83,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-22.png","element":"img","alt":"n). ◀","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Remark 43. ","element":"span"},{"text":"The proof strategy in Theorem ","element":"span"},{"href":"#id-189","text":"41 ","element":"a"},{"text":"straightforwardly implies that any polynomial time learner for online learning quantum ","element":"span"},{"style":{"fontStyle":"italic"},"text":"states ","element":"span"},{"text":"(in the sense defined in Ref. 
[","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":"]) can be forced to make exponentially many ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-23.png","element":"img","alt":" ε","inline":true},{"text":"-mistakes as long as the ‘challenge’ effect operators admit exponentially long descriptions. To see this, write, in its spectral decomposition, any ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit state ","element":"span"},{"style":{"height":20.59},"width":356.58,"height":51.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-24.png","element":"img","alt":" ρ = ∑2ni=1 pi|ψi⟩⟨ψi|","inline":true,"padRight":true},{"text":"that a learner wishes to online learn. For the lower bound, it suffices to work in a simpler scenario in which the learner knows the eigenbasis ","element":"span"},{"style":{"height":19.88},"width":169.54,"height":49.7,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-25.png","element":"img","alt":" {|ψi⟩}2ni=1","inline":true,"padRight":true},{"text":"in advance but not the eigenvalues ","element":"span"},{"style":{"height":19.88},"width":133.98,"height":49.7,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-26.png","element":"img","alt":" {pi}2ni=1","inline":true},{"text":". For ","element":"span"},{"text":"every effect operator ","element":"span"},{"style":{"fontStyle":"italic"},"text":"E","element":"span"},{"text":", we have Tr[","element":"span"},{"style":{"height":20.59},"width":453.02,"height":51.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-27.png","element":"img","alt":"ρE] = ∑2ni=1 pi⟨ψi|E|ψi⟩","inline":true},{"text":". 
Defining a vector ","element":"span"},{"style":{"height":19.88},"width":212.39,"height":49.7,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-28.png","element":"img","alt":" e = (ei)2ni=1","inline":true,"padRight":true},{"text":"with ","element":"span"},{"text":"entries ","element":"span"},{"style":{"height":17.6},"width":269.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-29.png","element":"img","alt":" ei = ⟨ψi|E|ψi⟩","inline":true},{"text":", we can rewrite this as Tr[","element":"span"},{"style":{"height":17.6},"width":209.98,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-30.png","element":"img","alt":"ρE] = p · e","inline":true},{"text":", where ","element":"span"},{"style":{"height":19.88},"width":209.78,"height":49.7,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-31.png","element":"img","alt":" p = (pi)2ni=1","inline":true},{"text":". Since ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"e ","element":"span"},{"text":"is still an ","element":"span"},{"text":"exponentially long challenge vector, the strategy in the proof of Theorem ","element":"span"},{"href":"#id-189","text":"41 ","element":"a"},{"text":"suffices to make the online learner make exponentially many mistakes. 
","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-32.png","element":"img","alt":" ◀","inline":true}],[{"id":"id-89","style":{"fontWeight":"bold"},"text":"5.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Computational complexity lower bounds for channels of bounded complexity","element":"span"}],[{"text":"The online learner for ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-33.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"presented in Section ","element":"span"},{"href":"#id-102","text":"3.1 ","element":"a"},{"text":"achieves good regret and ","element":"span"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/50-34.png","element":"img","alt":" ε","inline":true},{"text":"-mistake bounds, but is computationally inefficient. In this section, we prove that, under a widely held cryptographic hardness assumption, namely hardness of ","element":"span"},{"text":"RingLWE","element":"span"},{"text":", there cannot be a computationally efficient online learning algorithm for ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-0.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"that achieves favorably scaling mistake bounds. 
Via Lemma ","element":"span"},{"href":"#id-83","text":"12","element":"a"},{"text":", this also implies that good regret bounds for online learning ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-1.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"cannot be achieved computationally efficiently, but we again focus on mistake bounds here. Our proof, which is conceptually analogous to arguments in [","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":", ","element":"span"},{"href":"#id-190","referenceIndex":131,"text":"131","element":"a"},{"text":"], is yet another instance of the well-known fact that pseudorandom functions cannot be learned efficiently, which we phrase in an online learning framework.","element":"span"}],[{"text":"First, we recall the definition of pseudorandom functions.","element":"span"}],[{"id":"id-193","style":{"fontWeight":"bold"},"text":"Definition 44 ","element":"span"},{"text":"(Pseudorandom functions (PRFs) [","element":"span"},{"href":"#id-191","referenceIndex":132,"text":"132","element":"a"},{"text":"])","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":12},"width":26,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-2.png","element":"img","alt":" λ","inline":true,"padRight":true},{"text":"be a security parameter. Let ","element":"span"},{"style":{"fontStyle":"italic"},"text":"K ","element":"span"},{"text":"= ","element":"span"},{"style":{"height":17.6},"width":163.98,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-3.png","element":"img","alt":"{Kλ}λ∈N","inline":true,"padRight":true},{"text":"be a key space, assumed to be efficiently sampleable. 
Let ","element":"span"},{"style":{"height":17.6},"width":438.17,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-4.png","element":"img","alt":" X = {Xλ}λ∈N, {Yλ}λ∈N","inline":true,"padRight":true},{"text":"be families of finite sets. Let ","element":"span"},{"style":{"height":17.6},"width":246.12,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-5.png","element":"img","alt":" F = {fλ}λ∈N","inline":true,"padRight":true},{"text":"be a family of efficiently-computable functions ","element":"span"},{"style":{"height":16},"width":357.89,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-6.png","element":"img","alt":" fλ : Kλ × Xλ → Yλ","inline":true},{"text":", where the input from ","element":"span"},{"style":{"height":15.24},"width":53.25,"height":38.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-7.png","element":"img","alt":" Kλ","inline":true,"padRight":true},{"text":"corresponds to the function key. 
The family ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"is a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"pseudorandom function (family) secure against (classical) ","element":"span"},{"style":{"height":17.2},"width":74.84,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-8.png","element":"img","alt":" t(λ)","inline":true},{"style":{"fontStyle":"italic"},"text":"-time adversaries ","element":"span"},{"text":"if for every ","element":"span"},{"style":{"height":17.2},"width":58.39,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-9.png","element":"img","alt":" t(λ","inline":true},{"text":")-time probabilistic algorithm ","element":"span"},{"text":"Adv","element":"span"},{"text":", there exists a negligible function ","element":"span"},{"style":{"height":17.6},"width":103.12,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-10.png","element":"img","alt":" negl(·","inline":true},{"text":")—that is, a function satisfying ","element":"span"},{"style":{"height":27.46},"width":187.6,"height":68.66,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-11.png","element":"img","alt":"negl(λ)p(λ) = o","inline":true},{"text":"(1) for every polynomial ","element":"span"},{"style":{"fontStyle":"italic"},"text":"p","element":"span"},{"text":"—such that, for every security parameter ","element":"span"},{"style":{"height":12.8},"width":110.79,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-12.png","element":"img","alt":" λ ∈ N","inline":true},{"text":", it holds that","element":"span"}],[{"style":{"width":"77%"},"width":1451,"height":146,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-13.png","element":"img"}],[{"text":"where the key ","element":"span"},{"style":{"fontWeight":"bold"},"text":"k ","element":"span"},{"text":"is drawn uniformly at 
random from ","element":"span"},{"style":{"height":16},"width":175.02,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-14.png","element":"img","alt":" Kλ and g","inline":true,"padRight":true},{"text":"is drawn uniformly at random from the set of all functions from ","element":"span"},{"style":{"height":14.84},"width":51.13,"height":37.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-15.png","element":"img","alt":" Xλ","inline":true,"padRight":true},{"text":"to ","element":"span"},{"style":{"height":14.84},"width":49.16,"height":37.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-16.png","element":"img","alt":" Yλ","inline":true},{"text":". Here, we use ","element":"span"},{"text":"Adv ","element":"span"},{"text":"with a function superscript to mean the action of the algorithm ","element":"span"},{"text":"Adv ","element":"span"},{"text":"when given oracle access to that function. ","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-17.png","element":"img","alt":" ◀","inline":true}],[{"text":"Typically, the runtime ","element":"span"},{"style":{"height":17.2},"width":44.39,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-18.png","element":"img","alt":" t(·","inline":true},{"text":") of interest in this definition is taken to be polynomial. 
However, other choices of ","element":"span"},{"style":{"height":17.6},"width":44.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-19.png","element":"img","alt":" t(·","inline":true},{"text":") are possible.","element":"span"}],[{"text":"Next, we formalize the hardness of learning PRFs in the context of online learning:","element":"span"}],[{"id":"id-192","style":{"fontWeight":"bold"},"text":"Theorem 45 ","element":"span"},{"text":"(Hardness of learning PRFs in online learning)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Take the security parameter to be ","element":"span"},{"style":{"height":12},"width":112.16,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-20.png","element":"img","alt":"λ = n","inline":true},{"text":". Let ","element":"span"},{"style":{"height":17.6},"width":248.49,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-21.png","element":"img","alt":" F = {fλ}λ∈N","inline":true,"padRight":true},{"text":"be a PRF that is secure against classical ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"))-time adversaries. 
Let ∆ : ","element":"span"},{"style":{"height":12.4},"width":131.44,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-22.png","element":"img","alt":" N → N","inline":true,"padRight":true},{"text":"be a polynomial and let ","element":"span"},{"style":{"height":15.2},"width":190.02,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-23.png","element":"img","alt":" p : N → N","inline":true,"padRight":true},{"text":"be a function such that ","element":"span"},{"style":{"height":27.46},"width":471.68,"height":68.66,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-24.png","element":"img","alt":" p(n), ln( ∆(n)∆(n)−1) ≤ O(t(n","inline":true},{"text":")). ","element":"span"},{"text":"Suppose ","element":"span"},{"style":{"height":14.8},"width":86.23,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-25.png","element":"img","alt":" G ⊆","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]","element":"span"},{"style":{"height":12.8},"width":95.16,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-26.png","element":"img","alt":"{0,1}n","inline":true,"padRight":true},{"text":"is a function class such that ","element":"span"},{"style":{"height":14.8},"width":144.04,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-27.png","element":"img","alt":" F ⊆ G","inline":true,"padRight":true},{"text":"and ln (","element":"span"},{"style":{"height":17.6},"width":431.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-28.png","element":"img","alt":"NT (G, 1/6, ∞)) ≤ p(n","inline":true},{"text":"). 
There exists no classical ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"))-time algorithm for properly online learning ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"with at most 6(","element":"span"},{"style":{"height":28.8},"width":577.1,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-29.png","element":"img","alt":"p(n) + ⌈ln(∆(n)/(∆(n)−1))⌉) many (1/","inline":true},{"text":"3)-mistakes in an online game with 18(","element":"span"},{"style":{"height":28.8},"width":539.69,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-30.png","element":"img","alt":"p(n) + ⌈ln(∆(n)/(∆(n)−1))⌉) rounds.","inline":true}],[{"text":"Theorem ","element":"span"},{"href":"#id-192","text":"45 ","element":"a"},{"text":"in particular says: If a hypothesis class of interest has polynomial sequential metric entropies and contains a class of PRFs secure against polynomial-time adversaries, then that class cannot be efficiently online learned with polynomially many ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(1)-mistakes.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"Suppose for contradiction that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"is an ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"))-time algorithm for properly online learning ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"with at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") = 6(","element":"span"},{"style":{"fontStyle":"italic"},"text":"p","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") +","element":"span"},{"style":{"height":28.8},"width":240.63,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-31.png","element":"img","alt":"⌈ln(∆(n)/(∆(n)−1))⌉","inline":true},{"text":") many (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"3)-mistakes in an online learning game with at least ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") = 3","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") many rounds. 
Note that ","element":"span"},{"style":{"height":17.6},"width":268.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-32.png","element":"img","alt":" T(n) ≤ O(t(n","inline":true},{"text":")), by our assumptions on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"p ","element":"span"},{"text":"and ∆. We then construct a procedure for distinguishing between a random element of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"and a truly random function with success probability ","element":"span"},{"style":{"height":17.6},"width":174.85,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/51-33.png","element":"img","alt":" ≥ 1/∆(n","inline":true},{"text":"). Namely, we define ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D","element":"span"},{"text":", when given query access to a function ","element":"span"},{"style":{"fontStyle":"italic"},"text":"f","element":"span"},{"text":", to act as follows: First, simulate an online learning game between the learner ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"and an adversary that uses an arbitrary sequence of pairwise distinct challenges ","element":"span"},{"style":{"height":17.6},"width":372.32,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-0.png","element":"img","alt":"x1, . . . , xT ∈ {0, 1}n ","inline":true,"padRight":true},{"text":"and the corresponding true values ","element":"span"},{"style":{"height":17.2},"width":291.79,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-1.png","element":"img","alt":" f(x1), . . . , f(xT","inline":true,"padRight":true},{"text":"). 
Second, if ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"made at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") many (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"3)-mistakes, ","element":"span"},{"style":{"height":15.6},"width":346.67,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-2.png","element":"img","alt":" D outputs “f ∈ F","inline":true},{"text":"”, otherwise ","element":"span"},{"style":{"height":15.6},"width":255.26,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-3.png","element":"img","alt":" D outputs “f","inline":true,"padRight":true},{"text":"truly random”.","element":"span"}],[{"text":"Let us analyze the success probability of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D","element":"span"},{"text":". On the one hand, if ","element":"span"},{"style":{"height":16},"width":200.32,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-4.png","element":"img","alt":" f ∈ F ⊆ G","inline":true},{"text":", then the simulated online learning game takes place in a realizable scenario. 
So ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"makes at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") many (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"3)-mistakes by assumption, and thus ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D ","element":"span"},{"text":"correctly outputs “","element":"span"},{"style":{"height":15.6},"width":116.44,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-5.png","element":"img","alt":"f ∈ F","inline":true},{"text":"”. On the other hand, suppose ","element":"span"},{"style":{"fontStyle":"italic"},"text":"f ","element":"span"},{"text":"is chosen as a truly random function from ","element":"span"},{"style":{"height":17.6},"width":127.84,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-6.png","element":"img","alt":" {0, 1}n","inline":true,"padRight":true},{"text":"to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"{","element":"span"},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":". 
As ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"is assumed to be proper, we know that for any 1 ","element":"span"},{"style":{"height":17.6},"width":246.25,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-7.png","element":"img","alt":" ≤ m ≤ T(n),","inline":true}],[{"style":{"width":"97%"},"width":1833,"height":117,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-8.png","element":"img"}],[{"text":"Next, notice that by the definition of sequential covering with ","element":"span"},{"style":{"height":11.2},"width":124.35,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-9.png","element":"img","alt":" p = ∞","inline":true},{"text":", if we let ","element":"span"},{"style":{"fontStyle":"italic"},"text":"V ","element":"span"},{"text":"= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"V ","element":"span"},{"text":"(","element":"span"},{"style":{"fontWeight":"bold"},"text":"x","element":"span"},{"text":") be a smallest sequential ","element":"span"},{"style":{"height":8.4},"width":44,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-10.png","element":"img","alt":" ∞","inline":true},{"text":"-norm (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"6)-cover for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":", where ","element":"span"},{"style":{"fontWeight":"bold"},"text":"x ","element":"span"},{"text":"is a complete rooted binary tree of depth ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") such that there is a path 
","element":"span"},{"style":{"height":17.2},"width":696.13,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-11.png","element":"img","alt":" π with xt(π) = xt for all 1 ≤ t ≤ T(n","inline":true},{"text":"), then for any ","element":"span"},{"style":{"height":19.15},"width":324.72,"height":47.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-12.png","element":"img","alt":" g1, . . . , gT(n) ∈ G,","inline":true,"padRight":true},{"text":"there exists a ","element":"span"},{"style":{"fontWeight":"bold"},"text":"v ","element":"span"},{"style":{"height":17.6},"width":978.3,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-13.png","element":"img","alt":" ∈ V such that |vt(π) − gt(xt)| = |vt(π) − gt(π)| ≤ 1/","inline":true},{"text":"6. So, by the triangle inequality and a union bound,","element":"span"}],[{"style":{"width":"77%"},"width":1446,"height":231,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-14.png","element":"img"}],[{"text":"As the ","element":"span"},{"style":{"height":10.8},"width":189.78,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-15.png","element":"img","alt":" x1, . . . , xT","inline":true,"padRight":true},{"text":"are pairwise distinct and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"f ","element":"span"},{"text":"is a random function, the values ","element":"span"},{"style":{"height":17.2},"width":383.57,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-16.png","element":"img","alt":" f(x1), . . . 
, f(xT ) are","inline":true,"padRight":true},{"text":"independent Bernoulli random variables, each with parameter 1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"2.","element":"span"}],[{"text":"Hence, for any fixed ","element":"span"},{"style":{"fontWeight":"bold"},"text":"v ","element":"span"},{"text":"and for any 0 ","element":"span"},{"style":{"height":24.12},"width":242,"height":60.31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-17.png","element":"img","alt":" ≤ m ≤ T(n)/2","inline":true,"padRight":true},{"text":", the probability that at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m ","element":"span"},{"text":"of the ","element":"span"},{"text":"predictions made by ","element":"span"},{"style":{"fontWeight":"bold"},"text":"v ","element":"span"},{"text":"are (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"6)-mistakes is","element":"span"}],[{"style":{"width":"97%"},"width":1823,"height":126,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-18.png","element":"img"}],[{"text":"where we have used a Chernoff-Hoeffding bound. 
Plugging this into our previous bound, we see that","element":"span"}],[{"style":{"width":"89%"},"width":1683,"height":470,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-19.png","element":"img"}],[{"text":"where we have used ","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") = 3","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") and our choice of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"). Thus, in the case of a truly random ","element":"span"},{"style":{"fontStyle":"italic"},"text":"f","element":"span"},{"text":", the distinguisher ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D ","element":"span"},{"text":"correctly outputs “","element":"span"},{"style":{"fontStyle":"italic"},"text":"f ","element":"span"},{"text":"truly random” with probability ","element":"span"},{"style":{"height":24.63},"width":128.44,"height":61.57,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/52-20.png","element":"img","alt":" ≥ 1/∆(n)","inline":true},{"text":". 
As ∆ is by ","element":"span"},{"text":"assumption polynomial, this means that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D ","element":"span"},{"text":"successfully distinguishes between pseudorandom and random with non-negligible success probability.","element":"span"}],[{"text":"Thus, to complete the proof by contradiction (to the pseudorandomness guarantee required in Definition ","element":"span"},{"href":"#id-193","text":"44","element":"a"},{"text":"), it remains to argue that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D ","element":"span"},{"text":"runs in time ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")). This can be seen as follows. On the one hand, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D ","element":"span"},{"text":"plays the online learning game. Here, the learning side takes time ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")) by assumption, and the adversary side takes time ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")), since oracle queries take unit time. 
On the other hand, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"D ","element":"span"},{"text":"checks how many mistakes the online learner makes, which takes time ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"T","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")). Thus, the overall time taken by ","element":"span"},{"style":{"height":17.6},"width":665.82,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-0.png","element":"img","alt":" D is O(t(n) + T(n)) ≤ O(t(n)). ■","inline":true}],[{"style":{"fontWeight":"bold"},"text":"Remark 46. ","element":"span"},{"text":"Notice that in the proof of Theorem ","element":"span"},{"href":"#id-192","text":"45","element":"a"},{"text":", the only property of the challenges ","element":"span"},{"style":{"height":10.8},"width":189.78,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-1.png","element":"img","alt":" x1, . . . , xT","inline":true,"padRight":true},{"text":"that mattered to the argument was that they are chosen to be pairwise distinct. Hence, this line of reasoning can be extended to learner-adversary interactions in which the learner, rather than the adversary, actively chooses pairwise distinct inputs to the unknown function.","element":"span"}],[{"text":"We now apply Theorem ","element":"span"},{"href":"#id-192","text":"45 ","element":"a"},{"text":"for our scenario of online learning bounded-complexity quantum channels. 
To obtain efficiently implementable PRFs, we make a common hardness assumption, namely the hardness of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"ring learning with errors ","element":"span"},{"text":"(","element":"span"},{"text":"RingLWE","element":"span"},{"text":") problem [","element":"span"},{"href":"#id-85","referenceIndex":85,"text":"85","element":"a"},{"text":"]. Note that despite our online learning problems revolving around function classes coming from quantum physics, the classes and learners under consideration are all classical. Therefore, we only assume classical hardness of ","element":"span"},{"text":"RingLWE","element":"span"},{"text":".","element":"span"}],[{"id":"id-196","style":{"fontWeight":"bold"},"text":"Corollary 47 ","element":"span"},{"text":"(Computational hardness)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Take the security parameter to be ","element":"span"},{"style":{"height":17.9},"width":440.48,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-2.png","element":"img","alt":" λ = n. Let CPTPn,G be","inline":true,"padRight":true},{"text":"the class of all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channels of gate complexity at most ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G","element":"span"},{"text":". 
If there is no classical polynomial-time algorithm for solving ","element":"span"},{"text":"RingLWE","element":"span"},{"text":", then already for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"polylog(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")) there exists no (classical) polynomial-time algorithm for properly online learning ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-3.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"with at most polynomially many (1","element":"span"},{"style":{"fontStyle":"italic"},"text":"/","element":"span"},{"text":"3)-mistakes, even under the promise that all challenges consist of input states and output effect operators given by rank-1 projections on computational basis elements (without an auxiliary system).","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"First, recall from Corollary ","element":"span"},{"href":"#id-82","text":"19 ","element":"a"},{"text":"that the sequential metric entropies of ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-4.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"satisfy the bound ln ","element":"span"},{"style":{"height":18.7},"width":700.84,"height":46.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-5.png","element":"img","alt":" NT (CPTPn,G, 1/6, ∞) ≤ O(G log(Gn","inline":true},{"text":")), which scales at most polynomially in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"if ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"does. Thus, to apply Theorem ","element":"span"},{"href":"#id-192","text":"45","element":"a"},{"text":", it remains to argue that ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-6.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"contains a suitable PRF family. We combine results from prior work to obtain such a PRF from the assumed hardness of ","element":"span"},{"text":"RingLWE","element":"span"},{"text":". Namely, Ref. [","element":"span"},{"href":"#id-194","referenceIndex":133,"text":"133","element":"a"},{"text":", Theorem 5.3] shows that polynomial-time hardness of (decision-)","element":"span"},{"text":"RingLWE ","element":"span"},{"text":"with suitable parameters (see Ref. 
[","element":"span"},{"href":"#id-85","referenceIndex":85,"text":"85","element":"a"},{"text":", ","element":"span"},{"href":"#id-194","referenceIndex":133,"text":"133","element":"a"},{"text":"] for more context and a formal discussion) gives rise to a PRF family ","element":"span"},{"style":{"height":17.6},"width":289.1,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-7.png","element":"img","alt":" RF = {fλ}λ∈N","inline":true,"padRight":true},{"text":"on ","element":"span"},{"style":{"height":7.6},"width":129.63,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-8.png","element":"img","alt":" m = ω","inline":true},{"text":"(log(","element":"span"},{"style":{"height":12},"width":26,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-9.png","element":"img","alt":"λ","inline":true},{"text":"))-bit inputs that is secure against polynomial-time classical adversaries. ","element":"span"},{"text":"Moreover, as shown in Ref. [","element":"span"},{"href":"#id-195","referenceIndex":134,"text":"134","element":"a"},{"text":", Lemma 3.16], every ","element":"span"},{"style":{"height":15.6},"width":188.75,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-10.png","element":"img","alt":" fλ ∈ RF","inline":true,"padRight":true},{"text":"can be computed by a ","element":"span"},{"style":{"height":15.47},"width":74.58,"height":38.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-11.png","element":"img","alt":" TC0 ","inline":true,"padRight":true},{"text":"circuit, that is, by a constant-depth, polynomial-size circuit consisting of AND, OR, NOT, and MAJORITY gates with unbounded fan-in. As shown in Ref. 
[","element":"span"},{"href":"#id-14","referenceIndex":32,"text":"32","element":"a"},{"text":", Proposition 2], if ","element":"span"},{"style":{"height":17.6},"width":386.19,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-12.png","element":"img","alt":" f : {0, 1}m → {0, 1}","inline":true,"padRight":true},{"text":"can be implemented by a ","element":"span"},{"style":{"height":15.47},"width":74.58,"height":38.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-13.png","element":"img","alt":" TC0","inline":true,"padRight":true},{"text":"circuit, then there is a quantum circuit on ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n ","element":"span"},{"text":"= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(poly(","element":"span"},{"style":{"fontStyle":"italic"},"text":"m","element":"span"},{"text":")) qubits with size ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"polylog(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")) that implements the unitary ","element":"span"},{"style":{"height":17.64},"width":48.79,"height":44.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-14.png","element":"img","alt":" Uf","inline":true,"padRight":true},{"text":"acting as ","element":"span"},{"style":{"height":18.84},"width":436.34,"height":47.11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-15.png","element":"img","alt":"Uf|x⟩|b⟩ = |x⟩|b⊕f(x)⟩","inline":true,"padRight":true},{"text":"(ignoring auxiliary qubits). 
Consequently, choosing the security parameter as ","element":"span"},{"style":{"height":12},"width":108.96,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-16.png","element":"img","alt":"λ = n","inline":true},{"text":", every function in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"RF ","element":"span"},{"text":"can be implemented by a unitary quantum circuit of size ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"polylog(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")), and thus ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"polylog(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")) suffices to guarantee the inclusion ","element":"span"},{"style":{"height":17.9},"width":301.16,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-17.png","element":"img","alt":" RF ⊆ CPTPn,G","inline":true},{"text":". 
(Note: Here, we slightly abused notation by not explicitly restricting ","element":"span"},{"style":{"height":17.9},"width":170.3,"height":44.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/53-18.png","element":"img","alt":" CPTPn,G","inline":true,"padRight":true},{"text":"to the input space of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"RF","element":"span"},{"text":", which can be embedded into the Boolean hypercube.)","element":"span"}],[{"text":"Hence, for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"polylog(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")), we can apply Theorem ","element":"span"},{"href":"#id-192","text":"45 ","element":"a"},{"text":"with ","element":"span"},{"style":{"height":17.9},"width":492.24,"height":44.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-0.png","element":"img","alt":" F = RF, G = CPTPn,G","inline":true,"padRight":true},{"text":"(again not writing out the restriction of the input space), ∆(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") = 3, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"p","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") = ","element":"span"},{"style":{"fontStyle":"italic"},"text":"O","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"G ","element":"span"},{"text":"log(","element":"span"},{"style":{"fontStyle":"italic"},"text":"Gn","element":"span"},{"text":")), and 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":") = poly(","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"). This yields the claimed result. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-1.png","element":"img","alt":"■","inline":true}],[{"text":"As can be seen in the proof of Corollary ","element":"span"},{"href":"#id-196","text":"47","element":"a"},{"text":", the hardness assumption on ","element":"span"},{"text":"RingLWE ","element":"span"},{"text":"is “only” used to obtain a PRF class implementable by relatively small circuits. Therefore, one may replace this widely believed assumption about the hardness of a concrete problem [","element":"span"},{"href":"#id-41","referenceIndex":52,"text":"52","element":"a"},{"text":"–","element":"span"},{"href":"#id-42","referenceIndex":54,"text":"54","element":"a"},{"text":"] by a more abstract cryptographic assumption on the existence of PRFs implementable by small circuits, and the above line of reasoning can still be applied.","element":"span"}]]},{"heading":"6 Shadow tomography of quantum processes","paragraphs":[[{"text":"For quantum states, shadow tomography [","element":"span"},{"href":"#id-7","referenceIndex":16,"text":"16","element":"a"},{"text":", ","element":"span"},{"href":"#id-26","referenceIndex":17,"text":"17","element":"a"},{"text":", ","element":"span"},{"href":"#id-8","referenceIndex":19,"text":"19","element":"a"},{"text":"] is the task of using few copies of an unknown state to predict the expectation values of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"M ","element":"span"},{"text":"effect operators, which may be chosen adaptively/adversarially. 
In this section, we consider the analogous problem for quantum processes, starting with quantum channels and then going to multi-time processes.","element":"span"}],[{"id":"id-43","style":{"fontWeight":"bold"},"text":"Problem 3 ","element":"span"},{"text":"(Shadow tomography of quantum channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":16.7},"width":313.06,"height":41.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-2.png","element":"img","alt":" NA→B ∈ CPTPn","inline":true,"padRight":true},{"text":"be an (unknown) ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit channel, and let ","element":"span"},{"style":{"height":16},"width":113.98,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-3.png","element":"img","alt":" ε, δ >","inline":true,"padRight":true},{"text":"0. When sequentially presented with any adversarially chosen sequence of two-outcome test operators, ","element":"span"},{"style":{"height":26.85},"width":632.37,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-4.png","element":"img","alt":" E(1)A,B, E(2)A,B, . . . 
, E(M)A,B, for M ∈ N","inline":true},{"text":", return quantities ","element":"span"},{"style":{"height":14.62},"width":117.58,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-5.png","element":"img","alt":" bi ∈ R","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":26.85},"width":449.92,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-6.png","element":"img","alt":" |bi − Tr[E(i)A,BCNA,B]| ≤ ε","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":319.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-7.png","element":"img","alt":" i ∈ {1, 2, . . . , M}","inline":true,"padRight":true},{"text":"with probability at least 1 ","element":"span"},{"style":{"height":12.8},"width":63.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-8.png","element":"img","alt":" − δ","inline":true},{"text":". Do this ","element":"span"},{"text":"by querying the channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"times (adaptively or in parallel), with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"being as small as possible. 
","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-9.png","element":"img","alt":" ◀","inline":true}],[{"text":"Embedding classical functions into quantum channels similarly to Section ","element":"span"},{"href":"#id-29","text":"4.1","element":"a"},{"text":", one can see that in the case of general quantum channels, no non-trivial shadow tomography strategy—achieving a query complexity that is simultaneously sublinear in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"M ","element":"span"},{"text":"and polynomial in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"—is possible. Therefore, we again have to consider restricted classes of channels. We primarily focus on Pauli channels and Pauli multi-time processes, for which we introduce a shadow tomography scheme via classical adaptive data analysis [","element":"span"},{"href":"#id-44","referenceIndex":55,"text":"55","element":"a"},{"text":", ","element":"span"},{"href":"#id-45","referenceIndex":56,"text":"56","element":"a"},{"text":"] that requires few measurements (of the Choi state) of the unknown process.","element":"span"}],[{"id":"id-199","style":{"fontWeight":"bold"},"text":"Theorem 48 ","element":"span"},{"text":"(Shadow tomography of Pauli channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"There exists an explicit strategy that solves Problem ","element":"span"},{"href":"#id-43","text":"3 ","element":"a"},{"text":"for any ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channel using","element":"span"}],[{"id":"id-198","style":{"width":"67%"},"width":1273,"height":120,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-10.png","element":"img"}],[{"text":"copies of the channel. 
The strategy runs in time poly(4","element":"span"},{"style":{"height":15.2},"width":64.96,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-11.png","element":"img","alt":"n, k","inline":true},{"text":") per query.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B ","element":"span"},{"text":"be ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit systems, and consider a Pauli channel ","element":"span"},{"style":{"height":14.7},"width":115.59,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-12.png","element":"img","alt":" PA→B","inline":true,"padRight":true},{"text":"as in (","element":"span"},{"href":"#id-197","text":"2.11","element":"a"},{"text":"), with error-rate vector ","element":"span"},{"style":{"height":19.95},"width":358.86,"height":49.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-13.png","element":"img","alt":" p = (pz,x)z,x∈{0,1}n","inline":true},{"text":". Consider also test operators ","element":"span"},{"style":{"height":26.85},"width":603.65,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-14.png","element":"img","alt":" E(i)A,B, for i ∈ {1, 2, . . . , M}. For","inline":true,"padRight":true},{"text":"every such operator, we have","element":"span"}],[{"style":{"width":"80%"},"width":1512,"height":107,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/54-15.png","element":"img"}],[{"text":"where we have defined ","element":"span"},{"style":{"height":26.85},"width":380.64,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-0.png","element":"img","alt":" e(i)z,x := Tr[E(i)A,BΓz,xA,B","inline":true},{"text":"]. 
From this, we can see that every desired expectation ","element":"span"},{"text":"value is exactly the expectation value of the function (","element":"span"},{"style":{"height":26.85},"width":536.95,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-1.png","element":"img","alt":"z, x) ↦ e(i)z,x = Tr[E(i)A,BΓz,xA,B","inline":true},{"text":"] with respect ","element":"span"},{"text":"to the error-rate probability distribution ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"p ","element":"span"},{"text":"of the unknown Pauli channel ","element":"span"},{"style":{"height":14.7},"width":115.6,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-2.png","element":"img","alt":" PA→B","inline":true},{"text":". Now, we can obtain samples from the error-rate distribution by performing Bell measurements on the Choi state. ","element":"span"},{"text":"Specifically, to obtain one sample, we prepare a (2","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")-qubit maximally-entangled state Φ","element":"span"},{"href":"#id-113","style":{"height":18.93},"width":548.28,"height":47.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-3.png","element":"img","alt":"A,A′ = |Φ⟩⟨Φ|A,A′ (recall (2.4)","inline":true},{"text":"), send the ","element":"span"},{"style":{"height":12.8},"width":41.73,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-4.png","element":"img","alt":" A′ ","inline":true,"padRight":true},{"text":"system through the channel, and then measure systems ","element":"span"},{"style":{"fontStyle":"italic"},"text":"A ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B ","element":"span"},{"text":"with respect to the Bell basis POVM 
","element":"span"},{"style":{"height":19.95},"width":293.76,"height":49.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-5.png","element":"img","alt":" {Φz,x}z,x∈{0,1}n","inline":true},{"text":". Note that this indeed amounts to measuring the Choi state of the channel with respect to the Bell basis POVM. Then, using the definition of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Bell states in (","element":"span"},{"href":"#id-113","text":"2.4","element":"a"},{"text":"), the probability of obtaining an outcome (","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"z","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"x","element":"span"},{"text":") is given by","element":"span"}],[{"style":{"width":"81%"},"width":1532,"height":252,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-6.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":268.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-7.png","element":"img","alt":" z, x ∈ {0, 1}n.","inline":true}],[{"text":"We can now combine (","element":"span"},{"href":"#id-198","text":"6.2","element":"a"},{"text":") with the ability to sample from the error-rate distribution obtained by Bell measurements to make use of known results in ","element":"span"},{"style":{"fontStyle":"italic"},"text":"classical ","element":"span"},{"text":"adaptive data analysis [","element":"span"},{"href":"#id-44","referenceIndex":55,"text":"55","element":"a"},{"text":", ","element":"span"},{"href":"#id-45","referenceIndex":56,"text":"56","element":"a"},{"text":"]. 
In classical data analysis, the goal is to answer a sequence of adaptively chosen queries ","element":"span"},{"style":{"height":10.8},"width":247.66,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-8.png","element":"img","alt":" q1, q2, . . . , qM","inline":true,"padRight":true},{"text":"with answers ","element":"span"},{"style":{"height":15.2},"width":245.4,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-9.png","element":"img","alt":" b1, b2, . . . , bM","inline":true},{"text":", such that ","element":"span"},{"style":{"height":17.6},"width":894.64,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-10.png","element":"img","alt":" |bi − qi(p)| ≤ ε for all i ∈ {1, 2, . . . , M}, given k","inline":true,"padRight":true},{"text":"samples from the underlying (unknown) probability distribution ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"p","element":"span"},{"text":". This setting precisely matches our setting of Pauli channel shadow tomography, by recognizing that the underlying distribution ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"p ","element":"span"},{"text":"can be taken to be the error-rate distribution of the unknown Pauli channel, and the queries ","element":"span"},{"style":{"height":10.8},"width":31.48,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-11.png","element":"img","alt":" qi","inline":true,"padRight":true},{"text":"can be taken to be ","element":"span"},{"style":{"height":25.55},"width":747.29,"height":63.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-12.png","element":"img","alt":" qi(p) ≡ E(z,x)∼p[e(i)z,x], i ∈ {1, 2, . . . , M}","inline":true},{"text":". 
Then, in the regime ","element":"span"},{"style":{"height":13.2},"width":137.98,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-13.png","element":"img","alt":" M ≫ k","inline":true},{"text":", we make direct use of ","element":"span"},{"text":"Ref. [","element":"span"},{"href":"#id-45","referenceIndex":56,"text":"56","element":"a"},{"text":", Corollary 6.3] to obtain our desired result. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-14.png","element":"img","alt":"■","inline":true}],[{"text":"We highlight that Theorem ","element":"span"},{"href":"#id-199","text":"48 ","element":"a"},{"text":"for restricted/approximate shadow tomography of Pauli channels can be used to perform shadow tomography of arbitrary channels, essentially by applying Theorem ","element":"span"},{"href":"#id-199","text":"48 ","element":"a"},{"text":"to the Pauli twirled version of the channel. The upshot is that our bounds scale with the diamond norm distance between the unknown channel and its corresponding Pauli twirled version.","element":"span"}],[{"id":"id-47","style":{"fontWeight":"bold"},"text":"Corollary 49 ","element":"span"},{"text":"(Shadow tomography of arbitrary quantum channels)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":16.62},"width":246.81,"height":41.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-15.png","element":"img","alt":" N ∈ CPTPn","inline":true,"padRight":true},{"text":"be an arbitrary quantum channel. 
There exists an explicit strategy that solves Problem ","element":"span"},{"href":"#id-43","text":"3 ","element":"a"},{"text":"for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"using","element":"span"}],[{"style":{"width":"76%"},"width":1442,"height":123,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-16.png","element":"img"}],[{"text":"copies of ","element":"span"},{"style":{"height":18.33},"width":259.32,"height":45.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-17.png","element":"img","alt":" N, where N^P ","inline":true,"padRight":true},{"text":"is the Pauli twirled version of ","element":"span"},{"style":{"height":21.29},"width":482.65,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-18.png","element":"img","alt":" N and ε > (1/2)∥N − N^P∥⋄.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"We perform a Bell measurement on the Choi state of the unknown channel. The measurement probabilities define the error-rate vector of the Pauli-twirled version of the channel, based on the developments in Appendix ","element":"span"},{"text":"C","element":"span"},{"text":". 
Then, we make use of the triangle inequality as follows, for an arbitrary ","element":"span"},{"style":{"height":12.8},"width":104.06,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-19.png","element":"img","alt":"b ∈ R","inline":true,"padRight":true},{"text":"and arbitrary channel test operator, to get","element":"span"}],[{"style":{"width":"90%"},"width":1692,"height":75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/55-20.png","element":"img"}],[{"style":{"width":"60%"},"width":1141,"height":202,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-0.png","element":"img"}],[{"text":"where in the final equality we made use of the fact that (see, e.g., [","element":"span"},{"href":"#id-200","referenceIndex":135,"text":"135","element":"a"},{"text":", Section 4])","element":"span"}],[{"style":{"width":"91%"},"width":1710,"height":287,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-1.png","element":"img"}],[{"text":"for arbitrary channels ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"M","element":"span"},{"text":". 
Finally, as we assume ","element":"span"},{"style":{"height":21.29},"width":371.08,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-2.png","element":"img","alt":" ε − (1/2)∥N − N^P∥⋄ >","inline":true,"padRight":true},{"text":"0, we can run the Pauli ","element":"span"},{"text":"channel shadow tomography from Theorem ","element":"span"},{"href":"#id-199","text":"48 ","element":"a"},{"text":"with accuracy parameter ˜","element":"span"},{"style":{"height":21.29},"width":517.19,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-3.png","element":"img","alt":"ε = ε − (1/2)∥N − N^P∥⋄ to get","inline":true,"padRight":true},{"text":"the desired approximation guarantee","element":"span"},{"style":{"height":29.53},"width":438.43,"height":73.82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-4.png","element":"img","alt":"|b − Tr[EA,BCNA,B]| ≤ ε","inline":true,"padRight":true},{"text":"with the claimed sample complexity bounds. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-5.png","element":"img","alt":"■","inline":true}],[{"id":"id-106","style":{"fontWeight":"bold"},"text":"6.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Shadow tomography of multi-time processes","element":"span"}],[{"text":"By analogy with the shadow tomography problem for quantum channels, we can formulate the problem of shadow tomography for multi-time quantum processes.","element":"span"}],[{"id":"id-61","style":{"fontWeight":"bold"},"text":"Problem 4 ","element":"span"},{"text":"(Shadow tomography of multi-time quantum processes)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.6},"width":282.86,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-6.png","element":"img","alt":" r ∈ {1, 2, . . . 
}","inline":true},{"text":", let ","element":"span"},{"style":{"height":15.02},"width":235.42,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-7.png","element":"img","alt":"N ∈ COMBr","inline":true,"padRight":true},{"text":"be a comb operator corresponding to a multi-time quantum process with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"time steps, and let ","element":"span"},{"style":{"height":16},"width":106.87,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-8.png","element":"img","alt":" ε, δ >","inline":true,"padRight":true},{"text":"0. When sequentially presented with any adversarially chosen sequence of two-outcome multi-time test operators ","element":"span"},{"style":{"height":19.14},"width":373.19,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-9.png","element":"img","alt":" E(1), E(2), . . . , E(M)","inline":true},{"text":", for ","element":"span"},{"style":{"height":17.6},"width":291.75,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-10.png","element":"img","alt":" M ∈ {1, 2, . . . }","inline":true},{"text":", return quantities ","element":"span"},{"style":{"height":14.62},"width":123.61,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-11.png","element":"img","alt":" bi ∈ R","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":17.6},"width":790.96,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-12.png","element":"img","alt":" |bi − Tr[EN]| ≤ ε for all i ∈ {1, 2, . . . , M}","inline":true,"padRight":true},{"text":"with probability at least 1 ","element":"span"},{"style":{"height":12.8},"width":63.65,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-13.png","element":"img","alt":" − δ","inline":true},{"text":". 
Do this by querying the process ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"times (adaptively or in parallel), with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"k ","element":"span"},{"text":"being as small as possible. ","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-14.png","element":"img","alt":" ◀","inline":true}],[{"text":"By following arguments similar to those in the proof of Corollary ","element":"span"},{"href":"#id-47","text":"49","element":"a"},{"text":", we can prove a shadow tomography result for arbitrary multi-time quantum processes as follows. In particular, this involves introducing the idea of Pauli-twirling a multi-time process, which we illustrate in Figure ","element":"span"},{"href":"#id-201","text":"4","element":"a"},{"text":"(a).","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Corollary 50. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":17.6},"width":259.4,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-15.png","element":"img","alt":" r ∈ {1, 2, . . . }","inline":true},{"text":", let ","element":"span"},{"style":{"height":15.02},"width":235.42,"height":37.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-16.png","element":"img","alt":" N ∈ COMBr","inline":true,"padRight":true},{"text":"be a comb operator corresponding to a multi-time quantum process with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"time steps. 
There exists an explicit strategy that solves Problem ","element":"span"},{"href":"#id-61","text":"4 ","element":"a"},{"text":"for","element":"span"}],[{"style":{"width":"99%"},"width":1872,"height":164,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-17.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":15.13},"width":61.82,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-18.png","element":"img","alt":" N^P ","inline":true,"padRight":true},{"text":"is the comb operator corresponding to the Pauli-twirled version of the multi-time process (see Figure ","element":"span"},{"href":"#id-201","text":"4","element":"a"},{"text":"(a) for a depiction), and ","element":"span"},{"style":{"height":21.29},"width":352.42,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/56-19.png","element":"img","alt":" ε > (1/2)∥N − N^P∥⋄r.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"We proceed by performing “time-local” Bell measurements on the multi-time process; see Figure ","element":"span"},{"href":"#id-201","text":"4","element":"a"},{"text":"(b). 
This means that, for every time step, we prepare a Bell state, send one-half of it","element":"span"}],[{"style":{"height":15.2},"width":1296.58,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-0.png","element":"img","alt":"N 1 N 2 N 3","inline":true},{"id":"id-201","style":{"height":21.09},"width":1752.05,"height":52.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-1.png","element":"img","alt":"Pw1 Pw1† Pw2 Pw2† Pw3 Pw3†A1 A1 B1 B1 A2 A2 B2B2 A3 A3 B3B3","inline":true}],[{"style":{"width":"88%"},"width":1658,"height":578,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-2.png","element":"img"}],[{"text":"Figure 4: ","element":"figcaption","subtype":"caption"},{"style":{"fontWeight":"bold"},"text":"Twirling of multi-time quantum processes. ","element":"figcaption","subtype":"caption"},{"text":"(a) A “time-local” Pauli twirl of a multi-time quantum process with ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"r ","element":"figcaption","subtype":"caption"},{"text":"time steps consists of independently applying a random Pauli channel ","element":"figcaption","subtype":"caption"},{"style":{"height":17.39},"width":1098.9,"height":43.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-3.png","element":"img","alt":" Pwk(·) := P wk(·)P wk†, where wk ≡ (zk, xk) ∈ {0, 1}n × {0, 1}n","inline":true},{"text":", to the input and output of every time step ","element":"figcaption","subtype":"caption"},{"style":{"height":16},"width":275.04,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-4.png","element":"img","alt":" k ∈ {1, 2, . . . , r}","inline":true},{"text":". (b) After twirling, the process is characterized by an error-rate probability vector, in the same way as Pauli channels. This error-rate vector can be obtained via time-local Bell measurements, as shown. 
The outcomes of the measurements are ","element":"figcaption","subtype":"caption"},{"style":{"height":16},"width":835.9,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-5.png","element":"img","alt":"w1 ≡ (z1, x1), w2 ≡ (z2, x2), . . . , wr ≡ (zr, xr).","inline":true}],[{"text":"through the process, and then measure the output system and the other half of the Bell state in the (2","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":")-qubit Bell basis. Doing this once for each time step leads to measurement outcomes ","element":"span"},{"style":{"height":11.02},"width":102.56,"height":27.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-6.png","element":"img","alt":" w1 ≡","inline":true,"padRight":true},{"text":"(","element":"span"},{"style":{"height":17.6},"width":748.03,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-7.png","element":"img","alt":"z1, x1), w2 ≡ (z2, x2), . . . , wr ≡ (zr, xr","inline":true},{"text":"). The probability of any such collection of measurement outcomes is given by","element":"span"}],[{"style":{"width":"94%"},"width":1768,"height":90,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-8.png","element":"img"}],[{"text":"which is due to the fact that applying one-half of a maximally-entangled state to every input of the process defines the Choi state of the process, which is equal to ","element":"span"},{"style":{"height":21.29},"width":95.75,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-9.png","element":"img","alt":"(1/2^{nr}) N","inline":true},{"text":"; see Appendix ","element":"span"},{"text":"B","element":"span"},{"text":".","element":"span"}],[{"text":"Let us now consider the Pauli twirl of the process, as depicted in Figure ","element":"span"},{"href":"#id-201","text":"4","element":"a"},{"text":"(a). 
By combining Lemma ","element":"span"},{"href":"#id-202","text":"58 ","element":"a"},{"text":"and Proposition ","element":"span"},{"href":"#id-203","text":"59","element":"a"},{"text":", we find that the comb operator ","element":"span"},{"style":{"height":15.13},"width":61.82,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-10.png","element":"img","alt":" NP ","inline":true,"padRight":true},{"text":"for the Pauli-twirled process is equal to","element":"span"}],[{"style":{"width":"97%"},"width":1818,"height":376,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/57-11.png","element":"img"}],[{"text":"We now observe that the comb operator for the Pauli-twirled process can be thought of as simply the Choi representation of an (","element":"span"},{"style":{"fontStyle":"italic"},"text":"nr","element":"span"},{"text":")-qubit Pauli channel. As such, we can apply Theorem ","element":"span"},{"href":"#id-199","text":"48","element":"a"},{"text":". 
Furthermore,","element":"span"}],[{"text":"for an arbitrary ","element":"span"},{"style":{"height":12.8},"width":104.06,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-0.png","element":"img","alt":" b ∈ R","inline":true,"padRight":true},{"text":"and an arbitrary test operator ","element":"span"},{"style":{"fontStyle":"italic"},"text":"E","element":"span"},{"text":", we have","element":"span"}],[{"style":{"width":"78%"},"width":1474,"height":280,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-1.png","element":"img"}],[{"text":"where for the final equality we made use of the fact that","element":"span"}],[{"style":{"width":"85%"},"width":1598,"height":216,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-2.png","element":"img"}],[{"text":"for arbitrary ","element":"span"},{"style":{"height":15.6},"width":299.6,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-3.png","element":"img","alt":" N, M ∈ COMBr","inline":true},{"text":". To conclude, we again invoke the Pauli channel shadow tomography of Theorem ","element":"span"},{"href":"#id-199","text":"48 ","element":"a"},{"style":{"height":8},"width":21,"height":20,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-4.png","element":"img","alt":" ε","inline":true,"padRight":true},{"text":"with accuracy ","element":"span"},{"style":{"height":21.29},"width":385.21,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-5.png","element":"img","alt":" ε − (1/2)∥N − N^P∥⋄r >","inline":true,"padRight":true},{"text":"0 to obtain the desired approximation error ","element":"span"},{"style":{"height":17.6},"width":319.46,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-6.png","element":"img","alt":"|b − Tr[EN]| ≤ ε","inline":true,"padRight":true},{"text":"with the claimed sample complexity bound. 
","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/58-7.png","element":"img","alt":"■","inline":true}]]},{"heading":"Bibliography","paragraphs":[[{"id":"id-0","text":"[1] ","element":"span"},{"text":"J. Eisert, D. Hangleiter, N. Walk, I. Roth, D. Markham, R. Parekh, U. Chabaud, and E. Kashefi. Quantum certification and benchmarking. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Reviews Physics","element":"span"},{"text":", 2:382–390, 2020. doi: 10.1038/s42254-020-0186-4.","element":"span"}],[{"text":"[2] Z. Hradil. ","element":"span"},{"text":"Quantum-state estimation. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 55:R1561–R1564, 1997. ","element":"span"},{"text":"doi: 10.1103/PhysRevA.55.R1561.","element":"span"}],[{"text":"[3] G. M. D’Ariano, M. G. A. Paris, and M. F. Sacchi. Quantum tomography. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Advances in Imaging and Electron Physics","element":"span"},{"text":", 128:205–308, 2003. doi: 10.1016/S1076-5670(03)80065-4.","element":"span"}],[{"text":"[4] D. Gross, Y.-K. Liu, S. T. Flammia, S. Becker, and J. Eisert. Quantum state tomography via compressed sensing. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 105:150401, 2010. doi: 10.1103/PhysRevLett.105.150401.","element":"span"}],[{"text":"[5] R. Blume-Kohout. Optimal, reliable estimation of quantum states. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"New Journal of Physics","element":"span"},{"text":", 12(4):043034, 2010. doi: 10.1088/1367-2630/12/4/043034.","element":"span"}],[{"id":"id-1","text":"[6] ","element":"span"},{"text":"K. Banaszek, M. Cramer, and D. Gross. Focus on quantum tomography. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"New Journal of Physics","element":"span"},{"text":", 15(12):125020, 2013. doi: 10.1088/1367-2630/15/12/125020.","element":"span"}],[{"id":"id-2","text":"[7] ","element":"span"},{"text":"I. L. Chuang and M. A. Nielsen. Prescription for experimental determination of the dynamics of a quantum black box. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of Modern Optics","element":"span"},{"text":", 44(11–12):2455–2467, 1997. doi: 10.1080/09500349708231894.","element":"span"}],[{"text":"[8] G. M. D’Ariano and P. Lo Presti. Quantum tomography for measuring experimentally the matrix elements of arbitrary quantum operation. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 86:4195–4198, 2001. doi: 10.1103/PhysRevLett.86.4195.","element":"span"}],[{"id":"id-3","text":"[9] ","element":"span"},{"text":"M. Mohseni, A. T. Rezakhani, and D. A. Lidar. Quantum-process tomography: Resource analysis of different strategies. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 77:032322, 2008. doi: 10.1103/PhysRevA.77.032322.","element":"span"}],[{"id":"id-4","text":"[10] ","element":"span"},{"text":"J. Haah, A. W. Harrow, Z. Ji, X. Wu, and N. Yu. Sample-optimal tomography of quantum states. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"IEEE Transactions on Information Theory","element":"span"},{"text":", 63(9):5628–5641, 2017. doi: 10.1109/TIT.2017.2719044.","element":"span"}],[{"text":"[11] R. O’Donnell and J. Wright. Efficient quantum tomography. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the forty-eighth annual ACM symposium on Theory of Computing","element":"span"},{"text":", pages 899–912, 2016. doi: 10.1145/2897518.2897544.","element":"span"}],[{"text":"[12] S. Chen, J. Li, B. Huang, and A. Liu. 
Tight bounds for quantum state certification with incoherent measurements. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS)","element":"span"},{"text":", pages 1205–1213. IEEE, 2022. doi: 10.1109/FOCS54457.2022.00118.","element":"span"}],[{"id":"id-93","text":"[13] ","element":"span"},{"text":"J. Haah, R. Kothari, R. O’Donnell, and E. Tang. Query-optimal estimation of unitary channels in diamond distance. 2023. arXiv:2302.14066.","element":"span"}],[{"id":"id-5","text":"[14] ","element":"span"},{"text":"A. Oufkir. ","element":"span"},{"text":"Sample-optimal quantum process tomography with non-adaptive incoherent measurements. 2023. arXiv:2301.12925.","element":"span"}],[{"id":"id-6","text":"[15] ","element":"span"},{"text":"S. Aaronson. ","element":"span"},{"text":"The learnability of quantum states. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences","element":"span"},{"text":", 463(2088):3089–3114, 2007. doi: 10.1098/rspa.2007.0113.","element":"span"}],[{"id":"id-7","text":"[16] ","element":"span"},{"text":"S. Aaronson. Shadow tomography of quantum states. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing","element":"span"},{"text":", STOC 2018, pages 325–338, New York, NY, USA, 2018. Association for Computing Machinery. doi: 10.1145/3188745.3188802.","element":"span"}],[{"id":"id-26","text":"[17] ","element":"span"},{"text":"C. Bădescu and R. O’Donnell. Improved quantum data analysis. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"TheoretiCS","element":"span"},{"text":", Volume 3, 2024. doi: 10.46298/theoretics.24.7.","element":"span"}],[{"text":"[18] H.-Y. Huang, R. Kueng, and J. Preskill. Information-theoretic bounds on quantum advantage in machine learning. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 126:190505, 2021. doi: 10.1103/PhysRevLett.126.190505.","element":"span"}],[{"id":"id-8","text":"[19] ","element":"span"},{"text":"R. King, D. Gosset, R. Kothari, and R. Babbush. Triply efficient shadow tomography. 2024. arXiv:2404.19211.","element":"span"}],[{"id":"id-9","text":"[20] ","element":"span"},{"text":"S. Aaronson, X. Chen, E. Hazan, S. Kale, and A. Nayak. ","element":"span"},{"text":"Online learning of quantum states. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of Statistical Mechanics: Theory and Experiment","element":"span"},{"text":", 2019(12):124019, 2019. doi: 10.1088/1742-5468/ab3988.","element":"span"}],[{"id":"id-10","text":"[21] ","element":"span"},{"text":"X. Chen, E. Hazan, T. Li, Z. Lu, X. Wang, and R. Yang. Adaptive online learning of quantum states. 2022. arXiv:2206.00220.","element":"span"}],[{"id":"id-11","text":"[22] ","element":"span"},{"text":"H.-Y. Huang, R. Kueng, and J. Preskill. Predicting many properties of a quantum system from very few measurements. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Physics","element":"span"},{"text":", 16:1050–1057, 2020.","element":"span"}],[{"text":"[23] A. Elben, S. T. Flammia, H.-Y. Huang, R. Kueng, J. Preskill, B. Vermersch, and P. Zoller. The randomized measurement toolbox. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Reviews Physics","element":"span"},{"text":", 5:9–24, 2023.","element":"span"}],[{"id":"id-12","text":"[24] ","element":"span"},{"text":"M. Ohliger, V. Nesme, and J. Eisert. Efficient and feasible state tomography of quantum many-body systems. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"New Journal of Physics","element":"span"},{"text":", 15(1):015024, 2013. doi: 10.1088/1367-2630/15/1/015024.","element":"span"}],[{"id":"id-13","text":"[25] ","element":"span"},{"text":"H.-Y. 
Huang, M. Broughton, J. Cotler, S. Chen, J. Li, Masoud Mohseni, Hartmut Neven, Ryan Babbush, R. Kueng, John Preskill, and Jarrod R. McClean. Quantum advantage in learning from experiments. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Science","element":"span"},{"text":", 376(6598):1182–1186, 2022. doi: 10.1126/science.abn7293.","element":"span"}],[{"text":"[26] R. Levy, D. Luo, and R. K. Clark. Classical shadows for quantum process tomography on near-term quantum computers. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Research","element":"span"},{"text":", 6(1):013029, 2024. doi: 10.1103/PhysRevResearch.6.013029.","element":"span"}],[{"text":"[27] J. Kunjummen, M. C. Tran, D. Carney, and J. M. Taylor. Shadow process tomography of quantum channels. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 107:042403, 2023. doi: 10.1103/PhysRevA.107.042403.","element":"span"}],[{"id":"id-92","text":"[28] ","element":"span"},{"text":"Hsin-Yuan Huang, Sitan Chen, and John Preskill. Learning to predict arbitrary quantum processes. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"PRX Quantum","element":"span"},{"text":", 4(4):040337, 2023. doi: 10.1103/PRXQuantum.4.040337.","element":"span"}],[{"id":"id-77","text":"[29] ","element":"span"},{"text":"M. C. Caro. Learning quantum processes and Hamiltonians via the Pauli transfer matrix. 2022. arXiv:2212.04471.","element":"span"}],[{"text":"[30] A. Angrisani. Learning unitaries with quantum statistical queries. 2023. arXiv:2310.02254.","element":"span"}],[{"text":"[31] C. Wadhwa and M. Doosti. Learning quantum processes with quantum statistical queries. 2023. arXiv:2310.02075.","element":"span"}],[{"id":"id-14","text":"[32] ","element":"span"},{"text":"H. Zhao, L. Lewis, I. Kannan, Y. Quek, H.-Y. Huang, and M. C. Caro. Learning quantum states and unitaries of bounded gate complexity. 2023. 
arXiv:2310.19882.","element":"span"}],[{"id":"id-16","text":"[33] ","element":"span"},{"text":"M. Riebe, K. Kim, P. Schindler, T. Monz, P. O. Schmidt, T. K. Körber, W. Hänsel, H. Häffner, C. F. Roos, and R. Blatt. Process tomography of ion trap quantum gates. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Phys. Rev. Lett.","element":"span"},{"text":", 97:220407, 2006. doi: 10.1103/PhysRevLett.97.220407.","element":"span"}],[{"text":"[34] R. C. Bialczak, M. Ansmann, M. Hofheinz, E. Lucero, M. Neeley, A. D. O’Connell, D. Sank, H. Wang, J. Wenner, M. Steffen, A. N. Cleland, and J. M. Martinis. Quantum process tomography of a universal entangling gate implemented with Josephson phase qubits. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Physics","element":"span"},{"text":", 6:409–413, 2010. doi: 10.1038/nphys1639.","element":"span"}],[{"id":"id-17","text":"[35] ","element":"span"},{"text":"J. Helsen, M. Ioannou, J. Kitzinger, E. Onorati, A. H. Werner, J. Eisert, and I. Roth. Estimating gate-set properties from random sequences. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Communications","element":"span"},{"text":", 14:5039, 2023. doi: 10.1038/s41467-023-39382-9.","element":"span"}],[{"id":"id-18","text":"[36] ","element":"span"},{"text":"L. G. Valiant. A theory of the learnable. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Communications of the ACM","element":"span"},{"text":", 27(11):1134–1142, 1984. ISSN 00010782. doi: 10.1145/1968.1972.","element":"span"}],[{"id":"id-20","text":"[37] ","element":"span"},{"text":"N. Littlestone. Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Machine Learning","element":"span"},{"text":", 2(4):285–318, 1988. ISSN 1573-0565.","element":"span"}],[{"id":"id-21","text":"[38] ","element":"span"},{"text":"S. Shalev-Shwartz and S. Ben-David. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Understanding Machine Learning: From Theory to Algorithms","element":"span"},{"text":". Cambridge University Press, 2014. doi: 10.1017/CBO9781107298019.","element":"span"}],[{"id":"id-22","text":"[39] ","element":"span"},{"text":"M. Mohri, Afshin Rostamizadeh, and Ameet Talwalkar. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Foundations of Machine Learning","element":"span"},{"text":". MIT Press, 2nd edition, 2018.","element":"span"}],[{"id":"id-23","text":"[40] ","element":"span"},{"text":"M. Kearns, M. Li, L. Pitt, and L. G. Valiant. Recent results on Boolean concept learning. In Pat Langley, editor, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the Fourth International Workshop on Machine Learning","element":"span"},{"text":", pages 337–352. Morgan Kaufmann, 1987. doi: 10.1016/B978-0-934613-41-5.50037-4.","element":"span"}],[{"text":"[41] N. Littlestone. From on-line to batch learning. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the Second Annual Workshop on Computational Learning Theory","element":"span"},{"text":", COLT ’89, pages 269–284, San Francisco, CA, USA, 1989. Morgan Kaufmann Publishers Inc. doi: 10.1016/B978-0-08-094829-4.50022-2.","element":"span"}],[{"text":"[42] L. Gretta and E. Price. An improved online reduction from PAC learning to mistake-bounded learning. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"2023 Symposium on Simplicity in Algorithms (SOSA)","element":"span"},{"text":", pages 373–380, 2023. doi: 10.1137/1.9781611977585.ch34.","element":"span"}],[{"text":"[43] N. Cesa-Bianchi, A. Conconi, and C. Gentile. On the generalization ability of on-line learning algorithms. In T. Dietterich, S. Becker, and Z. Ghahramani, editors, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Advances in Neural Information Processing Systems","element":"span"},{"text":", volume 14. 
MIT Press, 2001.","element":"span"}],[{"id":"id-24","text":"[44] ","element":"span"},{"text":"N. Cesa-Bianchi, A. Conconi, and C. Gentile. ","element":"span"},{"text":"On the generalization ability of on-line learning algorithms. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"IEEE Transactions on Information Theory","element":"span"},{"text":", 50(9):2050–2057, 2004. doi: 10.1109/TIT.2004.833339.","element":"span"}],[{"id":"id-25","text":"[45] ","element":"span"},{"text":"A. L. Blum. Separating distribution-free and mistake-bound learning models over the Boolean domain. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"SIAM Journal on Computing","element":"span"},{"text":", 23(5):990–1000, 1994.","element":"span"}],[{"id":"id-28","text":"[46] E. Hazan. Introduction to online convex optimization. 2021. arXiv:1909.05207.","element":"span"}],[{"id":"id-30","text":"[47] ","element":"span"},{"text":"H.-C. Cheng, M.-H. Hsieh, and P.-C. Yeh. The learnability of unknown quantum measurements. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum Information and Computation","element":"span"},{"text":", 16(7), 2016.","element":"span"}],[{"id":"id-64","text":"[48] ","element":"span"},{"text":"M. C. Caro and I. Datta. Pseudo-dimension of quantum circuits. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum Machine Intelligence","element":"span"},{"text":", 2:14, 2020. doi: 10.1007/s42484-020-00027-5.","element":"span"}],[{"text":"[49] C. M. Popescu. Learning bounds for quantum circuits in the agnostic setting. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum Information Processing","element":"span"},{"text":", 20(9):1–24, 2021.","element":"span"}],[{"id":"id-31","text":"[50] ","element":"span"},{"text":"H. Cai, Q. Ye, and D.-L. Deng. Sample complexity of learning parametric quantum circuits. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum Science and Technology","element":"span"},{"text":", 7(2):025014, 2022.","element":"span"}],[{"id":"id-36","text":"[51] ","element":"span"},{"text":"J. J. Wallman and J. Emerson. ","element":"span"},{"text":"Noise tailoring for scalable quantum computation via randomized compiling. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 94:052325, 2016. doi: 10.1103/PhysRevA.94.052325.","element":"span"}],[{"id":"id-41","text":"[52] ","element":"span"},{"text":"O. Regev. On lattices, learning with errors, random linear codes, and cryptography. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of the ACM (JACM)","element":"span"},{"text":", 56(6):1–40, 2009. doi: 10.1145/1568318.1568324.","element":"span"}],[{"text":"[53] P. Ananth, A. Poremba, and V. Vaikuntanathan. Revocable cryptography from learning with errors. ","element":"span"},{"text":"In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Theory of Cryptography Conference","element":"span"},{"text":", pages 93–122. Springer, 2023. ","element":"span"},{"text":"doi: 10.1007/978-3-031-48624-1_4.","element":"span"}],[{"id":"id-42","text":"[54] ","element":"span"},{"text":"D. Aggarwal, H. Bennett, Z. Brakerski, A. Golovnev, R. Kumar, Z. Li, S. Peters, N. Stephens-Davidowitz, and V. Vaikuntanathan. Lattice problems beyond polynomial time. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the 55th Annual ACM Symposium on Theory of Computing","element":"span"},{"text":", pages 1516–1526, 2023. doi: 10.1145/3564246.3585227.","element":"span"}],[{"id":"id-44","text":"[55] ","element":"span"},{"text":"C. Dwork, V. Feldman, M. Hardt, T. Pitassi, O. Reingold, and A. Roth. Preserving Statistical Validity in Adaptive Data Analysis. 
","element":"span"},{"text":"In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing","element":"span"},{"text":", STOC 2015, pages 117–126, New York, NY, USA, 2015. Association for Computing Machinery. doi: 10.1145/2746539.2746580.","element":"span"}],[{"id":"id-45","text":"[56] ","element":"span"},{"text":"R. Bassily, K. Nissim, A. Smith, T. Steinke, U. Stemmer, and J. Ullman. Algorithmic Stability for Adaptive Data Analysis. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"SIAM Journal on Computing","element":"span"},{"text":", 50(3):STOC16–377–STOC16–405, 2016. doi: 10.1137/16M1103646.","element":"span"}],[{"id":"id-51","text":"[57] ","element":"span"},{"text":"G. Chiribella, G. Mauro D’Ariano, and P. Perinotti. Theoretical framework for quantum networks. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 80:022339, 2009. doi: 10.1103/PhysRevA.80.022339.","element":"span"}],[{"id":"id-54","text":"[58] M. Born. Zur Quantenmechanik der Stoßvorgänge. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Zeitschrift für Physik","element":"span"},{"text":", 37:863–867, 1926.","element":"span"}],[{"id":"id-55","text":"[59] ","element":"span"},{"text":"M. Ziman. Process positive-operator-valued measure: A mathematical framework for the description of process tomography experiments. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 77:062112, 2008. doi: 10.1103/PhysRevA.77.062112.","element":"span"}],[{"id":"id-56","text":"[60] ","element":"span"},{"text":"G. Gutoski and J. Watrous. Toward a general theory of quantum games. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing","element":"span"},{"text":", STOC ’07, pages 565–574, New York, NY, USA, 2007. 
Association for Computing Machinery. doi: 10.1145/1250790.1250873.","element":"span"}],[{"id":"id-57","text":"[61] ","element":"span"},{"text":"F. A. Pollock, C. Rodríguez-Rosario, T. Frauenheim, M. Paternostro, and K. Modi. Non-Markovian quantum processes: Complete framework and efficient characterization. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 97:012127, 2018. doi: 10.1103/PhysRevA.97.012127.","element":"span"}],[{"id":"id-208","text":"[62] ","element":"span"},{"text":"S. Milz and K. Modi. Quantum stochastic processes and quantum non-Markovian phenomena. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"PRX Quantum","element":"span"},{"text":", 2:030201, 2021. doi: 10.1103/PRXQuantum.2.030201.","element":"span"}],[{"id":"id-58","text":"[63] ","element":"span"},{"text":"G. D. Berk, A. J. P. Garner, B. Yadin, K. Modi, and F. A. Pollock. Resource theories of multi-time processes: A window into quantum non-Markovianity. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum","element":"span"},{"text":", 5:435, 2021. ISSN 2521-327X. doi: 10.22331/q-2021-04-20-435.","element":"span"}],[{"id":"id-59","text":"[64] ","element":"span"},{"text":"G. A. L. White, F. A. Pollock, L. C. L. Hollenberg, K. Modi, and C. D. Hill. Non-Markovian quantum process tomography. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"PRX Quantum","element":"span"},{"text":", 3:020344, 2022. doi: 10.1103/PRXQuantum.3.020344.","element":"span"}],[{"id":"id-62","text":"[65] ","element":"span"},{"text":"A. Abbas, R. King, H.-Y. Huang, W. J. Huggins, R. Movassagh, D. Gilboa, and J. McClean. On quantum backpropagation, information reuse, and cheating measurement collapse. In A. Oh, T. Neumann, A. Globerson, K. Saenko, M. Hardt, and S. 
Levine, editors, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Advances in Neural Information Processing Systems","element":"span"},{"text":", volume 36, pages 44792–44819. Curran Associates, Inc., 2023.","element":"span"}],[{"id":"id-65","text":"[66] ","element":"span"},{"text":"C.-C. Chen, M. Watabe, K. Shiba, M. Sogabe, K. Sakamoto, and T. Sogabe. ","element":"span"},{"text":"On the expressibility and overfitting of quantum circuit learning. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"ACM Transactions on Quantum Computing","element":"span"},{"text":", 2(2):1–24, 2021. doi: 10.1145/3466797.","element":"span"}],[{"id":"id-148","text":"[67] ","element":"span"},{"text":"Y. Du, Z. Tu, X. Yuan, and D. Tao. Efficient measure for the expressivity of variational quantum algorithms. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 128(8):080506, 2022. doi: 10.1103/PhysRevLett.128.080506.","element":"span"}],[{"id":"id-81","text":"[68] ","element":"span"},{"text":"M. C. Caro, H.-Y. Huang, M. Cerezo, K. Sharma, A. Sornborger, L. Cincio, and P. J. Coles. Generalization in quantum machine learning from few training data. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Communications","element":"span"},{"text":", 13:4919, 2022. doi: 10.1038/s41467-022-32550-3.","element":"span"}],[{"id":"id-66","text":"[69] ","element":"span"},{"text":"M. C. Caro, H.-Y. Huang, N. Ezzell, J. Gibbs, A. T. Sornborger, L. Cincio, P. J. Coles, and Z. Holmes. Out-of-distribution generalization for learning quantum dynamics. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Communications","element":"span"},{"text":", 14:3751, 2023. doi: 10.1038/s41467-023-39381-w.","element":"span"}],[{"id":"id-67","text":"[70] ","element":"span"},{"text":"K.-M. Chung and H.-H. Lin. 
Sample efficient algorithms for learning quantum channels in PAC model and the approximate state discrimination problem. In Min-Hsiu Hsieh, editor, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"16th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2021)","element":"span"},{"text":", volume 197 of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Leibniz International Proceedings in Informatics (LIPIcs)","element":"span"},{"text":", pages 3:1–3:22, Dagstuhl, Germany, 2021. Schloss Dagstuhl – Leibniz-Zentrum für Informatik.","element":"span"}],[{"text":"[71] M. C. Caro. Binary classification with classical instances and quantum labels. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum Machine Intelligence","element":"span"},{"text":", 3:18, 2021. doi: 10.1007/s42484-021-00043-z.","element":"span"}],[{"id":"id-68","text":"[72] ","element":"span"},{"text":"M. Fanizza, Yihui Quek, and M. Rosati. Learning quantum processes without input control. 2022. arXiv:2211.05005.","element":"span"}],[{"id":"id-69","text":"[73] ","element":"span"},{"text":"S. T. Flammia and Joel J. Wallman. Efficient estimation of Pauli channels. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"ACM Transactions on Quantum Computing","element":"span"},{"text":", 1(1):1–32, 2020. doi: 10.1145/3408039.","element":"span"}],[{"id":"id-70","text":"[74] ","element":"span"},{"text":"S. T. Flammia and R. O’Donnell. Pauli error estimation via population recovery. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum","element":"span"},{"text":", 5:549, 2021. doi: 10.22331/q-2021-09-23-549.","element":"span"}],[{"id":"id-71","text":"[75] ","element":"span"},{"text":"O. Fawzi, A. Oufkir, and D. Stilck França. Lower bounds on learning Pauli channels. 2023. arXiv:2301.09192.","element":"span"}],[{"id":"id-72","text":"[76] ","element":"span"},{"text":"R. Harper, S. T. Flammia, and J. J. Wallman. 
Efficient learning of quantum noise. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Physics","element":"span"},{"text":", 16(12):1184–1188, 2020. doi: 10.1038/s41567-020-0992-8.","element":"span"}],[{"id":"id-73","text":"[77] ","element":"span"},{"text":"C. Rouzé and D. S. França. Efficient learning of the structure and parameters of local Pauli noise channels. 2023. arXiv:2307.02959.","element":"span"}],[{"id":"id-74","text":"[78] ","element":"span"},{"text":"S. Chen, S. Zhou, A. Seif, and L. Jiang. Quantum advantages for Pauli channel estimation. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 105(3):032435, 2022. doi: 10.1103/PhysRevA.105.032435.","element":"span"}],[{"text":"[79] S. Chen, C. Oh, S. Zhou, H.-Y. Huang, and L. Jiang. Tight bounds on Pauli channel learning without entanglement. 2023. arXiv:2309.13461.","element":"span"}],[{"id":"id-75","text":"[80] ","element":"span"},{"text":"S. Chen and Weiyuan Gong. Futility and utility of a few ancillas for Pauli channel learning. 2023. arXiv:2309.14326v1.","element":"span"}],[{"id":"id-76","text":"[81] ","element":"span"},{"text":"S. Chen and Weiyuan Gong. Efficient Pauli channel estimation with logarithmic quantum memory. 2023. arXiv:2309.14326.","element":"span"}],[{"id":"id-78","text":"[82] ","element":"span"},{"text":"A. Rakhlin, K. Sridharan, and A. Tewari. Online learning via sequential complexities. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of Machine Learning Research","element":"span"},{"text":", 16(6):155–186, 2015.","element":"span"}],[{"id":"id-79","text":"[83] ","element":"span"},{"text":"A. Rakhlin, K. Sridharan, and A. Tewari. Sequential complexities and uniform martingale laws of large numbers. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Probability theory and related fields","element":"span"},{"text":", 161:111–153, 2015. 
doi: 10.1007/s00440-013-0545-5.","element":"span"}],[{"id":"id-84","text":"[84] ","element":"span"},{"text":"S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method: a meta-algorithm and applications. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Theory of Computing","element":"span"},{"text":", 8(6):121–164, 2012. doi: 10.4086/toc.2012.v008a006.","element":"span"}],[{"id":"id-85","text":"[85] ","element":"span"},{"text":"V. Lyubashevsky, C. Peikert, and O. Regev. On ideal lattices and learning with errors over rings. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Advances in Cryptology–EUROCRYPT 2010: 29th Annual International Conference on the Theory and Applications of Cryptographic Techniques, French Riviera, May 30–June 3, 2010. Proceedings 29","element":"span"},{"text":", pages 1–23. Springer, 2010. doi: 10.1007/978-3-642-13190-5_1.","element":"span"}],[{"id":"id-86","text":"[86] ","element":"span"},{"text":"G. Gutoski. On a measure of distance for quantum strategies. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of Mathematical Physics","element":"span"},{"text":", 53(3):032202, 2012. doi: 10.1063/1.3693621.","element":"span"}],[{"id":"id-90","text":"[87] ","element":"span"},{"text":"H.-Y. Huang, Yunchao Liu, M. Broughton, I. Kim, Anurag Anshu, Zeph Landau, and Jarrod R McClean. Learning shallow quantum circuits. 2024. arXiv:2401.10095.","element":"span"}],[{"id":"id-94","text":"[88] ","element":"span"},{"text":"J. Haah, R. Kothari, and E. Tang. Optimal learning of quantum Hamiltonians from high-temperature Gibbs states. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS)","element":"span"},{"text":", pages 135–146. IEEE, 2022. doi: 10.1109/FOCS54457.2022.00020.","element":"span"}],[{"text":"[89] Daniel Stilck França, Liubov A Markovich, VV Dobrovitski, Albert H Werner, and Johannes Borregaard. 
Efficient and robust estimation of many-qubit Hamiltonians. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Communications","element":"span"},{"text":", 15(1):311, 2024. doi: 10.1038/s41467-023-44012-5.","element":"span"}],[{"text":"[90] Andi Gu, Lukasz Cincio, and Patrick J Coles. ","element":"span"},{"text":"Practical Hamiltonian learning with unitary dynamics and Gibbs states. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Communications","element":"span"},{"text":", 15(1):312, 2024. ","element":"span"},{"text":"doi: 10.1038/s41467-023-44008-1.","element":"span"}],[{"text":"[91] F. Wilde, A. Kshetrimayum, I. Roth, D. Hangleiter, R. Sweke, and J. Eisert. Scalably learning quantum many-body Hamiltonians from dynamical data. 2022. arXiv:2209.14328.","element":"span"}],[{"text":"[92] W. Yu, J. Sun, Z. Han, and X. Yuan. Robust and efficient Hamiltonian learning. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum","element":"span"},{"text":", 7:1045, 2023. ISSN 2521-327X. doi: 10.22331/q-2023-06-29-1045.","element":"span"}],[{"text":"[93] H.-Y. Huang, Yu Tong, Di Fang, and Yuan Su. Learning many-body Hamiltonians with Heisenberg-limited scaling. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 130(20):200403, 2023. doi: 10.1103/PhysRevLett.130.200403.","element":"span"}],[{"text":"[94] H. Li, Y. Tong, H. Ni, T. Gefen, and L. Ying. Heisenberg-limited Hamiltonian learning for interacting bosons. 2023. arXiv:2307.04690.","element":"span"}],[{"text":"[95] T. Möbus, A. Bluhm, M. C. Caro, A. H. Werner, and C. Rouzé. Dissipation-enabled bosonic Hamiltonian learning via new information-propagation bounds. 2023. arXiv:2307.15026.","element":"span"}],[{"text":"[96] J. Castaneda and N. Wiebe. Hamiltonian learning via shadow tomography of pseudo-Choi states. 2023. arXiv:2308.13020.","element":"span"}],[{"text":"[97] A. Bakshi, A. Liu, A. Moitra, and E. Tang. 
Learning quantum Hamiltonians at any temperature in polynomial time. 2023. arXiv:2310.02243.","element":"span"}],[{"text":"[98] J. Haah, R. Kothari, and E. Tang. Learning quantum Hamiltonians from high-temperature Gibbs states and real-time evolutions. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Physics","element":"span"},{"text":", pages 1–5, 2024. ","element":"span"},{"text":"doi: 10.1038/s41567-023-02376-x.","element":"span"}],[{"text":"[99] A. Bluhm, M. C. Caro, and A. Oufkir. Hamiltonian property testing. 2024. arXiv:2403.02968.","element":"span"}],[{"id":"id-95","text":"[100] ","element":"span"},{"text":"Ainesh Bakshi, Allen Liu, Ankur Moitra, and Ewin Tang. Structure learning of Hamiltonians from real-time evolution. 2024. URL ","element":"span"},{"href":"https://arxiv.org/abs/2405.00082","text":"https://arxiv.org/abs/2405.00082","element":"a"},{"text":". arXiv preprint arXiv:2405.00082.","element":"span"}],[{"id":"id-110","text":"[101] ","element":"span"},{"text":"J. Watrous. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"The theory of quantum information","element":"span"},{"text":". Cambridge University Press, 2018. doi: 10.1017/9781316848142.","element":"span"}],[{"id":"id-115","text":"[102] ","element":"span"},{"text":"W. Dür, M. Hein, J. I. Cirac, and H.-J. Briegel. Standard forms of noisy quantum operations via depolarization. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 72:052326, 2005. doi: 10.1103/PhysRevA.72.052326.","element":"span"}],[{"id":"id-117","text":"[103] ","element":"span"},{"text":"J. Burniston, M. Grabowecky, C. M. Scandolo, G. Chiribella, and G. Gour. ","element":"span"},{"text":"Necessary and sufficient conditions on measurements of quantum channels. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences","element":"span"},{"text":", 476(2236):20190832, 2020. doi: 10.1098/rspa.2019.0832.","element":"span"}],[{"id":"id-119","text":"[104] ","element":"span"},{"text":"A. Y. Kitaev. Quantum computations: algorithms and error correction. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Russian Mathematical Surveys","element":"span"},{"text":", 52(6):1191, 1997. doi: 10.1070/RM1997v052n06ABEH002155.","element":"span"}],[{"id":"id-120","text":"[105] ","element":"span"},{"text":"G. Gutoski. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Quantum strategies and local operations","element":"span"},{"text":". PhD thesis, University of Waterloo, 2010. ","element":"span"},{"href":"https://arxiv.org/abs/1003.0038","text":"https://arxiv.org/abs/1003.0038","element":"a"},{"text":".","element":"span"}],[{"id":"id-122","text":"[106] ","element":"span"},{"text":"G. Chiribella, G. M. D’Ariano, and P. Perinotti. Quantum circuit architecture. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 101:060401, 2008. doi: 10.1103/PhysRevLett.101.060401.","element":"span"}],[{"id":"id-128","text":"[107] ","element":"span"},{"text":"A. Lowe. Learning Quantum States Without Entangled Measurements. Master’s thesis, University of Waterloo, 2021.","element":"span"}],[{"id":"id-129","text":"[108] ","element":"span"},{"text":"Y. Freund and R. E. Schapire. Adaptive Game Playing Using Multiplicative Weights. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Games and Economic Behavior","element":"span"},{"text":", 29(1):79–103, 1999. ISSN 0899-8256.","element":"span"}],[{"id":"id-130","text":"[109] ","element":"span"},{"text":"N. Littlestone and Manfred K Warmuth. The weighted majority algorithm. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Information and computation","element":"span"},{"text":", 108(2):212–261, 1994. doi: 10.1006/inco.1994.1009.","element":"span"}],[{"id":"id-131","text":"[110] ","element":"span"},{"text":"S. Arora and S. Kale. A combinatorial, primal-dual approach to semidefinite programs. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"J. ACM","element":"span"},{"text":", 63(2), 2016. ISSN 0004-5411. doi: 10.1145/2837020.","element":"span"}],[{"id":"id-132","text":"[111] ","element":"span"},{"text":"S. Arora, E. Hazan, and S. Kale. Fast algorithms for approximate semidefinite programming using the multiplicative weights update method. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05)","element":"span"},{"text":", pages 339–348, 2005.","element":"span"}],[{"id":"id-133","text":"[112] ","element":"span"},{"text":"R. Jain, Z. Ji, S. Upadhyay, and J. Watrous. QIP = PSPACE. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of the ACM","element":"span"},{"text":", 58(6), 2011. ISSN 0004-5411.","element":"span"}],[{"id":"id-135","text":"[113] ","element":"span"},{"text":"Koji Tsuda, Gunnar Rätsch, and Manfred K. Warmuth. Matrix exponentiated gradient updates for on-line learning and Bregman projection. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of Machine Learning Research","element":"span"},{"text":", 6(34):995–1018, 2005.","element":"span"}],[{"id":"id-137","text":"[114] ","element":"span"},{"text":"Y. Freund and R. E. Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of Computer and System Sciences","element":"span"},{"text":", 55(1):119–139, 1997. ISSN 0022-0000.","element":"span"}],[{"id":"id-145","text":"[115] ","element":"span"},{"text":"D. Petz. 
Bregman divergence as relative operator entropy. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Acta Mathematica Hungarica","element":"span"},{"text":", 116:127–131, 2007.","element":"span"}],[{"id":"id-146","text":"[116] ","element":"span"},{"text":"F. G. S. L. Brandão, W. Chemissany, N. Hunter-Jones, R. Kueng, and J. Preskill. Models of quantum complexity growth. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"PRX Quantum","element":"span"},{"text":", 2:030316, 2021. doi: 10.1103/PRXQuantum.2.030316.","element":"span"}],[{"id":"id-147","text":"[117] ","element":"span"},{"text":"J. Haferkamp, P. Faist, N. B. T. Kothakonda, J. Eisert, and N. Yunger Halpern. Linear growth of quantum circuit complexity. 2021. arXiv:2106.05305.","element":"span"}],[{"id":"id-161","text":"[118] ","element":"span"},{"text":"R. Vershynin. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"High-Dimensional Probability: An Introduction with Applications in Data Science","element":"span"},{"text":". Cambridge University Press, 2018. doi: 10.1017/9781108231596.","element":"span"}],[{"id":"id-172","text":"[119] ","element":"span"},{"text":"S. Floyd and M. Warmuth. Sample compression, learnability, and the Vapnik-Chervonenkis dimension. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Machine learning","element":"span"},{"text":", 21:269–304, 1995. doi: 10.1023/A:1022660318680.","element":"span"}],[{"id":"id-173","text":"[120] ","element":"span"},{"text":"S. Hanneke, A. Kontorovich, and M. Sadigurschi. Sample compression for real-valued learners. In A. Garivier and S. Kale, editors, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the 30th International Conference on Algorithmic Learning Theory","element":"span"},{"text":", volume 98 of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of Machine Learning Research","element":"span"},{"text":", pages 466–488. 
PMLR, 2019.","element":"span"}],[{"id":"id-175","text":"[121] ","element":"span"},{"text":"J. D. Halpern and A. Levy. The ordering theorem does not imply the axiom of choice. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Notices of the American Mathematical Society","element":"span"},{"text":", 11:56, 1964. doi: 10.1007/978-94-015-8988-8_4.","element":"span"}],[{"text":"[122] J. D. Halpern and A. Levy. The Boolean prime ideal theorem does not imply the axiom of choice. Axiomatic Set Theory, Proc. Sympos. Pure Math. 13, Part I, 83–134, 1971.","element":"span"}],[{"id":"id-176","text":"[123] ","element":"span"},{"text":"C. González. Dense orderings, partitions and weak forms of choice. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Fundamenta Mathematicae","element":"span"},{"text":", 147(1):11–25, 1995.","element":"span"}],[{"id":"id-177","text":"[124] ","element":"span"},{"text":"A. A. Fraenkel and Y. Bar-Hillel. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Foundations of Set Theory","element":"span"},{"text":". Elsevier, Atlantic Highlands, NJ, USA, 1973.","element":"span"}],[{"id":"id-180","text":"[125] ","element":"span"},{"text":"V. N. Vapnik and A. Ya. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Theory of Probability & Its Applications","element":"span"},{"text":", 16(2):264–280, 1971. doi: 10.1137/1116025.","element":"span"}],[{"id":"id-184","text":"[126] ","element":"span"},{"text":"O. Borisovich Lupanov. Ob odnom métodé sintéza shém (on a method of circuit synthesis). ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Izvéstiá vysših učébnyh zavédénij, Radiofizika","element":"span"},{"text":", 1:120–140, 1958.","element":"span"}],[{"id":"id-185","text":"[127] ","element":"span"},{"text":"T. Toffoli. Reversible computing. 
In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"International colloquium on automata, languages, and programming","element":"span"},{"text":", pages 632–644. Springer, 1980. doi: 10.1007/3-540-10003-2_104.","element":"span"}],[{"id":"id-186","text":"[128] ","element":"span"},{"text":"J. J. Vartiainen, M. Möttönen, and M. M. Salomaa. Efficient decomposition of quantum gates. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 92(17):177902, 2004. doi: 10.1103/PhysRevLett.92.177902.","element":"span"}],[{"text":"[129] M. Möttönen, J. J. Vartiainen, V. Bergholm, and M. M. Salomaa. Quantum circuits for general multiqubit gates. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review Letters","element":"span"},{"text":", 93(13):130502, 2004. doi: 10.1103/PhysRevLett.93.130502.","element":"span"}],[{"id":"id-187","text":"[130] ","element":"span"},{"text":"V. V. Shende, S. S. Bullock, and I. L. Markov. Synthesis of quantum logic circuits. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Proceedings of the 2005 Asia and South Pacific Design Automation Conference","element":"span"},{"text":", pages 272–275, 2005. doi: 10.1145/1120725.1120847.","element":"span"}],[{"id":"id-190","text":"[131] ","element":"span"},{"text":"A. A. Mele and Y. Herasymenko. Efficient learning of quantum states prepared with few fermionic non-Gaussian gates. 2024. arXiv:2402.18665.","element":"span"}],[{"id":"id-191","text":"[132] ","element":"span"},{"text":"O. Goldreich, S. Goldwasser, and S. Micali. How to construct random functions. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of the ACM (JACM)","element":"span"},{"text":", 33(4):792–807, 1986.","element":"span"}],[{"id":"id-194","text":"[133] ","element":"span"},{"text":"A. Banerjee, C. Peikert, and A. Rosen. Pseudorandom functions and lattices. 
In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Annual International Conference on the Theory and Applications of Cryptographic Techniques","element":"span"},{"text":", pages 719–737. Springer, 2012. doi: 10.1007/978-3-642-29011-4_42.","element":"span"}],[{"id":"id-195","text":"[134] ","element":"span"},{"text":"S. Arunachalam, A. B. Grilo, and A. Sundaram. Quantum hardness of learning shallow classical circuits. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"SIAM Journal on Computing","element":"span"},{"text":", 50(3):972–1013, 2021. doi: 10.1137/20M1344202.","element":"span"}],[{"id":"id-200","text":"[135] ","element":"span"},{"text":"J. Watrous. Semidefinite programs for completely bounded norms. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Theory of Computing","element":"span"},{"text":", 5: 217–238, 2009. doi: 10.4086/toc.2009.v005a011.","element":"span"}],[{"id":"id-205","text":"[136] ","element":"span"},{"text":"R. R. Tucci. Quantum Bayesian nets. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"International Journal of Modern Physics B","element":"span"},{"text":", 09(03): 295–337, 1995. doi: 10.1142/S0217979295000148.","element":"span"}],[{"text":"[137] D. Beckman, Daniel Gottesman, M. A. Nielsen, and J. Preskill. Causal and localizable quantum operations. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 64:052309, 2001. doi: 10.1103/PhysRevA.64.052309.","element":"span"}],[{"text":"[138] T. Eggeling, D. Schlingemann, and R. F. Werner. Semicausal operations are semilocalizable. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Europhysics Letters","element":"span"},{"text":", 57(6):782, 2002. doi: 10.1209/epl/i2002-00579-4.","element":"span"}],[{"text":"[139] O. Oreshkov, F. Costa, and C. Brukner. Quantum correlations with no causal order. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nature Communications","element":"span"},{"text":", 3:1092, 2012.","element":"span"}],[{"text":"[140] F. Costa and S. Shrapnel. Quantum causal modelling. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"New Journal of Physics","element":"span"},{"text":", 18(6):063032, 2016. doi: 10.1088/1367-2630/18/6/063032.","element":"span"}],[{"id":"id-206","text":"[141] ","element":"span"},{"text":"J.-M. A. Allen, J. Barrett, D. C. Horsman, C. M. Lee, and R. W. Spekkens. Quantum common causes and quantum causal models. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review X","element":"span"},{"text":", 7:031021, 2017. ","element":"span"},{"text":"doi: 10.1103/PhysRevX.7.031021.","element":"span"}],[{"id":"id-207","text":"[142] ","element":"span"},{"text":"D. Kretschmann and R. F. Werner. Quantum channels with memory. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 72: 062323, 2005.","element":"span"}],[{"id":"id-209","text":"[143] ","element":"span"},{"text":"Y. Aharonov, S. Popescu, J. Tollaksen, and L. Vaidman. Multiple-time states and multiple-time measurements in quantum mechanics. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Physical Review A","element":"span"},{"text":", 79:052110, 2009. doi: 10.1103/ PhysRevA.79.052110.","element":"span"}],[{"id":"id-231","text":"[144] ","element":"span"},{"text":"I. S. Dhillon and J. A. Tropp. Matrix Nearness Problems with Bregman Divergences. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"SIAM Journal on Matrix Analysis and Applications","element":"span"},{"text":", 29(4):1120–1146, 2008.","element":"span"}]]},{"heading":"A From qubits to qudits","paragraphs":[[{"text":"Although all of our results have been phrased for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit systems and channels, they apply equally well to qudit systems and qudit channels. This essentially amounts to replacing all of the Pauli ","element":"span"},{"text":"operators ","element":"span"},{"style":{"height":12},"width":255.07,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-0.png","element":"img","alt":" P z,x with the","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"Heisenberg–Weyl operators","element":"span"},{"text":", sometimes known as the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"qudit/generalized Pauli operators","element":"span"},{"text":". These operators are defined as [","element":"span"},{"href":"#id-110","referenceIndex":101,"text":"101","element":"a"},{"text":"]","element":"span"}],[{"style":{"width":"72%"},"width":1361,"height":339,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-1.png","element":"img"}],[{"text":"where the addition in the definition of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"X","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"x","element":"span"},{"text":") is performed modulo ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d","element":"span"},{"text":", with ","element":"span"},{"style":{"height":17.6},"width":266.46,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-2.png","element":"img","alt":" d ∈ {2, 3, . . . }","inline":true},{"text":". 
These operators are unitary and orthogonal, i.e.,","element":"span"}],[{"style":{"width":"81%"},"width":1533,"height":57,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-3.png","element":"img"}],[{"text":"for ","element":"span"},{"style":{"height":17.6},"width":521.78,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-4.png","element":"img","alt":" z, x, z′, x′ ∈ {0, 1, . . . , d−1}","inline":true},{"text":". Consequently, they form a basis for the vector space L(","element":"span"},{"style":{"height":15.53},"width":49.51,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-5.png","element":"img","alt":"Cd","inline":true},{"text":") of linear operators acting on ","element":"span"},{"style":{"height":15.53},"width":158.21,"height":38.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-6.png","element":"img","alt":" Cd. The","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"qudit Bell states ","element":"span"},{"text":"are then defined as","element":"span"}],[{"id":"id-223","style":{"width":"85%"},"width":1594,"height":124,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-7.png","element":"img"}],[{"text":"The qudit Bell state vectors ","element":"span"},{"style":{"height":17.6},"width":107.58,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-8.png","element":"img","alt":" |Φz,x⟩","inline":true,"padRight":true},{"text":"form an orthonormal basis for ","element":"span"},{"style":{"height":16.74},"width":151.25,"height":41.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-9.png","element":"img","alt":" Cd ⊗ Cd","inline":true},{"text":", and the qudit Bell states Φ","element":"span"},{"style":{"height":8.4},"width":45.49,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-10.png","element":"img","alt":"z,x 
","inline":true,"padRight":true},{"text":"form a POVM, meaning that ","element":"span"},{"style":{"height":25.01},"width":434.24,"height":62.53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-11.png","element":"img","alt":"�d−1z,x=0 Φz,x = 1d ⊗ 1d.","inline":true}],[{"text":"The qudit/generalized Pauli channels are defined analogously to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli channels as","element":"span"}],[{"style":{"width":"66%"},"width":1246,"height":128,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-12.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.6},"width":163.98,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-13.png","element":"img","alt":" p(z, x) ∈","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] and ","element":"span"},{"style":{"height":23.7},"width":1382.36,"height":59.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-14.png","element":"img","alt":"�d−1z,x=0 p(z, x) = 1. 
The Choi representations of these channels have the","inline":true,"padRight":true},{"text":"form","element":"span"}],[{"id":"id-204","style":{"width":"83%"},"width":1564,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-15.png","element":"img"}],[{"text":"It follows from this (and using orthonormality of the Bell states) that the error rates ","element":"span"},{"style":{"fontStyle":"italic"},"text":"p","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"z, x","element":"span"},{"text":") can be obtained as","element":"span"}],[{"id":"id-227","style":{"width":"63%"},"width":1181,"height":76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-16.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":430.33,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-17.png","element":"img","alt":" z, x ∈ {0, 1, . . . , d − 1}","inline":true},{"text":". In other words, we can recover the error rates by performing the qudit Bell basis measurement on the Choi state ","element":"span"},{"style":{"height":21.29},"width":119.53,"height":53.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-18.png","element":"img","alt":"1dC(N","inline":true},{"text":") of the channel. 
Conversely, every positive ","element":"span"},{"text":"semi-definite bipartite operator ","element":"span"},{"style":{"height":12.8},"width":76.15,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-19.png","element":"img","alt":" Y ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":16.73},"width":153.79,"height":41.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-20.png","element":"img","alt":"Cd ⊗ Cd","inline":true},{"text":") of the form (","element":"span"},{"href":"#id-204","text":"A.5","element":"a"},{"text":") corresponds to a Pauli channel with error rates given by ","element":"span"},{"style":{"height":21.29},"width":966.48,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-21.png","element":"img","alt":" p(z, x) = 1dTr[Φz,xY ] for all z, x ∈ {0, 1, . . . , d − 1}.","inline":true}],[{"text":"For a quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":": L(","element":"span"},{"style":{"height":19.53},"width":124.26,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-22.png","element":"img","alt":"Cd) →","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":19.53},"width":356.03,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-23.png","element":"img","alt":"Cd), d ∈ {2, 3, . . . 
}","inline":true},{"text":", we define its (qudit) Pauli-twirled version as","element":"span"}],[{"id":"id-222","style":{"width":"73%"},"width":1372,"height":120,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-24.png","element":"img"}],[{"text":"where the superscript “","element":"span"},{"text":"W","element":"span"},{"text":"” in ","element":"span"},{"style":{"height":16.33},"width":74.23,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-25.png","element":"img","alt":" N W","inline":true,"padRight":true},{"text":"refers to the set ","element":"span"},{"style":{"height":17.6},"width":225.68,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-26.png","element":"img","alt":" W := {W z,x","inline":true,"padRight":true},{"text":": ","element":"span"},{"style":{"height":17.6},"width":445.31,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-27.png","element":"img","alt":" z, x ∈ {0, 1, . . . , d − 1}}","inline":true,"padRight":true},{"text":"of qudit Pauli operators. In Proposition ","element":"span"},{"href":"#id-203","text":"59","element":"a"},{"text":", we show that the twirled channel ","element":"span"},{"style":{"height":16.33},"width":74.23,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/69-28.png","element":"img","alt":" N W ","inline":true,"padRight":true},{"text":"is indeed a Pauli channel.","element":"span"}],[{"id":"id-210","style":{"width":"63%"},"width":1188,"height":571,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-0.png","element":"img"}],[{"text":"Figure 5: ","element":"figcaption","subtype":"caption"},{"text":"(Top) A multi-time quantum process with ","element":"figcaption","subtype":"caption"},{"style":{"fontStyle":"italic"},"text":"r ","element":"figcaption","subtype":"caption"},{"text":"= 4 time steps. 
The input systems are ","element":"figcaption","subtype":"caption"},{"style":{"height":14.8},"width":182.21,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-1.png","element":"img","alt":" A1, . . . , A4","inline":true},{"text":", the output systems are ","element":"figcaption","subtype":"caption"},{"style":{"height":14},"width":182.88,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-2.png","element":"img","alt":" B1, . . . , B4","inline":true},{"text":", and the memory systems are ","element":"figcaption","subtype":"caption"},{"style":{"height":14},"width":216.04,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-3.png","element":"img","alt":" M1, M2, M3.","inline":true,"padRight":true},{"text":"(Bottom) Every multi-time process is associated with the channel ","element":"figcaption","subtype":"caption"},{"style":{"height":14.98},"width":72.24,"height":37.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-4.png","element":"img","alt":" N [r]","inline":true},{"text":", obtained by collapsing the causal ordering of the inputs and outputs.","element":"figcaption","subtype":"caption"}]]},{"heading":"B Multi-time quantum processes","paragraphs":[[{"text":"In this section, we provide some background on multi-time quantum processes. 
Such objects are also known as “quantum strategies” [","element":"span"},{"href":"#id-56","referenceIndex":60,"text":"60","element":"a"},{"text":"] and “quantum combs” [","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":"], and they also constitute specific examples of quantum causal networks [","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":", ","element":"span"},{"href":"#id-205","referenceIndex":136,"text":"136","element":"a"},{"text":"–","element":"span"},{"href":"#id-206","referenceIndex":141,"text":"141","element":"a"},{"text":"] and quantum channels with memory [","element":"span"},{"href":"#id-207","referenceIndex":142,"text":"142","element":"a"},{"text":"]. They also provide a model for discrete-time non-Markovian quantum stochastic processes [","element":"span"},{"href":"#id-57","referenceIndex":61,"text":"61","element":"a"},{"text":", ","element":"span"},{"href":"#id-208","referenceIndex":62,"text":"62","element":"a"},{"text":"], and it is in this context that they are known as “multi-time quantum processes”—see also Refs. [","element":"span"},{"href":"#id-58","referenceIndex":63,"text":"63","element":"a"},{"text":", ","element":"span"},{"href":"#id-209","referenceIndex":143,"text":"143","element":"a"},{"text":"].","element":"span"}],[{"id":"id-107","style":{"fontWeight":"bold"},"text":"B.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Definitions and basic properties","element":"span"}],[{"text":"A general multi-time process is depicted in Figure ","element":"span"},{"href":"#id-210","text":"5 ","element":"a"},{"text":"(top) as the comb object in blue with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"= 4 time steps. 
The input systems are ","element":"span"},{"style":{"height":16},"width":267.4,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-5.png","element":"img","alt":" A1, A2, . . . , Ar","inline":true},{"text":", the output systems are ","element":"span"},{"style":{"height":15.2},"width":268.52,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-6.png","element":"img","alt":" B1, B2, . . . , Br","inline":true},{"text":", and the memory systems are ","element":"span"},{"style":{"height":15.2},"width":340.79,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-7.png","element":"img","alt":" M1, M2, . . . , Mr−1","inline":true},{"text":". Every multi-time process is associated with a quantum channel ","element":"span"},{"style":{"height":20.41},"width":1191.61,"height":51.02,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-8.png","element":"img","alt":" N [r] : L(HA1 ⊗ HA2 ⊗ · · · ⊗ HAr) → L(HB1 ⊗ HB2 ⊗ · · · ⊗ HBr","inline":true},{"text":"), defined by concatenating the maps ","element":"span"},{"style":{"height":15.94},"width":54.23,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-9.png","element":"img","alt":" N i","inline":true,"padRight":true},{"text":"in the manner shown in Figure ","element":"span"},{"href":"#id-210","text":"5 ","element":"a"},{"text":"(bottom). As shown in Refs. 
[","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":", ","element":"span"},{"href":"#id-56","referenceIndex":60,"text":"60","element":"a"},{"text":"], due to the causal constraints, the Choi representation of the channel ","element":"span"},{"style":{"height":17.13},"width":76.98,"height":42.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-10.png","element":"img","alt":" N [r]","inline":true},{"text":", which completely characterizes the multi-time process, is in one-to-one correspondence with a set of so-called ","element":"span"},{"style":{"fontStyle":"italic"},"text":"comb operators","element":"span"},{"text":", which have a very specific structure based on the causal ordering of the individual elements of the process.","element":"span"}],[{"id":"id-212","style":{"fontWeight":"bold"},"text":"Definition 51 ","element":"span"},{"text":"(Comb operator for multi-time quantum process)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Every multi-time quantum process with ","element":"span"},{"style":{"height":12.8},"width":115.64,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-11.png","element":"img","alt":" r ∈ N","inline":true,"padRight":true},{"text":"time steps, as depicted in Figure ","element":"span"},{"href":"#id-210","text":"5","element":"a"},{"text":", is represented by a positive semi-definite ","element":"span"},{"style":{"fontStyle":"italic"},"text":"comb operator ","element":"span"},{"style":{"height":14.62},"width":50.06,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-12.png","element":"img","alt":" Nr","inline":true},{"text":", defined to be the Choi representation of the quantum channel associated with the process (see Figure ","element":"span"},{"href":"#id-210","text":"5","element":"a"},{"text":"). 
For every comb operator ","element":"span"},{"style":{"height":14.62},"width":50.06,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-13.png","element":"img","alt":" Nr","inline":true},{"text":", there exist positive semi-definite operators ","element":"span"},{"style":{"height":26.85},"width":674.79,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-14.png","element":"img","alt":"Nk ∈ L(H(k)A,B), k ∈ {1, 2, . . . , r − 1}","inline":true},{"text":", such that the following constraints are satisfied:","element":"span"}],[{"id":"id-211","style":{"width":"78%"},"width":1470,"height":215,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/70-15.png","element":"img"}],[{"text":"A positive semi-definite operator ","element":"span"},{"style":{"height":26.85},"width":228.4,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-0.png","element":"img","alt":" P ∈ L(H(r)A,B","inline":true},{"text":") is the comb operator of an ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time quantum ","element":"span"},{"text":"process with input systems ","element":"span"},{"style":{"height":16},"width":196.35,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-1.png","element":"img","alt":" A1, . . . , Ar","inline":true,"padRight":true},{"text":"and output systems ","element":"span"},{"style":{"height":15.2},"width":197.1,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-2.png","element":"img","alt":" B1, . . . 
, Br","inline":true,"padRight":true},{"text":"if and only if there exists a set ","element":"span"},{"style":{"height":18.27},"width":161,"height":45.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-3.png","element":"img","alt":"{Nk}rk=1 ","inline":true,"padRight":true},{"text":"of positive semi-definite operators such that ","element":"span"},{"style":{"height":14.62},"width":141.64,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-4.png","element":"img","alt":" P = Nr","inline":true,"padRight":true},{"text":"and the constraints in (","element":"span"},{"href":"#id-211","text":"B.1","element":"a"},{"text":")–(","element":"span"},{"href":"#id-211","text":"B.3","element":"a"},{"text":") are ","element":"span"},{"text":"satisfied. We let ","element":"span"},{"style":{"height":17.6},"width":559.6,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-5.png","element":"img","alt":" COMBr(A1, . . . , Ar; B1, . . . Br","inline":true},{"text":") denote the convex set of all operators in L(","element":"span"},{"style":{"height":26.85},"width":97.64,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-6.png","element":"img","alt":"H(r)A,B","inline":true},{"text":") ","element":"span"},{"text":"representing ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time quantum processes with input systems ","element":"span"},{"style":{"height":16},"width":196.35,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-7.png","element":"img","alt":" A1, . . . 
, Ar","inline":true,"padRight":true},{"text":"and output systems","element":"span"}],[{"style":{"width":"13%"},"width":257,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-8.png","element":"img"}],[{"text":"To understand Definition ","element":"span"},{"href":"#id-212","text":"51","element":"a"},{"text":", let us see how the constraints on the comb operators in (","element":"span"},{"href":"#id-211","text":"B.1","element":"a"},{"text":")–(","element":"span"},{"href":"#id-211","text":"B.3","element":"a"},{"text":") are manifested in the Choi representation of the channel ","element":"span"},{"style":{"height":17.13},"width":77.7,"height":42.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-9.png","element":"img","alt":" N [4]","inline":true,"padRight":true},{"text":"depicted in Figure ","element":"span"},{"href":"#id-210","text":"5 ","element":"a"},{"text":"(bottom). By definition, we have","element":"span"}],[{"style":{"width":"99%"},"width":1872,"height":391,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-10.png","element":"img"}],[{"style":{"height":19.61},"width":180.54,"height":49.04,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-11.png","element":"img","alt":"= TrM3A′4","inline":true}],[{"style":{"width":"4%"},"width":91,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-12.png","element":"img"}],[{"style":{"height":53.7},"width":1790.26,"height":134.26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-13.png","element":"img","alt":"= TrM3[(N 3M2A′3→M3B3 ◦ N 2M1A′2→M2B2 ◦ N 1A′1→M1B1)(ΓA1A′1 ⊗ ΓA2A′2 ⊗ ΓA3A′3)] ⊗ TrA′4[ΓA4A′4], where TrA′4[ΓA4A′4] = 1A4","inline":true},{"style":{"height":21.93},"width":235.66,"height":54.82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-14.png","element":"img","alt":"= N3 ⊗ 1A4,","inline":true}],[{"text":"which is precisely the constraint in 
(","element":"span"},{"href":"#id-211","text":"B.1","element":"a"},{"text":"), where in the last line we let","element":"span"}],[{"style":{"width":"94%"},"width":1778,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-15.png","element":"img"}],[{"text":"In a similar manner, we find that","element":"span"}],[{"text":"Tr","element":"span"},{"style":{"height":21.93},"width":1526.06,"height":54.82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-16.png","element":"img","alt":"B3[N3] = N2 ⊗ 1A3, (B.7)","inline":true},{"style":{"height":28.8},"width":759.63,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-17.png","element":"img","alt":"N2 := TrM2��N 2M1A′2→M2B2 ◦ N 1A′1→M1B1","inline":true}],[{"text":"Tr","element":"span"},{"style":{"height":21.93},"width":1526.06,"height":54.82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-18.png","element":"img","alt":"B2[N2] = N1 ⊗ 1A2, (B.9)","inline":true},{"style":{"height":28.8},"width":464.35,"height":72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-19.png","element":"img","alt":"N1 := TrM1��N 1A′1→M1B1","inline":true}],[{"style":{"width":"83%"},"width":1566,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-20.png","element":"img"}],[{"text":"which reproduces the constraints in (","element":"span"},{"href":"#id-211","text":"B.2","element":"a"},{"text":") and (","element":"span"},{"href":"#id-211","text":"B.3","element":"a"},{"text":").","element":"span"}],[{"text":"We can also consider a multi-time process with a measurement in the final time step, sometimes called a “measuring strategy”.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Definition 52 ","element":"span"},{"text":"(Comb operators for multi-time quantum process with measurement)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Every multi-time quantum process with measurement, consisting of ","element":"span"},{"style":{"height":12.8},"width":127.71,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/71-21.png","element":"img","alt":" r ∈ N","inline":true,"padRight":true},{"text":"time steps and input systems","element":"span"}],[{"id":"id-213","style":{"width":"71%"},"width":1347,"height":231,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-0.png","element":"img"}],[{"text":"Figure 6: ","element":"figcaption","subtype":"caption"},{"text":"Concatenation of a multi-time process ","element":"figcaption","subtype":"caption"},{"style":{"height":17.38},"width":785.82,"height":43.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-1.png","element":"img","alt":" N [4] with r = 4 time steps, represented by the","inline":true,"padRight":true},{"text":"blue quantum comb, with a corresponding tester ","element":"figcaption","subtype":"caption"},{"style":{"height":14.58},"width":58.48,"height":36.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-2.png","element":"img","alt":" E[4]","inline":true},{"text":", which is represented by the red quantum comb. The operations ","element":"figcaption","subtype":"caption"},{"style":{"height":13.79},"width":178.78,"height":34.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-3.png","element":"img","alt":" N i and Ei ","inline":true,"padRight":true},{"text":"can be arbitrary quantum channels, and they can also more generally be arbitrary Hermiticity-preserving maps.","element":"figcaption","subtype":"caption"}],[{"style":{"height":16},"width":196.35,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-4.png","element":"img","alt":"A1, . . . 
, Ar","inline":true,"padRight":true},{"text":"and output systems ","element":"span"},{"style":{"height":15.2},"width":197.1,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-5.png","element":"img","alt":" B1, . . . , Br","inline":true},{"text":", is represented by a set ","element":"span"},{"style":{"height":18.22},"width":144.4,"height":45.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-6.png","element":"img","alt":" {Nr;x}x","inline":true,"padRight":true},{"text":"of positive semi-definite operators ","element":"span"},{"style":{"height":17.02},"width":133.57,"height":42.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-7.png","element":"img","alt":" Nr;x ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":26.85},"width":97.64,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-8.png","element":"img","alt":"H(r)A,B","inline":true},{"text":"), such that ","element":"span"},{"style":{"height":18.22},"width":805.97,"height":45.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-9.png","element":"img","alt":" ∑x Nr;x ∈ COMBr(A1, . . . , Ar; B1, . . . , Br","inline":true},{"text":"). 
","element":"span"},{"text":"A finite set ","element":"span"},{"style":{"height":17.6},"width":110.46,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-10.png","element":"img","alt":"{Sx}x","inline":true,"padRight":true},{"text":"of positive semi-definite operators in L(","element":"span"},{"style":{"height":26.85},"width":97.63,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-11.png","element":"img","alt":"H(r)A,B","inline":true},{"text":") defines an ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time quantum process ","element":"span"},{"text":"with measurement, with input systems ","element":"span"},{"style":{"height":16},"width":196.36,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-12.png","element":"img","alt":" A1, . . . , Ar","inline":true,"padRight":true},{"text":"and output systems ","element":"span"},{"style":{"height":15.2},"width":197.09,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-13.png","element":"img","alt":" B1, . . . 
, Br","inline":true},{"text":", if and only if","element":"span"}],[{"style":{"width":"44%"},"width":826,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-14.png","element":"img"}],[{"text":"A measurement, or ","element":"span"},{"style":{"fontStyle":"italic"},"text":"tester","element":"span"},{"text":", for a multi-time process is another multi-time process that consists of an input state ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-15.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"in the first time step and a measurement in the final time step, as shown by the red comb object in Figure ","element":"span"},{"href":"#id-213","text":"6 ","element":"a"},{"text":"for ","element":"span"},{"style":{"height":16},"width":1157.22,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-16.png","element":"img","alt":" r = 4 time steps. Testers are sometimes called “measuring","inline":true,"padRight":true},{"text":"co-strategies”.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Definition 53 ","element":"span"},{"text":"(Comb operators for multi-time quantum tester)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Every multi-time quantum tester with ","element":"span"},{"style":{"height":12.8},"width":106.23,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-17.png","element":"img","alt":" r ∈ N","inline":true,"padRight":true},{"text":"time steps, consisting of input systems ","element":"span"},{"style":{"height":15.2},"width":197.1,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-18.png","element":"img","alt":" B1, . . . 
, Br","inline":true,"padRight":true},{"text":"and output systems ","element":"span"},{"style":{"height":16},"width":196.36,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-19.png","element":"img","alt":" A1, . . . , Ar","inline":true},{"text":", is represented by a set ","element":"span"},{"style":{"height":18.22},"width":141.54,"height":45.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-20.png","element":"img","alt":" {Er;x}x","inline":true,"padRight":true},{"text":"of positive semi-definite operators, with ","element":"span"},{"style":{"height":26.85},"width":273.24,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-21.png","element":"img","alt":" Er;x ∈ L(H(r)A,B","inline":true},{"text":"), such that ","element":"span"},{"style":{"height":17.35},"width":194.42,"height":43.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-22.png","element":"img","alt":" ∑x Er;x =","inline":true},{"style":{"height":21.85},"width":1871.74,"height":54.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-23.png","element":"img","alt":"Sr ⊗ 1Br, with Sr ∈ COMB∗r(A1, . . . , Ar; B1, . . . , Br−1). Here, COMB∗r(A1, . . . , Ar; B1, . . . , Br−1) :=","inline":true},{"style":{"height":17.6},"width":675.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-24.png","element":"img","alt":"COMBr(∅, B1, . . . , Br−1; A1, . . . , Ar","inline":true},{"text":") is the set of all multi-time processes in which the first input system is trivial; see (","element":"span"},{"href":"#id-169","text":"2.29","element":"a"},{"text":"). 
A finite set ","element":"span"},{"style":{"height":17.6},"width":109.2,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-25.png","element":"img","alt":" {Tx}x","inline":true,"padRight":true},{"text":"of positive semi-definite operators in L(","element":"span"},{"style":{"height":26.85},"width":257.4,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-26.png","element":"img","alt":"H(r)A,B) defines","inline":true,"padRight":true},{"text":"an ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-step multi-time quantum tester, with input systems ","element":"span"},{"style":{"height":15.2},"width":197.1,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-27.png","element":"img","alt":" B1, . . . , Br","inline":true,"padRight":true},{"text":"and output systems ","element":"span"},{"style":{"height":16},"width":211.57,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-28.png","element":"img","alt":" A1, . . . , Ar,","inline":true,"padRight":true},{"text":"if and only if there exists ","element":"span"},{"style":{"height":17.6},"width":727.57,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-29.png","element":"img","alt":" Er ∈ COMB∗r(A1, . . . , Ar; B1, . . . , Br−1","inline":true},{"text":") such that ","element":"span"},{"style":{"height":21.86},"width":410.05,"height":54.64,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-30.png","element":"img","alt":" ∑x Tx = Er ⊗ 1Br. ◀","inline":true}],[{"id":"id-88","style":{"fontWeight":"bold"},"text":"B.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Norms","element":"span"}],[{"text":"Norms for multi-time quantum processes have been defined in Refs. 
[","element":"span"},{"href":"#id-86","referenceIndex":86,"text":"86","element":"a"},{"text":", ","element":"span"},{"href":"#id-122","referenceIndex":106,"text":"106","element":"a"},{"text":"]. Here, we follow the presentation in Ref. [","element":"span"},{"href":"#id-86","referenceIndex":86,"text":"86","element":"a"},{"text":"].","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Definition 54 ","element":"span"},{"text":"(Strategy norm and its dual [","element":"span"},{"href":"#id-86","referenceIndex":86,"text":"86","element":"a"},{"text":"])","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":12.8},"width":124.07,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-31.png","element":"img","alt":" r ∈ N","inline":true},{"text":". ","element":"span"},{"text":"For every Hermitian operator ","element":"span"},{"style":{"height":26.85},"width":235.02,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-32.png","element":"img","alt":"H ∈ L(H(r)A,B","inline":true},{"text":"), we define the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"strategy norm ","element":"span"},{"style":{"height":17.6},"width":115.39,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-33.png","element":"img","alt":" ∥H∥⋄r","inline":true,"padRight":true},{"text":"and its dual ","element":"span"},{"style":{"height":17.6},"width":171.98,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-34.png","element":"img","alt":" ∥H∥∗⋄r as","inline":true}],[{"id":"id-214","style":{"width":"91%"},"width":1716,"height":376,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/72-35.png","element":"img"}],[{"text":"For every Hermiticity-preserving linear map 
","element":"span"},{"style":{"height":17.13},"width":76.98,"height":42.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-0.png","element":"img","alt":" N [r]","inline":true,"padRight":true},{"text":": L(","element":"span"},{"style":{"height":17.68},"width":392.42,"height":44.19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-1.png","element":"img","alt":"HA1 ⊗ · · · ⊗ HAr) →","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":16.48},"width":314.86,"height":41.19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-2.png","element":"img","alt":"HB1 ⊗ · · · ⊗ HBr","inline":true},{"text":"), in particular those corresponding to multi-time processes, its strategy ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r","element":"span"},{"text":"-norm is defined via the Choi representation as ","element":"span"},{"style":{"height":20.33},"width":512.67,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-3.png","element":"img","alt":" ∥N [r]∥⋄r := ∥C(N [r])∥⋄r. ◀","inline":true}],[{"text":"The norms ","element":"span"},{"style":{"height":17.6},"width":87.7,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-4.png","element":"img","alt":" ∥·∥⋄r","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.6},"width":87.69,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-5.png","element":"img","alt":" ∥·∥∗⋄r","inline":true,"padRight":true},{"text":"are (Hölder) dual to each other, and the proof of this can be found ","element":"span"},{"text":"in Ref. [","element":"span"},{"href":"#id-86","referenceIndex":86,"text":"86","element":"a"},{"text":"]. These norms should be thought of as generalizations of the trace norm and its dual (the spectral/operator norm) for quantum states and the diamond norm and its dual for quantum channels. 
Indeed, with respect to Figure ","element":"span"},{"href":"#id-213","text":"6","element":"a"},{"text":", in the case ","element":"span"},{"style":{"height":17.68},"width":839.46,"height":44.19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-6.png","element":"img","alt":" r = 1 and dA1 = 1, it holds that ∥·∥⋄1 ≡ ∥·∥1","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.63},"width":239.56,"height":44.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-7.png","element":"img","alt":" ∥·∥∗⋄1 ≡ ∥·∥∞","inline":true},{"text":", where we note that","element":"span"}],[{"id":"id-215","style":{"width":"85%"},"width":1600,"height":375,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-8.png","element":"img"}],[{"text":"for every Hermitian operator ","element":"span"},{"style":{"fontStyle":"italic"},"text":"H","element":"span"},{"text":". Similarly, in the case ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"= 1 and arbitrary dimension for the system ","element":"span"},{"style":{"height":15.42},"width":49.73,"height":38.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-9.png","element":"img","alt":" A1","inline":true},{"text":", we have that ","element":"span"},{"style":{"height":17.63},"width":549.76,"height":44.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-10.png","element":"img","alt":" ∥·∥⋄1 ≡ ∥·∥⋄. The norm ∥·∥∗⋄1","inline":true},{"text":", i.e., the Hölder dual to the diamond norm, ","element":"span"},{"text":"has been considered before in Ref. [","element":"span"},{"href":"#id-120","referenceIndex":105,"text":"105","element":"a"},{"text":", Section 5.3]. 
This dual norm is the relevant norm when considering observables for quantum channels, analogous to the role that the spectral norm ","element":"span"},{"style":{"height":17.6},"width":89.76,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-11.png","element":"img","alt":" ∥·∥∞","inline":true,"padRight":true},{"text":"has for observables for states. Using (","element":"span"},{"href":"#id-214","text":"B.14","element":"a"},{"text":"), it is straightforward to see that the diamond norm dual is given by the following primal-dual pair of semi-definite programs, where ","element":"span"},{"style":{"height":12.8},"width":80.97,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-12.png","element":"img","alt":" H ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":14.7},"width":180.43,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-13.png","element":"img","alt":"HA ⊗ HB","inline":true},{"text":") is Hermitian:","element":"span"}],[{"style":{"height":29.2},"width":1818.89,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-14.png","element":"img","alt":"∥H∥∗⋄1 := sup{Tr[H(S0 − S1)] : S0, S1 ∈ L(HA ⊗ HB), S0, S1 ≥ 0, TrB[S0 + S1] ≤ 1A} (B.19)","inline":true}],[{"id":"id-216","style":{"width":"89%"},"width":1682,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-15.png","element":"img"}],[{"text":"We now show that this norm is multiplicative for tensor-product operators, which is a relevant property when considering channel observables without memory; see Section ","element":"span"},{"href":"#id-52","text":"2.1 ","element":"a"},{"text":"for the relevant background information.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Lemma 55 ","element":"span"},{"text":"(Diamond norm dual for tensor-product 
operators)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":12.8},"width":81.32,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-16.png","element":"img","alt":" K ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":14.7},"width":61.85,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-17.png","element":"img","alt":"HA","inline":true},{"text":") and ","element":"span"},{"style":{"height":12.8},"width":70.83,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-18.png","element":"img","alt":" L ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":14.7},"width":62.85,"height":36.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-19.png","element":"img","alt":"HB","inline":true},{"text":") be Hermitian operators. It holds that ","element":"span"},{"style":{"height":17.64},"width":484.84,"height":44.09,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-20.png","element":"img","alt":" ∥K ⊗ L∥∗⋄1 = ∥K∥1∥L∥∞.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"This follows straightforwardly from semi-definite programming duality. 
Let ","element":"span"},{"style":{"height":17.2},"width":319.83,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-21.png","element":"img","alt":" M1, M2 ∈ L(HA)","inline":true,"padRight":true},{"text":"be the operators achieving the trace norm ","element":"span"},{"style":{"height":17.6},"width":100.82,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-22.png","element":"img","alt":" ∥K∥1","inline":true},{"text":", as in (","element":"span"},{"href":"#id-215","text":"B.15","element":"a"},{"text":"), and let ","element":"span"},{"style":{"height":16.43},"width":187.11,"height":41.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-23.png","element":"img","alt":" M′1, M′2 ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":14.7},"width":62.85,"height":36.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-24.png","element":"img","alt":"HB","inline":true},{"text":") be the ","element":"span"},{"text":"operators achieving the spectral norm ","element":"span"},{"style":{"height":17.6},"width":107.34,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-25.png","element":"img","alt":" ∥L∥∞","inline":true},{"text":", as in (","element":"span"},{"href":"#id-215","text":"B.17","element":"a"},{"text":"). 
Then, ","element":"span"},{"style":{"height":16.43},"width":278.22,"height":41.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-26.png","element":"img","alt":" S0 ≡ M1 ⊗ M′1","inline":true,"padRight":true},{"text":"+ ","element":"span"},{"style":{"height":16.43},"width":174.05,"height":41.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-27.png","element":"img","alt":" M2 ⊗ M′2","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":16.44},"width":498.54,"height":41.09,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-28.png","element":"img","alt":"S1 ≡ M1 ⊗ M′2 + M2 ⊗ M′1 ","inline":true,"padRight":true},{"text":"are readily verified to be feasible points in the SDP (","element":"span"},{"href":"#id-216","text":"B.19","element":"a"},{"text":"), which means ","element":"span"},{"text":"that ","element":"span"},{"style":{"height":17.64},"width":962.08,"height":44.09,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-29.png","element":"img","alt":" ∥K ⊗ L∥∗⋄1 ≥ Tr[(K ⊗ L)(S0 − S1)] = ∥K∥1∥L∥∞","inline":true},{"text":". For the reverse inequality, observe that ","element":"span"},{"style":{"height":17.6},"width":572.62,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-30.png","element":"img","alt":"Y ≡ Z∥L∥∞, where Z ∈ L(HA","inline":true},{"text":") achieves the trace norm ","element":"span"},{"style":{"height":17.6},"width":100.82,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-31.png","element":"img","alt":" ∥K∥1","inline":true,"padRight":true},{"text":"according to (","element":"span"},{"href":"#id-215","text":"B.16","element":"a"},{"text":"), is a feasible point in the SDP (","element":"span"},{"href":"#id-216","text":"B.20","element":"a"},{"text":"). 
Consequently, with this choice of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Y ","element":"span"},{"text":", we obtain ","element":"span"},{"style":{"height":17.63},"width":631.29,"height":44.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-32.png","element":"img","alt":" ∥K⊗L∥∗⋄1 ≤ Tr[Y ] = ∥K∥1∥L∥∞.","inline":true,"padRight":true},{"text":"This completes the proof. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/73-33.png","element":"img","alt":"■","inline":true}],[{"text":"Let ","element":"span"},{"style":{"height":14.62},"width":52.06,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-0.png","element":"img","alt":" N1","inline":true,"padRight":true},{"text":"be the comb operator of a multi-time process with ","element":"span"},{"style":{"height":10.22},"width":36.68,"height":25.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-1.png","element":"img","alt":" r1","inline":true,"padRight":true},{"text":"time steps, and let ","element":"span"},{"style":{"height":14.62},"width":52.06,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-2.png","element":"img","alt":" N2","inline":true,"padRight":true},{"text":"be the comb operator of a multi-time process with ","element":"span"},{"style":{"height":10.22},"width":36.68,"height":25.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-3.png","element":"img","alt":" r2","inline":true,"padRight":true},{"text":"time steps, such that both processes have some compatible input and output Hilbert spaces. 
The multi-time process resulting from the composition of the two processes is represented by the comb operator ","element":"span"},{"style":{"height":15.2},"width":314.04,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-4.png","element":"img","alt":" N1 ⋆ N2, where ⋆","inline":true,"padRight":true},{"text":"represents the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"link product ","element":"span"},{"text":"[","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":"].","element":"span"}],[{"text":"Just as the diamond norm is submultiplicative with respect to composition of linear maps, so too is the strategy norm submultiplicative with respect to concatenation of comb operators according to the link product. We prove this in our next result.","element":"span"}],[{"id":"id-220","style":{"fontWeight":"bold"},"text":"Proposition 56 ","element":"span"},{"text":"(Submultiplicativity of the strategy norm)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":14.62},"width":52.06,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-5.png","element":"img","alt":" N1","inline":true,"padRight":true},{"text":"be the representation of a multi-time process with ","element":"span"},{"style":{"height":10.22},"width":36.69,"height":25.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-6.png","element":"img","alt":" r1","inline":true,"padRight":true},{"text":"time steps, and let ","element":"span"},{"style":{"height":14.62},"width":52.06,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-7.png","element":"img","alt":" N2","inline":true,"padRight":true},{"text":"be the representation of a multi-time process with ","element":"span"},{"style":{"height":10.22},"width":36.68,"height":25.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-8.png","element":"img","alt":"r2","inline":true,"padRight":true},{"text":"time steps. Suppose that the composition of these processes, represented by ","element":"span"},{"style":{"height":14.62},"width":147.32,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-9.png","element":"img","alt":" N1 ⋆ N2","inline":true},{"text":", produces a multi-time process with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"time steps. Then, it holds that","element":"span"}],[{"style":{"width":"65%"},"width":1229,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-10.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"This result follows straightforwardly from semi-definite programming duality. 
In particular, we make use of (","element":"span"},{"href":"#id-214","text":"B.13","element":"a"},{"text":"), which we restate here for convenience as","element":"span"}],[{"id":"id-217","style":{"width":"77%"},"width":1443,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-11.png","element":"img"}],[{"text":"Now, let (","element":"span"},{"style":{"height":17.6},"width":333.24,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-12.png","element":"img","alt":"t1, P1) and (t2, P2","inline":true},{"text":") be the optimal feasible points corresponding to ","element":"span"},{"style":{"height":17.6},"width":408.13,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-13.png","element":"img","alt":" ∥N1∥⋄r1 and ∥N2∥⋄r2,","inline":true,"padRight":true},{"text":"respectively, meaning that ","element":"span"},{"style":{"height":17.6},"width":255.19,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-14.png","element":"img","alt":" t1 = ∥N1∥⋄r1","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.6},"width":255.19,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-15.png","element":"img","alt":" t2 = ∥N2∥⋄r2","inline":true},{"text":". 
We now show that (","element":"span"},{"style":{"height":15.2},"width":228.65,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-16.png","element":"img","alt":"t1t2, P1 ⋆ P2","inline":true},{"text":") constitutes a feasible point in the SDP in (","element":"span"},{"href":"#id-217","text":"B.22","element":"a"},{"text":") for ","element":"span"},{"style":{"height":17.6},"width":224.77,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-17.png","element":"img","alt":" ∥N1 ⋆ N2∥⋄r","inline":true},{"text":", implying the desired result.","element":"span"}],[{"text":"By definition, we have","element":"span"}],[{"style":{"width":"59%"},"width":1118,"height":111,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-18.png","element":"img"}],[{"text":"The right-most inequalities can be rewritten as ","element":"span"},{"style":{"height":14.62},"width":246.56,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-19.png","element":"img","alt":" t1P1 − N1 ≥","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"height":14.62},"width":246.57,"height":36.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-20.png","element":"img","alt":" t2P2 − N2 ≥","inline":true,"padRight":true},{"text":"0. Using the fact that the link product of positive semi-definite operators is positive semi-definite (see Ref. [","element":"span"},{"href":"#id-51","referenceIndex":57,"text":"57","element":"a"},{"text":", Theorem 2]), we obtain (","element":"span"},{"style":{"height":17.2},"width":486.06,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-21.png","element":"img","alt":"t1P1 −N1)⋆(t2P2 −N2) ≥","inline":true,"padRight":true},{"text":"0. 
Expanding the left-hand side of this inequality, we have","element":"span"}],[{"style":{"width":"76%"},"width":1437,"height":177,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-22.png","element":"img"}],[{"text":"Similarly, we have","element":"span"}],[{"id":"id-218","style":{"width":"76%"},"width":1430,"height":328,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/74-23.png","element":"img"}],[{"id":"id-219","style":{"width":"76%"},"width":1430,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-0.png","element":"img"}],[{"text":"Adding the inequalities in (","element":"span"},{"href":"#id-218","text":"B.30","element":"a"},{"text":") and (","element":"span"},{"href":"#id-219","text":"B.33","element":"a"},{"text":"), we obtain","element":"span"}],[{"style":{"width":"76%"},"width":1440,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-1.png","element":"img"}],[{"text":"As ","element":"span"},{"style":{"height":14.62},"width":133.13,"height":36.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-2.png","element":"img","alt":" P1 ⋆ P2","inline":true,"padRight":true},{"text":"is positive semi-definite and defines a multi-time process with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"steps, we conclude that (","element":"span"},{"style":{"height":15.2},"width":223.27,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-3.png","element":"img","alt":"t1t2, P1 ⋆ P2","inline":true},{"text":") is a feasible point in the SDP in (","element":"span"},{"href":"#id-217","text":"B.22","element":"a"},{"text":") for ","element":"span"},{"style":{"height":17.6},"width":226.1,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-4.png","element":"img","alt":" ∥N1 ⋆ N2∥⋄r","inline":true},{"text":", which implies the desired result. 
","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-5.png","element":"img","alt":"■","inline":true}],[{"id":"id-162","style":{"fontWeight":"bold"},"text":"Corollary 57 ","element":"span"},{"text":"(Subadditivity of the strategy norm under composition)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":15.2},"width":132.72,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-6.png","element":"img","alt":" N1, M1","inline":true,"padRight":true},{"text":"be representations of multi-time processes with ","element":"span"},{"style":{"height":10.22},"width":36.68,"height":25.55,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-7.png","element":"img","alt":" r1","inline":true,"padRight":true},{"text":"time steps, and let ","element":"span"},{"style":{"height":15.2},"width":132.72,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-8.png","element":"img","alt":" N2, M2","inline":true,"padRight":true},{"text":"be representations of multi-time processes with ","element":"span"},{"style":{"height":10.22},"width":36.69,"height":25.54,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-9.png","element":"img","alt":" r2","inline":true,"padRight":true},{"text":"time steps. Suppose that the composition of ","element":"span"},{"style":{"height":15.2},"width":715.05,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-10.png","element":"img","alt":" N1 with N2 and M1 with M2 produces","inline":true,"padRight":true},{"text":"multi-time processes with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"time steps. 
Then,","element":"span"}],[{"style":{"width":"78%"},"width":1479,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-11.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"By the triangle inequality, and making use of Proposition ","element":"span"},{"href":"#id-220","text":"56","element":"a"},{"text":", we have","element":"span"}],[{"style":{"width":"88%"},"width":1655,"height":243,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-12.png","element":"img"}],[{"text":"as required. Here, the last step used Proposition ","element":"span"},{"href":"#id-220","text":"56 ","element":"a"},{"text":"together with the fact that ","element":"span"},{"style":{"height":17.6},"width":381.25,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-13.png","element":"img","alt":" ∥N1∥r1, ∥M2∥r2 ≤ 1.","inline":true,"padRight":true},{"text":"The latter can be seen, for example, from Equation (","element":"span"},{"href":"#id-217","text":"B.22","element":"a"},{"text":"). 
","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-14.png","element":"img","alt":"■","inline":true}]]},{"heading":"C Pauli-twirl of quantum channels","paragraphs":[[{"text":"In this section, we prove (","element":"span"},{"href":"#id-221","text":"2.18","element":"a"},{"text":"), and thereby prove that the error-rate vector of the Pauli-twirled version of an arbitrary quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":": L(","element":"span"},{"style":{"height":19.54},"width":124.26,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-15.png","element":"img","alt":"Cd) →","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":19.54},"width":355.95,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-16.png","element":"img","alt":"Cd), d ∈ {2, 3, . . . }","inline":true},{"text":", is given by ","element":"span"},{"style":{"fontStyle":"italic"},"text":"p","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"z, x","element":"span"},{"text":") =","element":"span"}],[{"style":{"height":20.02},"width":256.07,"height":50.05,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-17.png","element":"img","alt":"dTr[Φz,xC(N","inline":true},{"text":")] for all ","element":"span"},{"style":{"height":17.6},"width":421.6,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-18.png","element":"img","alt":" z, x ∈ {0, 1, . . . , d − 1}","inline":true},{"text":". 
For generality, we prove the result in terms of qudit ","element":"span"},{"text":"Pauli channels (see Appendix ","element":"span"},{"text":"A","element":"span"},{"text":"), but an analogous proof to the one we present holds when the qudit Pauli operators are replaced by the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"n","element":"span"},{"text":"-qubit Pauli operators.","element":"span"}],[{"id":"id-202","style":{"fontWeight":"bold"},"text":"Lemma 58. ","element":"span"},{"text":"For every qudit quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":": L(","element":"span"},{"style":{"height":19.53},"width":131.36,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-19.png","element":"img","alt":"Cd) →","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":19.53},"width":375.74,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-20.png","element":"img","alt":"Cd), d ∈ {2, 3, . . . }","inline":true},{"text":", the Choi representation of its Pauli-twirled version ","element":"span"},{"style":{"height":16.33},"width":74.23,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-21.png","element":"img","alt":" N W","inline":true},{"text":", as defined in (","element":"span"},{"href":"#id-222","text":"A.7","element":"a"},{"text":"), is given by","element":"span"}],[{"style":{"width":"77%"},"width":1457,"height":128,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-22.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"By definition of the Choi representation, we have","element":"span"}],[{"style":{"width":"94%"},"width":1770,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/75-23.png","element":"img"}],[{"style":{"width":"80%"},"width":1507,"height":582,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-0.png","element":"img"}],[{"text":"as required, where for the third equality we have used the “transpose trick” (","element":"span"},{"style":{"height":21.85},"width":489.42,"height":54.63,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-1.png","element":"img","alt":"1d⊗X)|Γ⟩ = (XT⊗1d)|Γ⟩,","inline":true,"padRight":true},{"text":"which holds for every linear operator ","element":"span"},{"style":{"height":19.53},"width":282.82,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-2.png","element":"img","alt":" X ∈ L(Cd). ■","inline":true}],[{"id":"id-203","style":{"fontWeight":"bold"},"text":"Proposition 59. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":19.53},"width":592.37,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-3.png","element":"img","alt":" X ∈ L(Cd ⊗ Cd), d ∈ {2, 3, . . . 
}","inline":true},{"text":", and define ","element":"span"},{"style":{"height":15.25},"width":58.42,"height":38.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-4.png","element":"img","alt":" SW","inline":true,"padRight":true},{"text":"to be the pinching channel in the qudit Bell basis:","element":"span"}],[{"id":"id-226","style":{"width":"70%"},"width":1321,"height":128,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-5.png","element":"img"}],[{"text":"It holds that","element":"span"}],[{"style":{"width":"75%"},"width":1418,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-6.png","element":"img"}],[{"text":"Consequently, for every channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":": L(","element":"span"},{"style":{"height":19.53},"width":124.8,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-7.png","element":"img","alt":"Cd) →","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":15.53},"width":49.51,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-8.png","element":"img","alt":"Cd","inline":true},{"text":"), the Choi representation of its Pauli-twirled version ","element":"span"},{"style":{"height":16.34},"width":74.23,"height":40.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-9.png","element":"img","alt":" N W","inline":true},{"text":", defined in (","element":"span"},{"href":"#id-222","text":"A.7","element":"a"},{"text":"), is given by ","element":"span"},{"style":{"height":19.54},"width":380.79,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-10.png","element":"img","alt":" C(N W) = SW(C(N","inline":true},{"text":")). 
Furthermore, the corresponding error-rate vector of the Pauli-twirled channel is given by ","element":"span"},{"style":{"height":21.29},"width":451.94,"height":53.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-11.png","element":"img","alt":" p(z, x) = 1dTr[Φz,xC(N","inline":true},{"text":")] for all ","element":"span"},{"style":{"height":12.8},"width":111.7,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-12.png","element":"img","alt":" z, x ∈","inline":true},{"style":{"height":17.6},"width":313.5,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-13.png","element":"img","alt":"{0, 1, . . . , d − 1}.","inline":true}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"We make repeated use of the following facts about the qudit Pauli operators [","element":"span"},{"href":"#id-110","referenceIndex":101,"text":"101","element":"a"},{"text":"]:","element":"span"}],[{"id":"id-224","style":{"width":"66%"},"width":1249,"height":224,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-14.png","element":"img"}],[{"text":"which hold for all choices of ","element":"span"},{"style":{"height":17.6},"width":529.86,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-15.png","element":"img","alt":" z, x, z′, x′ ∈ {0, 1, . . . , d − 1}","inline":true},{"text":". Now, let us start by showing that","element":"span"}],[{"id":"id-225","style":{"width":"72%"},"width":1350,"height":130,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-16.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":423.54,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-17.png","element":"img","alt":" z, x ∈ {0, 1, . . . , d − 1}","inline":true},{"text":". 
To show this, we use the definition of Φ","element":"span"},{"style":{"height":8.4},"width":45.49,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-18.png","element":"img","alt":"z,x","inline":true,"padRight":true},{"text":"in (","element":"span"},{"href":"#id-223","text":"A.3","element":"a"},{"text":"), the properties in (","element":"span"},{"href":"#id-224","text":"C.5","element":"a"},{"text":")–(","element":"span"},{"href":"#id-224","text":"C.7","element":"a"},{"text":"), and the fact that the operators ","element":"span"},{"style":{"height":17.64},"width":965.93,"height":44.09,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-19.png","element":"img","alt":" {W z′1,x′1 ⊗ W z′2,x′2 : z′1, x′1, z′2, x′2 ∈ {0, 1, . . . , d − 1}}","inline":true,"padRight":true},{"text":"form a basis for L(","element":"span"},{"style":{"height":19.53},"width":184.18,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-20.png","element":"img","alt":"Cd ⊗ Cd),","inline":true}],[{"style":{"width":"80%"},"width":1511,"height":96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/76-21.png","element":"img"}],[{"style":{"width":"57%"},"width":1085,"height":582,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-0.png","element":"img"}],[{"text":"Therefore, using (","element":"span"},{"href":"#id-225","text":"C.8","element":"a"},{"text":"), along with the properties in (","element":"span"},{"href":"#id-224","text":"C.5","element":"a"},{"text":")–(","element":"span"},{"href":"#id-224","text":"C.7","element":"a"},{"text":") once more, we obtain","element":"span"}],[{"style":{"width":"3%"},"width":58,"height":24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-1.png","element":"img"}],[{"style":{"height":17.6},"width":180.42,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-2.png","element":"img","alt":"SW(X) 
=","inline":true}],[{"style":{"width":"95%"},"width":1796,"height":1016,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-3.png","element":"img"}],[{"text":"which proves (","element":"span"},{"href":"#id-226","text":"C.3","element":"a"},{"text":"). Then, if ","element":"span"},{"style":{"height":18.4},"width":206.49,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-4.png","element":"img","alt":" X ≡ C(N","inline":true},{"text":") is the Choi representation of a quantum channel ","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":": L(","element":"span"},{"style":{"height":19.54},"width":124.14,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-5.png","element":"img","alt":"Cd) →","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":15.54},"width":49.52,"height":38.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-6.png","element":"img","alt":"Cd","inline":true},{"text":"), using Lemma ","element":"span"},{"href":"#id-202","text":"58 ","element":"a"},{"text":"we see that ","element":"span"},{"style":{"height":23.7},"width":841.52,"height":59.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-7.png","element":"img","alt":" C(N W) = SW(C(N)) = ∑d−1z,x=0 Tr[Φz,xC(N","inline":true},{"text":")]Φ","element":"span"},{"style":{"height":11.94},"width":93.73,"height":29.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-8.png","element":"img","alt":"z,x 
=","inline":true},{"style":{"height":23.7},"width":402.07,"height":59.24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-9.png","element":"img","alt":"∑d−1z,x=01dTr[Φz,xC(N","inline":true},{"text":")]Γ","element":"span"},{"style":{"height":8.4},"width":45.5,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-10.png","element":"img","alt":"z,x","inline":true},{"text":". ","element":"span"},{"text":"By identifying with (","element":"span"},{"href":"#id-204","text":"A.5","element":"a"},{"text":"), we can see that the twirled channel is indeed a Pauli channel, and using (","element":"span"},{"href":"#id-227","text":"A.6","element":"a"},{"text":"), we see that the error-rate vector of the twirled channel is ","element":"span"},{"style":{"height":21.29},"width":1030.11,"height":53.23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-11.png","element":"img","alt":"p(z, x) = 1dTr[Φz,xC(N)] for all z, x ∈ {0, 1, . . . , d − 1}","inline":true},{"text":". This completes the proof. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/77-12.png","element":"img","alt":"■","inline":true}]]},{"heading":"D Entropic analysis of the MMW algorithm","paragraphs":[[{"text":"In this section, we provide an entropic analysis of the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"matrix multiplicative weights ","element":"span"},{"text":"(MMW) algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"2","element":"a"},{"text":") and its projected variant for Choi states (Algorithm ","element":"span"},{"href":"#id-144","text":"3","element":"a"},{"text":"). We note that an entropic ","element":"span"},{"text":"analysis similar to the one we provide here can be found in Ref. 
[","element":"span"},{"href":"#id-84","referenceIndex":84,"text":"84","element":"a"},{"text":", Theorem 2.4] for the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"multiplicative weights update ","element":"span"},{"text":"(MWU) algorithm. At its highest level, the MMW algorithm assigns initial weights to experts and iteratively updates these weights multiplicatively according to feedback on how well each expert has performed. It is known as a highly efficient method for solving convex optimization problems.","element":"span"}],[{"text":"We start with a proof of Proposition ","element":"span"},{"href":"#id-139","text":"14","element":"a"},{"text":". The bound we obtain applies also to the Hedge algorithm (Algorithm ","element":"span"},{"href":"#id-228","text":"5 ","element":"a"},{"text":"below), which is a special case of the MMW algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"2","element":"a"},{"text":") when the loss matrices ","element":"span"},{"style":{"height":15.93},"width":68.1,"height":39.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-0.png","element":"img","alt":" L(t)","inline":true,"padRight":true},{"text":"of the MMW algorithm are all diagonal in the same basis, such that the diagonal elements of ","element":"span"},{"style":{"height":15.93},"width":68.1,"height":39.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-1.png","element":"img","alt":" L(t)","inline":true,"padRight":true},{"text":"constitute the loss vector ","element":"span"},{"style":{"height":16.33},"width":83.46,"height":40.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-2.png","element":"img","alt":" m(t)","inline":true,"padRight":true},{"text":"in the Hedge algorithm. The bound we obtain for the Hedge algorithm is in general tighter than the one obtained in Ref. 
[","element":"span"},{"href":"#id-84","referenceIndex":84,"text":"84","element":"a"},{"text":", Theorem 2.3].","element":"span"}],[{"id":"id-228","style":{"width":"99%"},"width":1873,"height":429,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-3.png","element":"img"}],[{"id":"id-108","style":{"fontWeight":"bold"},"text":"D.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Proof of Proposition ","element":"span"},{"href":"#id-139","style":{"fontWeight":"bold"},"text":"14","element":"a"}],[{"text":"We start by noticing that in the proof of [","element":"span"},{"href":"#id-131","referenceIndex":110,"text":"110","element":"a"},{"text":", Theorem 3.1], the inequality","element":"span"}],[{"style":{"width":"74%"},"width":1401,"height":59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-4.png","element":"img"}],[{"text":"can be written as","element":"span"}],[{"style":{"width":"70%"},"width":1330,"height":100,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-5.png","element":"img"}],[{"text":"Taking the logarithm on both sides leads to","element":"span"}],[{"id":"id-229","style":{"width":"82%"},"width":1552,"height":53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-6.png","element":"img"}],[{"text":"Applying (","element":"span"},{"href":"#id-229","text":"D.3","element":"a"},{"text":") recursively leads to","element":"span"}],[{"style":{"width":"92%"},"width":1735,"height":348,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-7.png","element":"img"}],[{"text":"Now, let ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-8.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"be an arbitrary density operator. 
Then, noting that the von Neumann entropy of ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-9.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"is given by ","element":"span"},{"style":{"height":17.6},"width":371.38,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-10.png","element":"img","alt":" H(ρ) := −Tr[ρ log ρ","inline":true},{"text":"], we have that the relative entropy between ","element":"span"},{"style":{"height":19.53},"width":269.06,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-11.png","element":"img","alt":" ρ and ω(t) is10","inline":true}],[{"style":{"width":"75%"},"width":1418,"height":53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/78-12.png","element":"img"}],[{"style":{"width":"40%"},"width":762,"height":263,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-0.png","element":"img"}],[{"text":"Noting further that","element":"span"}],[{"style":{"width":"67%"},"width":1265,"height":482,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-1.png","element":"img"}],[{"text":"we obtain","element":"span"}],[{"id":"id-240","style":{"width":"91%"},"width":1723,"height":276,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-2.png","element":"img"}],[{"text":"where for the inequality on the last line we have used (","element":"span"},{"href":"#id-229","text":"D.3","element":"a"},{"text":"). 
This implies that","element":"span"}],[{"style":{"width":"93%"},"width":1752,"height":348,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-3.png","element":"img"}],[{"text":"Then, because ","element":"span"},{"style":{"height":20.33},"width":286.55,"height":50.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-4.png","element":"img","alt":" D(ρ∥ω(T+1)) ≥","inline":true,"padRight":true},{"text":"0, we obtain","element":"span"}],[{"style":{"width":"98%"},"width":1837,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-5.png","element":"img"}],[{"text":"Finally, because ","element":"span"},{"style":{"height":23.47},"width":185.72,"height":58.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-6.png","element":"img","alt":" ω(1) = 1d1","inline":true},{"text":", and because ","element":"span"},{"style":{"height":23.47},"width":447.03,"height":58.68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-7.png","element":"img","alt":" D(ρ∥ 1d1) = log d − H(ρ","inline":true},{"text":"), we can rearrange the inequality ","element":"span"},{"text":"above to obtain the desired result.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Remark 60. 
","element":"span"},{"text":"Let us make the following observations about the result in (","element":"span"},{"href":"#id-136","text":"2.44","element":"a"},{"text":").","element":"span"}],[{"text":"• If we let ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-8.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"be a rank-one density operator, then ","element":"span"},{"style":{"height":17.2},"width":845.78,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-9.png","element":"img","alt":" H(ρ) = 0, and then we can further minimize","inline":true,"padRight":true},{"text":"over all such rank-one density operators to obtain","element":"span"}],[{"style":{"width":"78%"},"width":1474,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/79-10.png","element":"img"}],[{"style":{"width":"56%"},"width":1052,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-0.png","element":"img"}],[{"text":"• It is worth noting that the MMW-based result for online learning of quantum states in Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":", Theorem 4] (see, in particular, the proof) presents the regret bound (using the notation of this section, and ","element":"span"},{"style":{"height":17.6},"width":142.26,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-1.png","element":"img","alt":" d = 2n)","inline":true}],[{"style":{"width":"80%"},"width":1517,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-2.png","element":"img"}],[{"text":"for every density operator ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-3.png","element":"img","alt":" ρ","inline":true},{"text":". 
Note that the regret bound in (","element":"span"},{"href":"#id-136","text":"2.44","element":"a"},{"text":") can in general be tighter than this bound, because of the entropy term ","element":"span"},{"href":"#id-136","style":{"height":17.6},"width":256.88,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-4.png","element":"img","alt":" H(ρ) in (2.44","inline":true},{"text":"), which is always non-negative.","element":"span"}],[{"text":"• We can minimize the right-hand side of (","element":"span"},{"href":"#id-136","text":"2.44","element":"a"},{"text":") with respect to ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-5.png","element":"img","alt":" ρ","inline":true},{"text":", in order to obtain the best possible upper bound on the total expected loss. In other words,","element":"span"}],[{"style":{"width":"91%"},"width":1721,"height":144,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-6.png","element":"img"}],[{"text":"From the connection between the MMW and Hedge algorithms noted at the beginning of this section, we immediately obtain the following regret bound for the Hedge algorithm from Proposition ","element":"span"},{"href":"#id-139","text":"14","element":"a"},{"text":".","element":"span"}],[{"id":"id-138","style":{"fontWeight":"bold"},"text":"Corollary 61 ","element":"span"},{"text":"(Entropic regret bound for the Hedge algorithm)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":12.8},"width":116.9,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-7.png","element":"img","alt":" T ∈ N","inline":true},{"text":", and consider a sequence ","element":"span"},{"style":{"height":19.13},"width":392.66,"height":47.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-8.png","element":"img","alt":"m(1), m(2), . . . , m(T)","inline":true,"padRight":true},{"text":"of loss vectors of size ","element":"span"},{"style":{"height":17.6},"width":261.12,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-9.png","element":"img","alt":" d ∈ {2, 3, . . . }","inline":true,"padRight":true},{"text":"along with the updates ","element":"span"},{"style":{"height":19.93},"width":64.63,"height":49.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-10.png","element":"img","alt":" p(t)","inline":true,"padRight":true},{"text":"provided by the Hedge algorithm in Algorithm ","element":"span"},{"href":"#id-228","text":"5","element":"a"},{"text":". 
Then, the following inequality holds:","element":"span"}],[{"style":{"width":"83%"},"width":1574,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-11.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"fontStyle":"italic","fontWeight":"bold"},"text":"q ","element":"span"},{"text":"is an arbitrary probability vector.","element":"span"}],[{"id":"id-109","style":{"fontWeight":"bold"},"text":"D.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"The projected MMW algorithm","element":"span"}],[{"text":"We start by defining the projection map as","element":"span"}],[{"id":"id-232","style":{"width":"82%"},"width":1552,"height":102,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-12.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":13.1},"width":85.72,"height":32.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-13.png","element":"img","alt":" σA,B","inline":true,"padRight":true},{"text":"is a density operator and the relative entropy ","element":"span"},{"style":{"height":17.6},"width":100.58,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-14.png","element":"img","alt":" D(·∥·","inline":true},{"text":") is defined in (","element":"span"},{"href":"#id-230","text":"2.51","element":"a"},{"text":"). We make use of the fact that the relative entropy is a Bregman divergence [","element":"span"},{"href":"#id-135","referenceIndex":113,"text":"113","element":"a"},{"text":", ","element":"span"},{"href":"#id-145","referenceIndex":115,"text":"115","element":"a"},{"text":", ","element":"span"},{"href":"#id-231","referenceIndex":144,"text":"144","element":"a"},{"text":"]. 
In particular, for ","element":"span"},{"style":{"height":11.2},"width":66.98,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-15.png","element":"img","alt":"ρ, σ","inline":true,"padRight":true},{"text":"density operators, with ","element":"span"},{"style":{"height":7.6},"width":25,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-16.png","element":"img","alt":" σ","inline":true,"padRight":true},{"text":"positive definite, we have that","element":"span"}],[{"style":{"width":"78%"},"width":1473,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-17.png","element":"img"}],[{"text":"where","element":"span"}],[{"id":"id-236","style":{"width":"61%"},"width":1156,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/80-18.png","element":"img"}],[{"id":"id-234","style":{"width":"63%"},"width":1192,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-0.png","element":"img"}],[{"text":"for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"P ","element":"span"},{"text":"positive semi-definite and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Q ","element":"span"},{"text":"positive definite. 
It follows that the projection map in (","element":"span"},{"href":"#id-232","text":"D.14","element":"a"},{"text":") is a Bregman projection; consequently, we have the so-called ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Pythagorean inequality ","element":"span"},{"text":"[","element":"span"},{"href":"#id-231","referenceIndex":144,"text":"144","element":"a"},{"text":"],","element":"span"}],[{"id":"id-233","style":{"width":"68%"},"width":1279,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-1.png","element":"img"}],[{"text":"for density operators ","element":"span"},{"style":{"height":15.6},"width":146.57,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-2.png","element":"img","alt":" ρ and σ","inline":true},{"text":". This inequality essentially tells us that projection only gets us closer to the set of CPTP maps, in the sense that","element":"span"}],[{"id":"id-241","style":{"width":"61%"},"width":1146,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-3.png","element":"img"}],[{"text":"which follows directly from (","element":"span"},{"href":"#id-233","text":"D.18","element":"a"},{"text":"), due to the fact that ","element":"span"},{"style":{"height":17.6},"width":187.94,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-4.png","element":"img","alt":" D(ρ∥σ) ≥","inline":true,"padRight":true},{"text":"0 for all density operators ","element":"span"},{"style":{"height":15.2},"width":106.73,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-5.png","element":"img","alt":" ρ and","inline":true},{"style":{"height":7.6},"width":25,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-6.png","element":"img","alt":"σ","inline":true},{"text":". 
We also require the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Pinsker inequality ","element":"span"},{"text":"[","element":"span"},{"href":"#id-110","referenceIndex":101,"text":"101","element":"a"},{"text":"]: for all density operators ","element":"span"},{"style":{"height":15.6},"width":160.46,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-7.png","element":"img","alt":" ρ and σ,","inline":true}],[{"id":"id-242","style":{"width":"60%"},"width":1140,"height":89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-8.png","element":"img"}],[{"text":"Finally, we make the observation that the update step 4 in the MMW algorithm (Algorithm ","element":"span"},{"href":"#id-134","text":"2 ","element":"a"},{"text":"and Algorithm ","element":"span"},{"href":"#id-144","text":"3","element":"a"},{"text":") can be written as","element":"span"}],[{"style":{"width":"80%"},"width":1505,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-9.png","element":"img"}],[{"text":"where we recall the expression for ","element":"span"},{"href":"#id-234","style":{"height":17.6},"width":250.72,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-10.png","element":"img","alt":" ∇F in (D.17)","inline":true},{"text":". 
With this observation, we can equivalently formulate Algorithm ","element":"span"},{"href":"#id-144","text":"3 ","element":"a"},{"text":"as the lazy version of Algorithm ","element":"span"},{"href":"#id-235","text":"6 ","element":"a"},{"text":"below, which is a ","element":"span"},{"style":{"fontStyle":"italic"},"text":"mirror descent ","element":"span"},{"text":"algorithm [","element":"span"},{"href":"#id-28","referenceIndex":46,"text":"46","element":"a"},{"text":", Section 5.3].","element":"span"}],[{"id":"id-235","style":{"width":"99%"},"width":1873,"height":863,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-11.png","element":"img"}],[{"id":"id-244","style":{"fontWeight":"bold"},"text":"Proposition 62 ","element":"span"},{"text":"(Regret bound for lazy mirror descent for Choi states)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":13.1},"width":85.72,"height":32.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-12.png","element":"img","alt":" σA,B","inline":true,"padRight":true},{"text":"be an arbitrary Choi state. Let ","element":"span"},{"style":{"height":12.8},"width":128.24,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-13.png","element":"img","alt":" T ∈ N","inline":true,"padRight":true},{"text":"be the number of rounds of interaction, and consider a sequence ","element":"span"},{"style":{"height":26.85},"width":394.02,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-14.png","element":"img","alt":"L(1)A,B, L(2)A,B, . . . 
, L(T)A,B ","inline":true,"padRight":true},{"text":"of cost matrices along with the updates ","element":"span"},{"style":{"height":26.85},"width":83.34,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-15.png","element":"img","alt":" ρ(t)A,B ","inline":true,"padRight":true},{"text":"provided by the lazy version of ","element":"span"},{"text":"Algorithm ","element":"span"},{"href":"#id-235","text":"6","element":"a"},{"text":". Then,","element":"span"}],[{"style":{"width":"95%"},"width":1793,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/81-16.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"It turns out that the lazy version of Algorithm ","element":"span"},{"href":"#id-235","text":"6 ","element":"a"},{"text":"is equivalent to the ","element":"span"},{"style":{"fontStyle":"italic"},"text":"regularized follow-the-leader ","element":"span"},{"text":"algorithm [","element":"span"},{"href":"#id-28","referenceIndex":46,"text":"46","element":"a"},{"text":", Section 5.3.1]. In particular, using Ref. [","element":"span"},{"href":"#id-28","referenceIndex":46,"text":"46","element":"a"},{"text":", Lemma 5.5], it follows that the projection step for the lazy version of Algorithm ","element":"span"},{"href":"#id-235","text":"6 ","element":"a"},{"text":"is given by","element":"span"}],[{"id":"id-239","style":{"width":"92%"},"width":1727,"height":206,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/82-0.png","element":"img"}],[{"text":"where the function ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"is defined in (","element":"span"},{"href":"#id-236","text":"D.16","element":"a"},{"text":"). 
Indeed, the gradient of the objective function on the right-hand side is equal to","element":"span"}],[{"id":"id-237","style":{"width":"82%"},"width":1540,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/82-1.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":17.5},"width":88.8,"height":43.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/82-2.png","element":"img","alt":" PA,B","inline":true,"padRight":true},{"text":"is an arbitrary positive semi-definite operator. At the same time, let us observe that the gradient step of the lazy version of Algorithm ","element":"span"},{"href":"#id-235","text":"6 ","element":"a"},{"text":"is given by","element":"span"}],[{"style":{"width":"73%"},"width":1380,"height":523,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/82-3.png","element":"img"}],[{"text":"where the last equality holds because ","element":"span"},{"style":{"height":26.85},"width":972.79,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/82-4.png","element":"img","alt":" W (1)A,B = 1A,B and log(1A,B) = 0. 
This implies that","inline":true}],[{"style":{"width":"76%"},"width":1435,"height":370,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/82-5.png","element":"img"}],[{"text":"Therefore,","element":"span"}],[{"style":{"width":"88%"},"width":1657,"height":353,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/82-6.png","element":"img"}],[{"style":{"width":"60%"},"width":1133,"height":121,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-0.png","element":"img"}],[{"text":"which implies that","element":"span"}],[{"id":"id-238","style":{"width":"74%"},"width":1392,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-1.png","element":"img"}],[{"text":"Combining (","element":"span"},{"href":"#id-237","text":"D.24","element":"a"},{"text":") and (","element":"span"},{"href":"#id-238","text":"D.28","element":"a"},{"text":"), and using the fact that the function ","element":"span"},{"style":{"fontStyle":"italic"},"text":"F ","element":"span"},{"text":"is strictly convex, we can conclude that (","element":"span"},{"href":"#id-239","text":"D.23","element":"a"},{"text":") holds. The desired regret bound then follows by Ref. [","element":"span"},{"href":"#id-9","referenceIndex":20,"text":"20","element":"a"},{"text":", Theorem 3], which considers the regularized follow-the-leader algorithm for quantum states. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-2.png","element":"img","alt":"■","inline":true}],[{"id":"id-243","style":{"fontWeight":"bold"},"text":"Proposition 63 ","element":"span"},{"text":"(Regret bound for agile mirror descent for Choi states)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":13.1},"width":85.71,"height":32.76,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-3.png","element":"img","alt":" σA,B","inline":true,"padRight":true},{"text":"be an arbitrary Choi state. Let ","element":"span"},{"style":{"height":12.8},"width":128.24,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-4.png","element":"img","alt":" T ∈ N","inline":true,"padRight":true},{"text":"be the number of rounds of interaction, and consider a sequence ","element":"span"},{"style":{"height":26.85},"width":394.02,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-5.png","element":"img","alt":"L(1)A,B, L(2)A,B, . . . , L(T)A,B ","inline":true,"padRight":true},{"text":"of cost matrices along with the updates ","element":"span"},{"style":{"height":26.85},"width":83.34,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-6.png","element":"img","alt":" ρ(t)A,B ","inline":true,"padRight":true},{"text":"provided by the agile version of ","element":"span"},{"text":"Algorithm ","element":"span"},{"href":"#id-235","text":"6","element":"a"},{"text":". Then,","element":"span"}],[{"id":"id-246","style":{"width":"95%"},"width":1787,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-7.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"Similar to the proof of Proposition ","element":"span"},{"href":"#id-139","text":"14 ","element":"a"},{"text":"(see (","element":"span"},{"href":"#id-240","text":"D.7","element":"a"},{"text":"), in particular), the idea of the proof is to bound ","element":"span"},{"style":{"height":26.85},"width":591.23,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-8.png","element":"img","alt":" D(σA,B∥ρ(t)A,B) − D(σA,B∥ρ(t+1)A,B","inline":true,"padRight":true},{"text":") for every time step ","element":"span"},{"style":{"height":17.6},"width":304.77,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-9.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":". To this end, we start ","element":"span"},{"text":"by noting that from the gradient step of the agile version of Algorithm ","element":"span"},{"href":"#id-235","text":"6","element":"a"},{"text":", it holds that","element":"span"}],[{"style":{"width":"67%"},"width":1264,"height":98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-10.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":303.6,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-11.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":". 
Using this, and with straightforward manipulations, we obtain the following:","element":"span"}],[{"text":"Tr[","element":"span"},{"style":{"height":26.85},"width":1743.46,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-12.png","element":"img","alt":"L(t)A,Bρ(t)A,B] − Tr[L(t)A,BσA,B] = Tr[L(t)A,B(ρ(t)A,B − σA,B)] (D.31)","inline":true},{"style":{"height":39.38},"width":877.17,"height":98.45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-13.png","element":"img","alt":"= (1/η) Tr[(log ρ(t)A,B − log W (t+1)A,B)(ρ(t)A,B − σA,B)]","inline":true}],[{"style":{"width":"61%"},"width":1144,"height":433,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-14.png","element":"img"}],[{"text":"where for the inequality we have used (","element":"span"},{"href":"#id-241","text":"D.19","element":"a"},{"text":"), and we also made use of the fact that","element":"span"}],[{"style":{"width":"73%"},"width":1374,"height":68,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-15.png","element":"img"}],[{"text":"which means that ","element":"span"},{"style":{"height":26.85},"width":675.16,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-16.png","element":"img","alt":" D(σA,B∥ω(t+1)A,B ) = D(σA,B∥W (t+1)A,B","inline":true,"padRight":true},{"text":") + log(Tr[","element":"span"},{"style":{"height":26.85},"width":128.94,"height":67.14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-17.png","element":"img","alt":"W (t+1)A,B","inline":true,"padRight":true},{"text":"]). ","element":"span"},{"text":"Now, let us bound ","element":"span"},{"style":{"height":26.85},"width":272.98,"height":67.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-18.png","element":"img","alt":"D(ρ(t)A,B∥ω(t+1)A,B ","inline":true,"padRight":true},{"text":"). 
Consider that","element":"span"}],[{"style":{"width":"93%"},"width":1756,"height":67,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/83-19.png","element":"img"}],[{"style":{"width":"82%"},"width":1541,"height":466,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-0.png","element":"img"}],[{"text":"where we have used the Hölder inequality in the final line. Let us now use the fact that (","element":"span"},{"style":{"height":19.54},"width":183.15,"height":48.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-1.png","element":"img","alt":"x − y)^2 ≥","inline":true,"padRight":true},{"text":"0 ","element":"span"},{"style":{"height":21.29},"width":354.33,"height":53.22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-2.png","element":"img","alt":" ⇒ xy ≤ (1/2)x^2 + (1/2)y^2","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":15.6},"width":153.5,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-3.png","element":"img","alt":" x, y ∈ R","inline":true},{"text":". 
Letting ","element":"span"},{"style":{"height":26.85},"width":278.45,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-4.png","element":"img","alt":" x ≡ η∥L(t)A,B∥∞","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":26.85},"width":395.36,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-5.png","element":"img","alt":" y ≡ ∥ρ(t)A,B − ω(t+1)A,B ∥1","inline":true},{"text":", and using ","element":"span"},{"text":"the Pinsker inequality (","element":"span"},{"href":"#id-242","text":"D.20","element":"a"},{"text":"), we obtain","element":"span"}],[{"style":{"width":"85%"},"width":1596,"height":209,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-6.png","element":"img"}],[{"text":"which implies that","element":"span"}],[{"style":{"width":"65%"},"width":1229,"height":93,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-7.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":304.47,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-8.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":". Altogether, we have","element":"span"}],[{"style":{"width":"95%"},"width":1783,"height":99,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-9.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":17.6},"width":304.47,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-10.png","element":"img","alt":" t ∈ {1, 2, . . . , T}","inline":true},{"text":". 
Summing over all ","element":"span"},{"style":{"fontStyle":"italic"},"text":"t","element":"span"},{"text":", we obtain","element":"span"}],[{"style":{"width":"81%"},"width":1518,"height":411,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-11.png","element":"img"}],[{"text":"where the second inequality is due to the fact that ","element":"span"},{"style":{"height":17.6},"width":189.44,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-12.png","element":"img","alt":" D(ρ∥σ) ≥","inline":true,"padRight":true},{"text":"0 for all density operators ","element":"span"},{"style":{"height":11.2},"width":23,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-13.png","element":"img","alt":" ρ","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":7.6},"width":25,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-14.png","element":"img","alt":" σ","inline":true},{"text":". After substituting ","element":"span"},{"style":{"height":26.96},"width":325.52,"height":67.4,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-15.png","element":"img","alt":" ρ(1)A,B = (1/(dA dB)) 1A,B","inline":true},{"text":", we obtain the desired result. ","element":"span"},{"style":{"height":0},"width":22,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-16.png","element":"img","alt":"■","inline":true}],[{"id":"id-159","style":{"fontWeight":"bold"},"text":"Remark 64 ","element":"span"},{"text":"(Extending Proposition ","element":"span"},{"href":"#id-243","text":"63 ","element":"a"},{"text":"to multi-time processes)","element":"span"},{"style":{"fontWeight":"bold"},"text":". 
","element":"span"},{"text":"The key elements of the proof of Proposition ","element":"span"},{"href":"#id-243","text":"63 ","element":"a"},{"text":"are the fact that we project onto a convex set in (","element":"span"},{"href":"#id-232","text":"D.14","element":"a"},{"text":"), such that the inequality in (","element":"span"},{"href":"#id-241","text":"D.19","element":"a"},{"text":") holds, and the Pinsker inequality in (","element":"span"},{"href":"#id-242","text":"D.20","element":"a"},{"text":"). ","element":"span"},{"text":"Consequently, it is straightforward to generalize Algorithm ","element":"span"},{"href":"#id-235","text":"6","element":"a"},{"text":", and thus Proposition ","element":"span"},{"href":"#id-243","text":"63","element":"a"},{"text":", to the Choi states of multi-time processes. In particular, letting ","element":"span"},{"style":{"height":17.2},"width":781.62,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-17.png","element":"img","alt":" COMBr ≡ COMBr(A1, . . . , Ar; B1, . . . , Br","inline":true},{"text":") be the set of multi-time processes with ","element":"span"},{"style":{"height":17.6},"width":259.18,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-18.png","element":"img","alt":"r ∈ {1, 2, . . . 
}","inline":true,"padRight":true},{"text":"time steps, as given in Definition ","element":"span"},{"href":"#id-212","text":"51","element":"a"},{"text":", we define the relative entropy projection onto this set as","element":"span"}],[{"id":"id-245","style":{"width":"74%"},"width":1405,"height":82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/84-19.png","element":"img"}],[{"text":"for all ","element":"span"},{"style":{"height":10.4},"width":67.65,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-0.png","element":"img","alt":" σ ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":26.85},"width":217.46,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-1.png","element":"img","alt":"H(r)A,B), σ ≥","inline":true,"padRight":true},{"text":"0, where ","element":"span"},{"style":{"height":17.84},"width":281.07,"height":44.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-2.png","element":"img","alt":" dA ≡ ∏_{k=1}^{r} d_{A_k}","inline":true},{"text":". Then, by replacing step 6 in Algorithm ","element":"span"},{"href":"#id-235","text":"6 ","element":"a"},{"text":"with ","element":"span"},{"text":"the projection in (","element":"span"},{"href":"#id-245","text":"D.38","element":"a"},{"text":"), we obtain a mirror descent algorithm for Choi states of multi-time quantum processes. Then, the analogue of Proposition ","element":"span"},{"href":"#id-243","text":"63 ","element":"a"},{"text":"is as follows. 
If ","element":"span"},{"style":{"height":23.43},"width":645.14,"height":58.58,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-3.png","element":"img","alt":" σ = (1/dA) Q, with Q ∈ COMBr, is an","inline":true,"padRight":true},{"text":"arbitrary Choi state of a multi-time process with ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"steps, ","element":"span"},{"style":{"height":19.13},"width":391.75,"height":47.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-4.png","element":"img","alt":" L(1), L(2), . . . , L(T) ∈","inline":true,"padRight":true},{"text":"L(","element":"span"},{"style":{"height":26.85},"width":97.63,"height":67.13,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-5.png","element":"img","alt":"H(r)A,B","inline":true},{"text":") are cost ","element":"span"},{"text":"matrices satisfying ","element":"span"},{"style":{"height":22.44},"width":456.24,"height":56.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-6.png","element":"img","alt":" −1dAdB ≤ L(t) ≤ 1dAdB","inline":true},{"text":", with ","element":"span"},{"style":{"height":17.84},"width":290.88,"height":44.59,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-7.png","element":"img","alt":" dB ≡ ∏_{k=1}^{r} d_{B_k}","inline":true},{"text":", and ","element":"span"},{"style":{"height":19.53},"width":325.2,"height":48.83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-8.png","element":"img","alt":" ρ(1), ρ(2), . . . 
, ρ(T)","inline":true,"padRight":true},{"text":"are the ","element":"span"},{"text":"projected Choi state updates resulting from the algorithm, then","element":"span"}],[{"style":{"width":"86%"},"width":1621,"height":122,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-9.png","element":"img"}],[{"text":"which is directly analogous to (","element":"span"},{"href":"#id-246","text":"D.29","element":"a"},{"text":"). ","element":"span"},{"style":{"height":10.4},"width":34,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/2406.04250/images/85-10.png","element":"img","alt":" ◀","inline":true}]]}],"_version":"3.3.4"},"paperNode":"$28:props:children:props:children:0:props:product"}]]]}]}]