-
Notifications
You must be signed in to change notification settings - Fork 0
/
Proofs.tex
134 lines (100 loc) · 8.42 KB
/
Proofs.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
\documentclass{article}
\usepackage{fullpage}
\usepackage{amsthm, amsmath}
\usepackage{url}
\newtheorem{claim}{Claim}
\newtheorem{definition}{Definition}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem{corollary}{Corollary}
\newtheorem{observation}{Observation}
\newtheorem{question}{Problem}
\title{Improving 3SUM Time-Space Tradeoffs Proofs}
\author{Sam McCauley and Peter Zhao}
\date{}
\begin{document}
\maketitle
\section{Fiat-Naor Proofs}
\label{sec:fiat-naor}
\begin{lemma}
\label{3SUM-Indexing}
For any $0 < \delta < 1$, there exists an algorithm that solves the 3SUM-Indexing problem whose space usage is $O(n^{2 - \delta/3})$ and the cost of a query is $O(n^{\delta})$ time. This satisfies $T$ and $S$ such that $TS^3 = N^3$.
\end{lemma}
\begin{proof}
The algorithm consists of two parts - a carefully constructed function $f:[n]^2 \rightarrow [n]^2$ and a data structure for inverting $f$. Let $A+A$ be the sumset of a set of integers. The function $f$ is as follows:
Let $g: [n]^2 \rightarrow U$ be defined as $g(i_1, i_2) = a_{1,i_1} + a_{2,i_2}$. Define a function $h: U \rightarrow [n]^2$ that maps elements from $U$ to $2$ indices, $i'_1,i'_2$. $f$ is thus defined as $f(i_1,i_2) = h(g(i_1,i_2))$.
To invert $f$, we are given $c \in U$ during query time, and if there exists indices $i_1,i_2 \in A+A$ such that $a_{1,i_j} + a_{2,i_2} = c$, then
$$f(i_1,i_2) = h(g(i_1,i_2)) = h(a_{1,i_j} + a_{2,i_2})= h(c)$$
If an answer exists to a query $c$, then there exists $i_1,i_2 \in [n]$ such that $(i_1,i_2) \in f^{-1}(h(c))$
The Fiat-Naor inversion scheme gives us a tradeoff for inverting general functions using $O(S)$ space and $O(T)$ time. In our situation, $N=n^2$, so if we want our query time to be $T=O(n^\delta)$ for $0 < \delta < 1$, then we can choose to use space $S = O(n^{2-\delta/3})$.
\end{proof}
\begin{theorem}
\label{Fiat-Naor}
For any function $f:D \rightarrow D$ where $|D| = N$ and for any values $S$ and $T$ satisfying $TS^3 = N^3$, there exists an algorithm for inverting $f$ using using a deterministic data structure of space $O(S)$ and using time $O(T)$ with constant probability.
\end{theorem}
\begin{proof}
The Fiat-Naor inversion scheme provides us with a general space-time tradeoff for inverting functions: $TS^3 = N^3$. We can use this to invert the given function on the tradeoff. We use this theorem as a black-box when solving the problem, as it takes care of function inversion completely.
\end{proof}
\begin{lemma}
\label{Random Instances}
For any $0 < \delta < 1$, there exists an algorithm that solves the 3SUM-Indexing problem on a uniformly random instance whose space usage is $O(n^{2 - \delta/2})$ and the cost of a query is $O(n^{\delta})$ time
\end{lemma}
\begin{proof}
The Fiat-Naor inversion scheme gives us two tradeoffs: $TS^3 = N^3$ and $TS^2 = N^3q(f)$, where $q(f)$ is the probability that two randomly chosen elements from the domain have the same image. Note that whenever $1/S < q(f)$, then the first tradeoff is better than the second, as the probability of collision can only be reduced if we use greater than linear space, which is not optimal. One can also interpret the tradeoff as using $S$ space to reduce the probability of collision, which replaces $q(f)$ with $1/S$, reducing the first tradeoff as a special case of the second tradeoff. Thus, on random inputs, we know that the probability of collision is only $1/N$ where $N=n^2$, which is the probability of selecting an element in a uniform distribution. Thus, this gives us $TS^2 = N^3(1/N) = TS^2 = N^2$.
\end{proof}
\begin{lemma}
\label{Min and Max q(f) values}
Consider 3SUM-Indexing and let $|A| = n$. The smallest $q(f)$ can take is $O(1/n^2)$, and the largest $q(f)$ can take is $O(1/n)$
\end{lemma}
\begin{proof}
First, we define $q(f)$. Let $f:D \rightarrow D$ be the function that we are trying to invert, and let $I(y)$ be the number of preimages for $y$ under $f$, where $I(y) = |\{x \in D|f(x) = y\}|$. If we imagine the function as a bipartite graph mapping elements from $D$ to $D$, then $I(y)$ represents the indegree of $y$. $q(f)$ is then defined as the probability two randomly chosen elements from $D$ have the same image. More concretely, it is
$$q(f) = \frac{\sum_{i=0}^N I(i)^2}{N^2}$$
Our function $f$ maps elements in the domain to their sums, or mapping the sumset to sums. $q(f)$ in our case represents that number of elements in the sumset that collide in their sums, or counting the indegree of the sums.
The smallest $q(f)$ occurs when there are no collisions of sums, or when the function $f: A \rightarrow S$ has minimum in-degree (1). When all sums have a unique sumset, this is achieved, and there are a total of $O(n^2)$ possible sums. Thus, $q(f) = O(n^2/n^4)= O(1/n^2)$
The largest $q(f)$ occurs when we have the greatest amount of collisions, or when the function $f$ has maximum in-degree. When our sumset is of minimum size, or an arithmetic progression, then this occurs. In this scenario, there are a total of $O(n)$ possible sums. Thus, to determine $q(f)$, we count the number of sums that have a collision. This is exactly the partial sums of the sequence defined by $ceil(n^2/2)$, which is defined here: \url{https://oeis.org/A131941]}. There are many closed formula's of this sequence, but they are all $O(n^3)$. Thus, in the arithmetic sequence sumset case, $q(f) = O(n^3/n^4) = O(1/n)$.
\end{proof}
\section{Sumset Structure Proofs}
\label{sec:sumset}
\begin{lemma}
\label{Arithmetic Progression}
For every finite set $A \subseteq \mathbf{R}$ we have $|A+A| \geq 2|A|-1$, with equality if and only if A is an arithmetic progression of length $n$
\end{lemma}
\begin{proof}
Let $A$ be of size $n$, and write $A$ as an increasing set $\{a_1, a_2, a_3, ..., a_n\}$ such that $a_1 < a_2 < a_3 < ... < a_n$. Then, we have:
\begin{align*}
a_1 + a_1 < a_1 + a_2 < ... < a_1 + a_n < a_2 + a_n < ... < a_n + a_n
\end{align*}
This gives us $2n-1$ distinct elements of the sumset, which shows that the sumset must be at least size $2|A|-1$. Now, we show that if $|A+A| = 2|A|-1$, then it must be an arithmetic progression.
Let $A_i$ be a set of size $i$. If we consider $A_2$, then it is trivial as it will always be an arithmetic progression of two elements, and $|A_2 + A_2| = 3$. Assume this holds true for all $A_i$ up to $A_{n-1}$. Now, suppose we add a new element $a_n$, and assume without loss of generality that it is the largest element in $A_n$. Now, we must have two new elements to the sumset: $a_n + a_n$ and $a_n + a_{n-1}$, which gives us $|A_n + A_n| = 2{n-1}-1 + 2 = 2n-1$. These elements are larger than all other elements in the sumset. Therefore, it must be that $a_n + a_{n-2}$ is in $A_{n-1} + A_{n-1}$, making it the largest element in the sumset, except for perhaps $a_{n-1} + a_{n-1}$. Thus, $a_n + a_{n-2} = a_{n-1} + a_{n-1}$, meaning that $A_{n}$ must be an arithmetic progression.
\end{proof}
\begin{lemma}
\label{Largest sumset}
For every finite set $A \subseteq \mathbf{R}$ we have $|A+A| \leq \frac{|A|(|A| + 1)}{2}$
\end{lemma}
\begin{proof}
Consider the set $A = \{1,2,2^2,2^3,...,2^{n-1}\}$ such that $|A| = n$. If we compute the sumset of this, we notice that no sum is seen twice, because every term is more than twice greater the previous, so each next element constructs an entire set of new sums, meaning it is as large as possible. The size of this set is effectively a combination of the elements with replacement, which is:
\begin{align*}
\binom{n+r-1}{r} = \binom{|A|+2-1}{2} = \binom{|A|+1}{2} = \frac{|A|(|A| + 1)}{2}
\end{align*}
\end{proof}
\section{BSG Proofs}
\label{sec:bsg}
\begin{corollary}
\label{BSG Corollary}
Let $A,B,S$ be finite sets, where $A,B$ are the sets in the 3SUM problem, and $S$ is a given set to determine the solution to 3SUM. Suppose that $|A||B| = O(N^2)$ and $|S| \leq tN$. For any $\alpha < 1$, there exists subsets $A_1,...,A_k \subseteq A$ and $B_1,...,B_k \subseteq B$ such that
\begin{enumerate}
\item the remainder set $R = \{a,b \in A \times B : a+b \in S\} \setminus \bigcup^k_{i=1} (A_i \times B_i)$ has size at most $\alpha N^2$
\item $|A_i + B_i| \leq O((1/\alpha)^5t^3N)$ for each $i=1,...,k$
\item $k = O(1/\alpha)$
\end{enumerate}
\end{corollary}
\section{Other Proofs}
\label{sec:other}
\begin{lemma}
\label{Difficulty of 3SUM-Indexing}
If there exists a good algorithm for solving 3SUM-Indexing on a small sumset, then there is a good algorithm for all cases.
\end{lemma}
\begin{proof}
All instances of 3SUM-Indexing are lists of integers and their sumsets.
\end{proof}
\end{document}