First, some basics of your implementation:
- Type-hint your arguments; I had to read and investigate a bunch to deduce that `tau` is a positive integer
- `(y-x) <= tau` does not need parens
- Save `math.sqrt(n1 * n2)` to a variable instead of repeating it
## Full matrix implementation
This is an `n**2` operation, which is on the order of 64,000,000 elements if your arrays are 8,000 elements long. That's beyond the point where I would have given up and moved to C and something like BLAS. If you actually care about performance, now is the time.
Your solution scales with event density, getting slower as density increases. There is an alternative whose runtime is independent of density:
- Don't call `np.where`
- Don't use set intersection; use vectorized comparison
- Recognize that you can represent your `c_ij` summation as the sum over a partial triangular matrix (`ij` upper, `ji` lower), stopping at the diagonal set by `tau`
- The matrix itself is formed by broadcasting the two event vectors, and receives a mask for the upper and lower halves
- The initial values for `c_**` are the halved trace (diagonal sum) of this broadcast matrix
- For a given `tau` and vector length, you could save time by pre-calculating the masks (not shown in the reference code). This is very important and can reduce the execution time of this method by about two thirds. Many kinds of datasets in the wild have constant dimensions and can benefit from this.
At around 85% ones-density, this method has performance equivalent to the original (for the test vector length of 2000) if you don't pre-calculate the masks.
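The mask pre-calculation is not shown in the reference code below; a minimal sketch, assuming the vector length and `tau` are constant across calls (`triangular_masks` is a hypothetical helper, not part of the reference code):

```python
from functools import lru_cache

import numpy as np


@lru_cache(maxsize=None)
def triangular_masks(N: int, tau: int) -> tuple[np.ndarray, np.ndarray]:
    # True = masked out. The unmasked cells of `upper` are exactly the
    # diagonals k = 1..tau; symmetrically for `lower`.
    all_true = np.ones((N, N), dtype=bool)
    upper = np.tril(all_true) | np.triu(all_true, k=1 + tau)
    lower = np.triu(all_true) | np.tril(all_true, k=-1 - tau)
    return upper, lower
```

`bcast_eps` would then fetch the cached pair with `triangular_masks(N, tau)` instead of rebuilding both masks on every call. (The cached arrays are shared and must not be mutated by callers.)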
## Sparse approach
A library that supports sparse matrices should do better than the above; SciPy has them. Things to note about this approach:
- The input data need to be inverted: 1 for events, 0 everywhere else
- Be careful not to call `np.*` methods; exclusively use sparse methods
- Rather than forming the triangular bands and then summing at the end, it's faster to perform a difference-of-sums using `tau`
- This scales with density, unlike the full-matrix approach, and uses less memory
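The difference-of-sums can be seen on a toy matrix (illustrative values only): the band `1 <= j - i <= tau` is summed as one `triu` sum minus another, rather than by extracting each diagonal.

```python
import numpy as np
import scipy.sparse

tau = 1
matches = scipy.sparse.csr_matrix(np.array([
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]))
# Sum of the band 1 <= j - i <= tau, without materialising each diagonal:
upper = scipy.sparse.triu(matches, k=1).sum() - scipy.sparse.triu(matches, k=1 + tau).sum()
print(upper)  # 2
```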
## Sorted logarithmic traversal
Riffing on @vpn's idea a little:
- This doesn't need to be `n**2` if you tell NumPy to correctly apply a binary search
- The binary search can be vectorized via `searchsorted`
- Rather than your Python-native set intersection, you can use `intersect1d`
For very high ones-densities this method remains the fastest, but it scales worse than the full-matrix trace-only method at lower ones-densities (higher event densities).
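A small demonstration of the two NumPy calls this method relies on: `searchsorted` vectorizes the binary search, and `intersect1d` replaces the Python-level set intersection.

```python
import numpy as np

event_index1 = np.array([2, 5, 9])
event_index2 = np.array([4, 9])

# Insertion points of event_index2's values into the sorted event_index1
insertions = np.searchsorted(a=event_index1, v=event_index2)
print(insertions)  # [1 2]

# Indices at which both series have an event
common = np.intersect1d(event_index1, event_index2, assume_unique=True)
print(common)      # [9]
```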
## Full or sparse matrix, traces only
(Yet another) approach:

- works for either a sparse or a full matrix
- uses no masks
- relies only on diagonal trace sums, with no calls to the triangular methods
- has a runtime independent of event density
Particularly for the full matrix, this method is very fast: always faster than the sparse, masked-broadcast, and original methods; and faster than the logarithmic traversal for densities below ~70%.
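The trace-only variant leans entirely on NumPy's ability to sum an arbitrary diagonal; a minimal illustration on a toy match matrix:

```python
import numpy as np

matches = np.array([
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
])
# np.trace(m, k) sums the k-th diagonal: m[i, i + k] over all valid i
print(np.trace(matches))      # 3: main diagonal, the shared-event count
print(np.trace(matches, 1))   # 2: contributions at lag +1
print(np.trace(matches, -1))  # 0: contributions at lag -1
```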
## Cumulative sum
For completeness, the reference source below includes @KellyBundy's cumulative-sum approach, which is competitive at all but very high densities.
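The core of that trick is a windowed event count built from a padded cumulative sum; a small worked example (using the inverted 1-for-event convention):

```python
import numpy as np

tau = 2
events = np.array([1, 0, 1, 1, 0, 1])  # 1 marks an event

# Edge-padded cumulative sum, so positions near the end stay in range
cs = np.pad(np.cumsum(events), (0, tau), 'edge')

# For each position i: number of events strictly after i but within tau steps
within_tau = cs[tau:] - cs[:-tau]
print(within_tau)  # [1 2 1 1 1 0]
```

In `cumsum_eps`, this windowed count is multiplied elementwise by the other series' event indicator and summed, giving the `c_ij` contribution in one vectorized pass.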
## Output
All method durations (ms)
  d%    trace   sorted   cumsum      old    bcast   sparse sparsetr
50.0     0.70     4.12     0.21   491.82    41.73    80.62    21.62 
55.0     0.64     3.27     0.18   411.49    40.09    65.46    18.25 
60.0     0.58     2.69     0.18   311.74    40.69    50.05    14.60 
65.0     0.58     2.27     0.23   250.12    41.50    35.52    13.45 
70.0     1.03     1.84     0.18   182.37    41.84    24.03    10.20 
75.0     1.34     1.40     0.18   113.49    42.36    16.37     7.31 
80.0     0.61     1.06     0.18    77.04    40.99    11.78     6.11 
85.0     0.81     0.69     0.17    44.68    52.77    10.30     5.95 
90.0     1.56     0.56     0.22    24.50    46.26     6.32     3.94 
95.0     1.50     0.26     0.18     6.27    40.60     4.80     3.48 
96.0     1.77     0.21     0.18     3.71    41.01     4.25     3.31 
96.5     1.79     0.17     0.18     1.89    40.69     4.16     3.24 
97.0     1.79     0.16     0.18     1.75    43.89     4.69     3.30 
97.5     1.87     0.14     0.18     1.16    43.34     4.31     3.22 
98.0     2.10     0.15     0.19     1.22    41.16     4.10     3.33 
98.5     1.75     0.10     0.18     0.42    40.75     3.94     3.15 
99.0     1.73     0.09     0.22     0.16    40.71     3.97     3.11 
99.5     1.78     0.08     0.18     0.09    41.29     4.09     3.42 
Fast method durations (us)
  d%    trace   sorted   cumsum
50.0      604     3993      124 
55.0      556     3504      124 
60.0      577     2895      125 
65.0      569     2440      130 
70.0      556     2041      126 
75.0      574     1372      125 
80.0      556     1078      123 
85.0      562      686      126 
90.0      563      477      128 
95.0      561      238      126 
96.0      558      176      126 
96.5      569      139      124 
97.0      557      137      131 
97.5      568      104      124 
98.0      601      116      128 
98.5      565       61      128 
99.0      567       51      129 
99.5      566       43      127 
## Reference source
import math
from numbers import Real
from timeit import timeit
from typing import Tuple
import numpy as np
from itertools import product
import scipy.sparse
from scipy.sparse import spmatrix
def old_eps(ts_1, ts_2, tau):
    Q_tau = 0
    q_tau = 0
    event_index1, = np.where(np.array(ts_1) == 0)
    n1 = event_index1.shape[0]
    event_index2, = np.where(np.array(ts_2) == 0)
    n2 = event_index2.shape[0]
    if (n1 != 0 and n2 != 0):
        matching_idx = set(event_index1).intersection(event_index2)
        c_ij = c_ji = 0.5 *len(matching_idx)
        for x,y in product(event_index1,event_index2):
            if x-y > 0 and (x-y)<= tau:
                c_ij += 1
            elif y-x > 0 and (y-x) <= tau:
                c_ji += 1
        Q_tau = (c_ij+c_ji)/math.sqrt( n1 * n2 )
        q_tau = (c_ij - c_ji)/math.sqrt( n1 * n2 )
    return Q_tau, q_tau
def bcast_eps(ts_1: np.ndarray, ts_2: np.ndarray, tau: int) -> Tuple[Real, Real]:
    if ts_1.shape != ts_2.shape or len(ts_1.shape) != 1:
        raise ValueError('Vectors must be flat and of the same length')
    N, = ts_1.shape
    events_1 = ts_1 == 0
    events_2 = ts_2 == 0
    n1 = np.sum(events_1)
    n2 = np.sum(events_2)
    if n1 == 0 or n2 == 0:
        return 0, 0
    all_true = np.ones((N, N), dtype=bool)
    upper_mask = np.logical_or(np.tril(all_true), np.triu(all_true, k=+1+tau))
    lower_mask = np.logical_or(np.triu(all_true), np.tril(all_true, k=-1-tau))
    matches = np.logical_and(events_1[np.newaxis, :], events_2[:, np.newaxis])
    matches_u = np.ma.array(matches, mask=upper_mask)
    matches_l = np.ma.array(matches, mask=lower_mask)
    n_matches = np.trace(matches)
    c_ij = c_ji = n_matches / 2
    c_ij += np.sum(matches_u)
    c_ji += np.sum(matches_l)
    den = math.sqrt(n1 * n2)
    Q_tau = (c_ij + c_ji) / den
    q_tau = (c_ij - c_ji) / den
    return Q_tau, q_tau
def trace_eps(ts_1: np.ndarray, ts_2: np.ndarray, tau: int) -> Tuple[Real, Real]:
    if ts_1.shape != ts_2.shape or len(ts_1.shape) != 1:
        raise ValueError('Vectors must be flat and of the same length')
    events_1 = ts_1 == 0
    events_2 = ts_2 == 0
    n1 = np.sum(events_1)
    n2 = np.sum(events_2)
    if n1 == 0 or n2 == 0:
        return 0, 0
    matches = np.logical_and(events_1[np.newaxis, :], events_2[:, np.newaxis])
    n_matches = np.trace(matches)
    c_ij = c_ji = n_matches / 2
    for k in range(1, tau+1):
        c_ij += np.trace(matches, k)
        c_ji += np.trace(matches, -k)
    den = math.sqrt(n1 * n2)
    Q_tau = (c_ij + c_ji) / den
    q_tau = (c_ij - c_ji) / den
    return Q_tau, q_tau
def sorted_eps(ts_1: np.ndarray, ts_2: np.ndarray, tau: int) -> Tuple[Real, Real]:
    if ts_1.shape != ts_2.shape or len(ts_1.shape) != 1:
        raise ValueError('Vectors must be flat and of the same length')
    event_index1, = np.where(np.array(ts_1) == 0)
    event_index2, = np.where(np.array(ts_2) == 0)
    n1, = event_index1.shape
    n2, = event_index2.shape
    if n1 == 0 or n2 == 0:
        return 0, 0
    n_matches, = np.intersect1d(
        event_index1, event_index2, assume_unique=True,
    ).shape
    c_ij = c_ji = n_matches/2
    insertions = np.searchsorted(
        a=event_index1,  # array to pretend insertion into
        v=event_index2,  # values to insert
    )
    for insertion, y in zip(insertions, event_index2):
        i1 = insertion
        if i1 < n1:
            if y == event_index1[i1]:
                i1 += 1
            for x in event_index1[i1:]:
                if x - y > tau:
                    break
                c_ij += 1
        i2 = insertion - 1
        if i2 >= 0:
            for x in event_index1[i2::-1]:
                if y - x > tau:
                    break
                c_ji += 1
    den = math.sqrt(n1 * n2)
    Q_tau = (c_ij + c_ji) / den
    q_tau = (c_ij - c_ji) / den
    return Q_tau, q_tau
def sparse_eps(ts_1: spmatrix, ts_2: spmatrix, tau: int) -> Tuple[Real, Real]:
    if ts_1.shape != ts_2.shape or len(ts_1.shape) != 2 or ts_1.shape[0] != 1:
        raise ValueError('Vectors must be flat and of the same length')
    n1 = ts_1.sum()
    n2 = ts_2.sum()
    if n1 == 0 or n2 == 0:
        return 0, 0
    matches = ts_2.T * ts_1
    matches_u = scipy.sparse.triu(matches, +1).sum() - scipy.sparse.triu(matches, k=+1+tau).sum()
    matches_l = scipy.sparse.tril(matches, -1).sum() - scipy.sparse.tril(matches, k=-1-tau).sum()
    n_matches = matches.diagonal().sum()
    c_ij = c_ji = n_matches / 2
    c_ij += matches_u
    c_ji += matches_l
    den = math.sqrt(n1 * n2)
    Q_tau = (c_ij + c_ji) / den
    q_tau = (c_ij - c_ji) / den
    return Q_tau, q_tau
def sparsetr_eps(ts_1: spmatrix, ts_2: spmatrix, tau: int) -> Tuple[Real, Real]:
    if ts_1.shape != ts_2.shape or len(ts_1.shape) != 2 or ts_1.shape[0] != 1:
        raise ValueError('Vectors must be flat and of the same length')
    n1 = ts_1.sum()
    n2 = ts_2.sum()
    if n1 == 0 or n2 == 0:
        return 0, 0
    matches = ts_2.T * ts_1
    n_matches = matches.diagonal().sum()
    c_ij = c_ji = n_matches / 2
    for k in range(1, tau+1):
        c_ij += matches.diagonal(+k).sum()
        c_ji += matches.diagonal(-k).sum()
    den = math.sqrt(n1 * n2)
    Q_tau = (c_ij + c_ji) / den
    q_tau = (c_ij - c_ji) / den
    return Q_tau, q_tau
def cumsum_eps(ts_1: np.ndarray, ts_2: np.ndarray, tau: int) -> Tuple[Real, Real]:
    ts_1 = 1 - ts_1
    ts_2 = 1 - ts_2
    n1 = ts_1.sum()
    n2 = ts_2.sum()
    if n1 == 0 or n2 == 0:
        return 0, 0
    cs1 = np.pad(ts_1.cumsum(), (0, tau), 'edge')
    cs2 = np.pad(ts_2.cumsum(), (0, tau), 'edge')
    c_ij = c_ji = 0.5 * (ts_1 * ts_2).sum()
    c_ij += ((cs1[tau:] - cs1[:-tau]) * ts_2).sum()
    c_ji += ((cs2[tau:] - cs2[:-tau]) * ts_1).sum()
    sqrt = math.sqrt(n1 * n2)
    Q_tau = (c_ij + c_ji) / sqrt
    q_tau = (c_ij - c_ji) / sqrt
    return Q_tau, q_tau
def test() -> None:
    shape = 2, 2000
    full_ts = np.ones(shape, dtype=np.uint8)
    sparse_ts = scipy.sparse.lil_matrix(shape, dtype=np.uint8)
    density = 0.9
    rand = np.random.default_rng(seed=0).random
    events = rand(shape) > density
    full_ts[events] = 0
    sparse_ts[events] = 1
    # Add some interesting values to test boundary conditions: on the diagonal
    full_ts[:, 5] = 0
    sparse_ts[:, 5] = 1
    Q_exp = 1.9446638724075895
    q_exp = 0.026566446344365977
    methods = (
        (old_eps, full_ts),
        (bcast_eps, full_ts),
        (trace_eps, full_ts),
        (sorted_eps, full_ts),
        (sparse_eps, sparse_ts),
        (sparsetr_eps, sparse_ts),
        (cumsum_eps, full_ts),
    )
    for eps, ts in methods:
        Q_tau, q_tau = eps(*ts, tau=10)
        assert math.isclose(Q_tau, Q_exp, abs_tol=1e-12, rel_tol=0)
        assert math.isclose(q_tau, q_exp, abs_tol=1e-12, rel_tol=0)
def compare(fast: bool) -> None:
    shape = 2, 2000
    rand = np.random.default_rng(seed=0).random
    n = 400 if fast else 1
    methods = (
        trace_eps,
        sorted_eps,
        cumsum_eps,
    )
    if fast:
        print('Fast method durations (us)')
        factor = 1e6
        fmt = '{:8.0f}'
    else:
        print('All method durations (ms)')
        factor = 1e3
        fmt = '{:8.2f}'
        methods += (
            old_eps,
            bcast_eps,
            sparse_eps,
            sparsetr_eps,
        )
    print('  d%', ' '.join(
        f'{method.__name__.removesuffix("_eps"):>8}'
        for method in methods
    ))
    densities = np.hstack((
        np.linspace(0.50, 0.95, 10),
        np.linspace(0.960, 0.995, 8),
    ))
    for density in densities:
        print(f'{100*density:4.1f}', end=' ')
        full_ts = np.ones(shape, dtype=np.uint8)
        sparse_ts = scipy.sparse.lil_matrix(shape, dtype=np.uint8)
        events = rand(shape) > density
        full_ts[events] = 0
        sparse_ts[events] = 1
        inputs = (
            full_ts,
            full_ts,
            full_ts,
        )
        if not fast:
            inputs += (
                full_ts,
                full_ts,
                sparse_ts,
                sparse_ts,
            )
        for eps, ts in zip(methods, inputs):
            t = timeit(lambda: eps(*ts, tau=10), number=n)
            print(fmt.format(t/n*factor), end=' ')
        print()
    print()
if __name__ == '__main__':
    test()
    compare(False)
    compare(True)