Spoiler alert: If you just want to check parity, you only need to look at the last bit of a number's binary representation. But what if, just for fun, you tried to guess whether a number is odd or even by analyzing wavelet features extracted from its binary signal and clustering those features? Surprisingly, this quirky method achieves almost 70% accuracy — not bad for something completely unnecessary.
Motivation
Parity (odd/even) is one of the simplest integer properties; it can be computed directly with:
y = n % 2
But what if we approached this simple problem using advanced signal processing techniques, just for the sake of exploration and learning?
Idea: Convert numbers into binary bit signals, extract features at multiple resolutions using wavelets, cluster those features using unsupervised methods, and try to infer parity from the cluster structure.
This is a fun way to explore signal representation, unsupervised learning, feature engineering, and multi-scale analysis.
Step 1: Binary Signal Representation
Each number n is converted into a binary signal:
x = (x₁, x₂, ..., x_L), where xᵢ ∈ {0,1}
The signal is zero-padded to the nearest power-of-two length to make it compatible with wavelet decomposition.
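A minimal sketch of this conversion (the same helper appears in the full listing at the end):

import numpy as np

def number_to_bit_signal(num):
    bit_str = bin(num)[2:]                        # MSB-first bits, e.g. 6 -> '110'
    bit_arr = np.array([int(b) for b in bit_str])
    # zero-pad at the end up to the next power-of-two length
    pad_len = 2**int(np.ceil(np.log2(len(bit_arr)))) - len(bit_arr)
    return np.pad(bit_arr, (0, pad_len), 'constant')

print(number_to_bit_signal(6))    # [1 1 0 0]
print(number_to_bit_signal(13))   # [1 1 0 1]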
Step 2: Wavelet Decomposition
We apply multi-level discrete wavelet transform (DWT) using the Haar wavelet. The result is a set of coefficient vectors, each representing signal information at a different resolution level.
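For example, a two-level Haar decomposition of a 4-bit signal looks like this (pywt returns the coarsest approximation first, then details from coarse to fine):

import pywt

coeffs = pywt.wavedec([1, 1, 0, 1], 'haar', level=2)
print(coeffs)
# [array([1.5]), array([0.5]), array([ 0. , -0.70710678])]
# -> [coarsest approximation, level-2 detail, level-1 detail]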
Step 3: Feature Extraction per Level
At each wavelet level, we extract 3 statistical features from the coefficients:
- Energy: sum of squares of the coefficients
- L2 Norm: Euclidean norm
- Mean Absolute Value: average of absolute values
These features describe the strength and variation of the signal at each resolution.
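As a sketch, the per-level computation boils down to this (level_features is a hypothetical helper; the full code inlines the same statistics):

import numpy as np

def level_features(c):
    c = np.asarray(c, dtype=float)
    energy   = np.sum(c**2)        # total power of the coefficients
    norm     = np.linalg.norm(c)   # Euclidean length, i.e. sqrt(energy)
    mean_abs = np.mean(np.abs(c))  # average coefficient magnitude
    return [energy, norm, mean_abs]

Note that the L2 norm is just the square root of the energy, so those two features always rank numbers identically; a 1-D KMeans split on either will be closely related, though not necessarily the same.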
Step 4: Clustering Each Feature Separately
For each of the 3 features at each wavelet level, we apply 1-D KMeans clustering with 2 clusters. The ground-truth labels (odd/even) are never used during fitting; only after the fact do we look at the fraction of odd numbers each cluster contains.
This lets us interpret each cluster as representing "more likely odd" or "more likely even".
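A hedged sketch of this step on toy data (the array values here are illustrative, not from the full code):

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[0.2], [0.9], [0.25], [0.85]])  # one feature column, shape (n, 1)
labels = np.array([0, 1, 0, 1])               # ground-truth parity, 1 = odd

km = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X)
# match clusters to parity only after fitting, by their odd-number content
odd_fraction = [labels[km.labels_ == c].mean() for c in (0, 1)]
odd_cluster = int(np.argmax(odd_fraction))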
Step 5: Estimating Probability of Oddness
For each number and each feature, we compute the probability of being odd as the fraction of odd numbers in the cluster it belongs to.
This gives us a 3D array of probabilities indexed by (number, level, feature).
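In code, the lookup for one (level, feature) pair is just this (oddness_probs is a hypothetical helper; the full listing does the same thing inline):

import numpy as np

def oddness_probs(clusters, labels):
    # odd-fraction of each cluster, broadcast back to every member
    frac = {c: labels[clusters == c].mean() for c in np.unique(clusters)}
    return np.array([frac[c] for c in clusters])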
Step 6: Weighted Score and Final Prediction
We combine the probabilities into a single score using weighted averaging:
- Each wavelet level is weighted (in the code, weights rise from the coarsest approximation toward the finest detail coefficients).
- Each feature is equally weighted.
The final oddness score Sₙ is computed as:
Sₙ = weighted_average(Pₙ)
Then:
- Predict odd if Sₙ > 0.5
- Predict even otherwise
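A sketch of the combination step, assuming a (numbers, levels, features) probability array P like the one built above:

import numpy as np

def combine(P, level_weights, feature_weights):
    # broadcast the weights across the level and feature axes, then normalize
    w = P * level_weights[None, :, None] * feature_weights[None, None, :]
    return w.sum(axis=(1, 2)) / (level_weights.sum() * feature_weights.sum())

# e.g. 3 levels, weights rising toward the finer detail coefficients:
# scores = combine(P, np.linspace(0.5, 1.5, 3), np.ones(3))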
Results: Surprisingly Good
On numbers from 1 to 999:
- Final Accuracy: 69.67%
This is significantly better than random guessing (50%) and surprisingly high given the simplicity of the true rule (just check the last bit).
Visualization
A scatter plot shows final scores for each number, colored by actual parity:
plt.figure(figsize=(14,6))
plt.scatter(numbers, final_scores, c=['red' if l==1 else 'blue' for l in labels], label='True Odd (Red) / Even (Blue)')
plt.axhline(0.5, color='green', linestyle='--', label='Decision Threshold (0.5)')
plt.title("Wavelet Features + KMeans: Predicted Probability of Being Odd")
plt.xlabel("Number")
plt.ylabel("Score")
plt.legend()
plt.show()
Final Python Code
import numpy as np
import pywt
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
def number_to_bit_signal(num):
    # MSB-first binary digits, e.g. 6 -> '110'
    bit_str = bin(num)[2:]
    bit_arr = np.array([int(b) for b in bit_str])
    # zero-pad at the end to the next power-of-two length
    pad_len = 2**int(np.ceil(np.log2(len(bit_arr)))) - len(bit_arr)
    return np.pad(bit_arr, (0, pad_len), 'constant')
def extract_features(num, wavelet='haar', max_level=3):
    sig = number_to_bit_signal(num)
    # cap the decomposition depth at what the signal length allows
    max_possible_level = pywt.dwt_max_level(len(sig), pywt.Wavelet(wavelet).dec_len)
    level = min(max_level, max_possible_level) if max_possible_level > 0 else 1
    coeffs = pywt.wavedec(sig, wavelet, level=level)
    features = []
    for c in coeffs:
        energy = np.sum(c**2)          # total power at this level
        norm = np.linalg.norm(c)       # L2 norm, i.e. sqrt(energy)
        mean_abs = np.mean(np.abs(c))  # average coefficient magnitude
        features.append([energy, norm, mean_abs])
    return features
numbers = np.arange(1, 1000)
labels = numbers % 2  # 1 = odd, 0 = even

all_features = [extract_features(n) for n in numbers]
max_level = max(len(f) for f in all_features)

# pad every number's feature list to the same number of levels
features_by_level = []
for lvl in range(max_level):
    lvl_feats = []
    for f in all_features:
        if lvl < len(f):
            lvl_feats.append(f[lvl])
        else:
            lvl_feats.append([0, 0, 0])
    features_by_level.append(np.array(lvl_feats))

# cluster each (level, feature) column separately and record, for every
# number, the fraction of odd numbers in the cluster it landed in
probabilities = np.zeros((len(numbers), max_level, 3))
for lvl in range(max_level):
    for feat_idx in range(3):
        X = features_by_level[lvl][:, feat_idx].reshape(-1, 1)
        kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X)
        clusters = kmeans.labels_
        odd_fraction = {c: np.mean(labels[clusters == c]) for c in np.unique(clusters)}
        for i, cl in enumerate(clusters):
            probabilities[i, lvl, feat_idx] = odd_fraction[cl]

# weighted average over levels (rising weights) and features (equal weights)
level_weights = np.linspace(0.5, 1.5, max_level)
feature_weights = np.array([1, 1, 1])
weighted_probs = (probabilities
                  * level_weights.reshape(1, max_level, 1)
                  * feature_weights.reshape(1, 1, 3))
final_scores = np.sum(weighted_probs, axis=(1, 2)) / np.sum(level_weights) / np.sum(feature_weights)

predicted_labels = (final_scores > 0.5).astype(int)
accuracy = np.mean(predicted_labels == labels)
print(f"Final Accuracy: {accuracy*100:.2f}%")
plt.figure(figsize=(14,6))
plt.scatter(numbers, final_scores, c=['red' if l==1 else 'blue' for l in labels], label='True Odd (Red) / Even (Blue)')
plt.axhline(0.5, color='green', linestyle='--', label='Decision Threshold (0.5)')
plt.title("Wavelet Features + KMeans: Predicted Probability of Being Odd")
plt.xlabel("Number")
plt.ylabel("Score")
plt.legend()
plt.show()
What Did We Learn?
- You can classify parity with nearly 70% accuracy using wavelet features and KMeans — which is weirdly impressive.
- Wavelet decompositions can extract usable structure even from short binary signals.
- This is a playful but educational exercise in signal processing and clustering.
Next Steps
- Try other features, e.g. skewness or entropy (a sketch follows below)
- Use supervised models (e.g., logistic regression)
- Apply to less trivial binary classification problems
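For the first item, here is a hedged sketch of two candidate extras per coefficient vector (extra_features is a hypothetical helper, not part of the original pipeline):

import numpy as np
from scipy.stats import skew

def extra_features(c):
    c = np.abs(np.asarray(c, dtype=float))
    s = float(np.nan_to_num(skew(c)))      # skewness; falls back to 0.0 when undefined
    total = c.sum()
    if total == 0:
        return [s, 0.0]                    # all-zero coefficients carry no entropy
    p = c / total                          # normalized magnitude distribution
    p = p[p > 0]
    entropy = float(-np.sum(p * np.log2(p)))  # Shannon entropy in bits
    return [s, entropy]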
TL;DR
A fancy, multi-level wavelet + clustering approach to guess odd/even numbers yields ~70% accuracy. Useless? Absolutely. Fun? Totally.