Part 3: Predicting 'Non-Boring' Menus with LSTM Time Series
Introduction
In Part 2, I built a cosine similarity engine that could find nutritionally equivalent recipe substitutions. It worked well for answering “what else has the same nutrition?” but it had a fundamental blind spot: it had no memory.
Cosine similarity does not know what you ate yesterday. It might suggest grilled chicken on Monday, Tuesday, and Wednesday — nutritionally sound, but nobody wants to eat the same thing three nights in a row.
The real challenge of daily meal planning is temporal variety. You need to consider not just “what is nutritionally good” but “what is nutritionally good given what we have been eating lately.” This is a sequence prediction problem.
And that is when I had an insight that changed the direction of the project entirely.
The Key Insight: Menu Prediction as Text Generation
At the time, I was reading about natural language processing and recurrent neural networks. Language models predict the next word in a sequence based on the preceding words. A good language model generates text that is coherent and avoids repetition — it does not produce the same word over and over.
Then it hit me: menu prediction is structurally identical to text generation.
| Text Generation | Menu Prediction |
|---|---|
| Vocabulary = words | Vocabulary = menu IDs |
| Sentence = sequence of words | Week = sequence of daily menus |
| Predict next word | Predict next day’s menu |
| Avoid repetition = good prose | Avoid repetition = varied meals |
If I mapped each menu to an ID (like a word in a vocabulary) and treated a sequence of daily meals as a “sentence,” I could use the exact same architecture that generates text to generate meal plans.
This analogy was not just a metaphor — it directly determined the implementation.
Building the Dictionary
First, I needed to map menus to integer IDs, just like a word-to-index dictionary in NLP:
```python
import pandas as pd
import numpy as np

# Load historical meal data (date + menu_id)
history = pd.read_csv("meal_history.csv")
history = history.sort_values("date").reset_index(drop=True)

# Build menu dictionary (menu_id -> integer index)
unique_menus = history["menu_id"].unique()
menu_to_idx = {menu: idx for idx, menu in enumerate(unique_menus)}
idx_to_menu = {idx: menu for menu, idx in menu_to_idx.items()}

vocab_size = len(menu_to_idx)
print(f"Vocabulary size: {vocab_size} unique menus")

# Convert meal history to integer sequence
sequence = [menu_to_idx[m] for m in history["menu_id"]]
print(f"Sequence length: {len(sequence)} days")
```
This gave me a vocabulary of unique menus and a chronological sequence of integer IDs — exactly the same data structure used in character-level or word-level language models.
One-Hot Encoding
For the LSTM input, each menu ID is represented as a one-hot vector — a vector of zeros with a single 1 at the position corresponding to that menu’s index:
```python
from tensorflow.keras.utils import to_categorical

# One-hot encode the full sequence
sequence_array = np.array(sequence)
one_hot_sequence = to_categorical(sequence_array, num_classes=vocab_size)

print(f"One-hot shape: {one_hot_sequence.shape}")
# (N, vocab_size) — each day is a vector of length vocab_size
```
Sliding Window: Creating Training Samples
The LSTM learns to predict the next menu given the previous `window_size` days. I used a sliding window of 7 days to create training samples:
```python
window_size = 7  # Look at the past 7 days to predict day 8

X = []  # Input sequences (past 7 days)
y = []  # Target (day 8)

for i in range(len(one_hot_sequence) - window_size):
    X.append(one_hot_sequence[i:i + window_size])
    y.append(one_hot_sequence[i + window_size])

X = np.array(X)
y = np.array(y)

print(f"Training samples: {X.shape[0]}")
print(f"Input shape: {X.shape}")   # (samples, 7, vocab_size)
print(f"Target shape: {y.shape}")  # (samples, vocab_size)
```
Each training sample says: “given these 7 days of meals, the next day’s meal was this.” The LSTM learns the patterns of what follows what.
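To make the windowing concrete, here is a toy version of the same loop on a hypothetical 5-day history of integer menu IDs (the numbers are made up for illustration):

```python
# Hypothetical 5-day history of integer menu IDs, window of 2 days
toy_sequence = [0, 1, 2, 1, 3]
window = 2

samples = []
for i in range(len(toy_sequence) - window):
    past = toy_sequence[i:i + window]      # the "given" days
    target = toy_sequence[i + window]      # the day to predict
    samples.append((past, target))

for past, target in samples:
    print(f"given {past} -> predict {target}")
# given [0, 1] -> predict 2
# given [1, 2] -> predict 1
# given [2, 1] -> predict 3
```

Each pass of the window shifts one day forward, so an N-day history yields N minus `window` training samples.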
The LSTM Model
The model architecture is straightforward — a single LSTM layer followed by a dense output layer with softmax activation:
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

model = Sequential([
    LSTM(128, input_shape=(window_size, vocab_size)),
    Dropout(0.2),
    Dense(vocab_size, activation="softmax")
])

model.compile(
    loss="categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)

model.summary()
```
Key design decisions:
- 128 LSTM units: Enough capacity to learn weekly patterns without overfitting on a relatively small dataset
- Dropout 0.2: Light regularization to prevent the model from memorizing specific sequences
- Softmax output: Produces a probability distribution over all possible menus for the next day
```python
# Train the model (named train_history so it doesn't shadow the `history` DataFrame)
train_history = model.fit(
    X, y,
    epochs=100,
    batch_size=32,
    validation_split=0.2,
    verbose=1
)
```
Temperature Sampling: The Diversity Parameter
Here is where the “text generation” analogy pays off most directly. In language models, temperature controls how creative or conservative the output is. The same concept applies perfectly to menu prediction.
```python
def sample_with_temperature(predictions, temperature=1.0):
    """
    Sample from the prediction distribution with temperature scaling.

    temperature < 1.0: More conservative (sticks to high-probability menus)
    temperature = 1.0: Standard sampling (follows learned distribution)
    temperature > 1.0: More adventurous (explores less common menus)
    """
    predictions = np.asarray(predictions).astype("float64")
    # Apply temperature scaling
    log_preds = np.log(predictions + 1e-8) / temperature
    exp_preds = np.exp(log_preds)
    probabilities = exp_preds / np.sum(exp_preds)
    # Sample from the adjusted distribution
    sampled_index = np.random.choice(len(probabilities), p=probabilities)
    return sampled_index
```
This is the same sampling function used in text generation models like GPT — and it works beautifully for menu planning:
| Temperature | Behavior | Meal Planning Effect |
|---|---|---|
| 0.3 | Very conservative | Sticks to safe, frequently-eaten staples |
| 0.7 | Moderately creative | Good balance of familiar and new dishes |
| 1.0 | Standard | Follows the learned probability distribution as-is |
| 1.5 | Adventurous | Suggests unusual combinations, higher variety |
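To see what the scaling actually does to the probabilities (before any random sampling happens), here is a deterministic sketch of the same transformation applied to a hypothetical four-menu distribution with one clear favorite:

```python
import numpy as np

def scale_with_temperature(probs, temperature):
    """Rescale a probability distribution with a temperature parameter
    (same math as sample_with_temperature, minus the sampling step)."""
    logp = np.log(np.asarray(probs, dtype="float64") + 1e-8) / temperature
    expp = np.exp(logp)
    return expp / expp.sum()

# Hypothetical distribution over four menus: one clear favorite
base = np.array([0.6, 0.25, 0.1, 0.05])

for t in (0.3, 1.0, 1.5):
    print(t, np.round(scale_with_temperature(base, t), 3))
```

At temperature 0.3 the favorite's probability climbs well above 0.9 (the distribution sharpens); at 1.0 it is essentially unchanged; at 1.5 probability mass shifts toward the rarer menus (the distribution flattens).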
Generating a Week of Menus
```python
def generate_menu_sequence(model, seed_sequence, days=7, temperature=0.7):
    """Generate a sequence of daily menus starting from a seed week."""
    current_sequence = seed_sequence.copy()
    generated_menus = []
    for _ in range(days):
        # Take the last `window_size` days as input
        input_seq = np.array([current_sequence[-window_size:]])
        # Get prediction probabilities
        predictions = model.predict(input_seq, verbose=0)[0]
        # Sample with temperature
        next_menu_idx = sample_with_temperature(predictions, temperature)
        next_menu_id = idx_to_menu[next_menu_idx]
        generated_menus.append(next_menu_id)
        # Append to sequence for next prediction
        next_one_hot = to_categorical([next_menu_idx], num_classes=vocab_size)[0]
        current_sequence.append(next_one_hot)
    return generated_menus

# Generate next week's menus with different temperatures
seed = list(X[-1])  # Use the most recent week as seed

print("=== Conservative (temp=0.3) ===")
conservative = generate_menu_sequence(model, seed, days=7, temperature=0.3)
for i, menu_id in enumerate(conservative, 1):
    print(f"  Day {i}: Menu {menu_id}")

print("\n=== Balanced (temp=0.7) ===")
balanced = generate_menu_sequence(model, seed, days=7, temperature=0.7)
for i, menu_id in enumerate(balanced, 1):
    print(f"  Day {i}: Menu {menu_id}")

print("\n=== Adventurous (temp=1.5) ===")
adventurous = generate_menu_sequence(model, seed, days=7, temperature=1.5)
for i, menu_id in enumerate(adventurous, 1):
    print(f"  Day {i}: Menu {menu_id}")
```
At low temperature, the model tends to suggest meals that frequently appear in the training data — reliable choices but not exciting. At high temperature, it explores the long tail of the menu vocabulary, sometimes surfacing dishes that rarely appear but add welcome variety.
A temperature of around 0.7 hit the sweet spot: familiar enough to be practical, varied enough to keep things interesting across the week.
Results and Observations
What the LSTM Learned
The trained model captured several real patterns from the historical meal data:
- Weekly cycles: Certain types of meals (like fish-based dishes or stew-style cooking) tended to appear on specific days of the week, and the model picked up on this rhythm.
- Consecutive avoidance: The model naturally learned not to predict the same menu two days in a row, simply because that pattern was rare in the training data.
- Seasonal clusters: Menus from the same season tended to cluster together — lighter dishes in summer, heavier stews in winter.
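The consecutive-avoidance pattern is easy to quantify: measure how often the same menu appears two days in a row, in both the historical data and the generated plans. A minimal sketch, where the `menus` argument is a stand-in for either sequence of menu IDs:

```python
def repeat_rate(menus):
    """Fraction of day-to-day transitions where the menu repeats."""
    if len(menus) < 2:
        return 0.0
    repeats = sum(1 for a, b in zip(menus, menus[1:]) if a == b)
    return repeats / (len(menus) - 1)

# Hypothetical sequences for illustration
print(repeat_rate([3, 1, 4, 1, 5, 9, 2]))  # 0.0 — no back-to-back repeats
print(repeat_rate([3, 3, 4, 4, 4, 9, 2]))  # 0.5 — three repeats in six transitions
```

A well-trained model should produce generated weeks whose repeat rate is close to the (low) rate in the training history.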
Limitations
- Data hunger: The LSTM needed a substantial meal history to learn meaningful patterns. With only a few months of data, the predictions were essentially random.
- No nutritional awareness: Unlike the cosine similarity approach from Part 2, the LSTM had no concept of nutrition. It learned patterns purely from sequence data. A complete system would need to combine both approaches.
- Cold start problem: For a new user with no meal history, the model has nothing to work with. This is the classic cold-start problem in recommendation systems.
Combining the Two Approaches
The ideal meal planning system would combine both methods:
- LSTM predicts candidates based on temporal patterns (what makes sense given recent meals)
- Cosine similarity filters those candidates to ensure nutritional balance
This two-stage approach — diversity from the LSTM, nutritional soundness from cosine similarity — addresses the weaknesses of each method individually.
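A rough sketch of what that two-stage pipeline could look like. Everything here is illustrative: the `nutrition` matrix, `target` profile, and function names are hypothetical stand-ins, not code from Part 2 or the model above.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two nutrition vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def plan_next_day(predictions, nutrition_vectors, target_nutrition, top_k=5):
    """Stage 1: take the LSTM's top-k most probable menus (temporal variety).
    Stage 2: pick the candidate nutritionally closest to the target."""
    candidates = np.argsort(predictions)[::-1][:top_k]
    scores = [cosine_sim(nutrition_vectors[c], target_nutrition)
              for c in candidates]
    return int(candidates[int(np.argmax(scores))])

# Hypothetical data: 4 menus, 3 nutrition features (protein, carbs, fat)
predictions = np.array([0.05, 0.40, 0.35, 0.20])   # LSTM output for the next day
nutrition = np.array([[50.0, 10.0, 30.0],
                      [10.0, 80.0,  5.0],
                      [25.0, 45.0, 12.0],
                      [ 5.0, 20.0, 30.0]])
target = np.array([28.0, 42.0, 11.0])              # desired nutritional profile

print(plan_next_day(predictions, nutrition, target, top_k=2))
# → 2: menus 1 and 2 are the probable candidates, and menu 2 is closer to the target
```

Note the ordering matters: filtering by nutrition first and then asking the LSTM would throw away the temporal signal before it could be used.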
However, there was still one problem neither approach could solve: the recipes themselves were often too complex for everyday cooking. That is where, years later, a completely different technology entered the picture.
Previous: Part 2: Finding “Same Nutrition, Different Meal” with Cosine Similarity