Part 3: Predicting 'Non-Boring' Menus with LSTM Time Series
Introduction
In Part 2, I built a cosine similarity engine that could find nutritionally equivalent recipe substitutions. It worked well for answering “what else has the same nutrition?” but it had a fundamental blind spot: it had no memory.
Cosine similarity does not know what you ate yesterday. It might suggest grilled chicken on Monday, Tuesday, and Wednesday — nutritionally sound, but nobody wants to eat the same thing three nights in a row.
The real challenge of daily meal planning is temporal variety. You need to consider not just “what is nutritionally good” but “what is nutritionally good given what we have been eating lately.” This is a sequence prediction problem.
And that is when I had an insight that changed the direction of the project entirely.
The Key Insight: Menu Prediction as Text Generation
At the time, I was reading about natural language processing and recurrent neural networks. Language models predict the next word in a sequence based on the preceding words. A good language model generates text that is coherent and avoids repetition — it does not produce the same word over and over.
Then it hit me: menu prediction is structurally identical to text generation.
| Text Generation | Menu Prediction |
|---|---|
| Vocabulary = words | Vocabulary = menu IDs |
| Sentence = sequence of words | Week = sequence of daily menus |
| Predict next word | Predict next day’s menu |
| Avoid repetition = good prose | Avoid repetition = varied meals |
If I mapped each menu to an ID (like a word in a vocabulary) and treated a sequence of daily meals as a “sentence,” I could use the exact same architecture that generates text to generate meal plans.
This analogy was not just a metaphor — it directly determined the implementation.
Building the Dictionary
First, I needed to map menus to integer IDs, just like a word-to-index dictionary in NLP:
```python
import pandas as pd
import numpy as np

# Load historical meal data (date + menu_id)
history = pd.read_csv("meal_history.csv")
history = history.sort_values("date").reset_index(drop=True)

# Build menu dictionary (menu_id -> integer index)
unique_menus = history["menu_id"].unique()
menu_to_idx = {menu: idx for idx, menu in enumerate(unique_menus)}
idx_to_menu = {idx: menu for menu, idx in menu_to_idx.items()}

vocab_size = len(menu_to_idx)
print(f"Vocabulary size: {vocab_size} unique menus")

# Convert meal history to integer sequence
sequence = [menu_to_idx[m] for m in history["menu_id"]]
print(f"Sequence length: {len(sequence)} days")
```
This gave me a vocabulary of unique menus and a chronological sequence of integer IDs — exactly the same data structure used in character-level or word-level language models.
One-Hot Encoding
For the LSTM input, each menu ID is represented as a one-hot vector — a vector of zeros with a single 1 at the position corresponding to that menu’s index:
```python
from tensorflow.keras.utils import to_categorical

# One-hot encode the full sequence
sequence_array = np.array(sequence)
one_hot_sequence = to_categorical(sequence_array, num_classes=vocab_size)

print(f"One-hot shape: {one_hot_sequence.shape}")
# (N, vocab_size) — each day is a vector of length vocab_size
```
Sliding Window: Creating Training Samples
The LSTM learns to predict the next menu given the previous `window_size` days. I used a sliding window of 7 days to create training samples:
```python
window_size = 7  # Look at the past 7 days to predict day 8

X = []  # Input sequences (past 7 days)
y = []  # Target (day 8)

for i in range(len(one_hot_sequence) - window_size):
    X.append(one_hot_sequence[i:i + window_size])
    y.append(one_hot_sequence[i + window_size])

X = np.array(X)
y = np.array(y)

print(f"Training samples: {X.shape[0]}")
print(f"Input shape: {X.shape}")   # (samples, 7, vocab_size)
print(f"Target shape: {y.shape}")  # (samples, vocab_size)
```
Each training sample says: “given these 7 days of meals, the next day’s meal was this.” The LSTM learns the patterns of what follows what.
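To make the windowing concrete, here is a toy version of the same loop on a hypothetical 5-day history of integer menu IDs (the numbers are made up for illustration):

```python
# Hypothetical 5-day history of integer menu IDs, window of 2 days
toy_sequence = [0, 1, 2, 1, 3]
window = 2

samples = []
for i in range(len(toy_sequence) - window):
    past = toy_sequence[i:i + window]      # the "given" days
    target = toy_sequence[i + window]      # the day to predict
    samples.append((past, target))

for past, target in samples:
    print(f"given {past} -> predict {target}")
# given [0, 1] -> predict 2
# given [1, 2] -> predict 1
# given [2, 1] -> predict 3
```

Each pass of the window shifts one day forward, so an N-day history yields N minus `window` training samples.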
The LSTM Model
The model architecture is straightforward — a single LSTM layer followed by a dense output layer with softmax activation:
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

model = Sequential([
    LSTM(128, input_shape=(window_size, vocab_size)),
    Dropout(0.2),
    Dense(vocab_size, activation="softmax")
])

model.compile(
    loss="categorical_crossentropy",
    optimizer="adam",
    metrics=["accuracy"]
)

model.summary()
```
Key design decisions:
- 128 LSTM units: Enough capacity to learn weekly patterns without overfitting on a relatively small dataset
- Dropout 0.2: Light regularization to prevent the model from memorizing specific sequences
- Softmax output: Produces a probability distribution over all possible menus for the next day
```python
# Train the model (named train_history so it doesn't shadow the `history` DataFrame)
train_history = model.fit(
    X, y,
    epochs=100,
    batch_size=32,
    validation_split=0.2,
    verbose=1
)
```
Temperature Sampling: The Diversity Parameter
Here is where the “text generation” analogy pays off most directly. In language models, temperature controls how creative or conservative the output is. The same concept applies perfectly to menu prediction.
```python
def sample_with_temperature(predictions, temperature=1.0):
    """
    Sample from the prediction distribution with temperature scaling.

    temperature < 1.0: More conservative (sticks to high-probability menus)
    temperature = 1.0: Standard sampling (follows learned distribution)
    temperature > 1.0: More adventurous (explores less common menus)
    """
    predictions = np.asarray(predictions).astype("float64")
    # Apply temperature scaling
    log_preds = np.log(predictions + 1e-8) / temperature
    exp_preds = np.exp(log_preds)
    probabilities = exp_preds / np.sum(exp_preds)
    # Sample from the adjusted distribution
    sampled_index = np.random.choice(len(probabilities), p=probabilities)
    return sampled_index
```
This is the same sampling function used in text generation models like GPT — and it works beautifully for menu planning:
| Temperature | Behavior | Meal Planning Effect |
|---|---|---|
| 0.3 | Very conservative | Sticks to safe, frequently-eaten staples |
| 0.7 | Moderately creative | Good balance of familiar and new dishes |
| 1.0 | Standard | Follows the learned probability distribution as-is |
| 1.5 | Adventurous | Suggests unusual combinations, higher variety |
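To see what the scaling actually does to the probabilities (before any random sampling happens), here is a deterministic sketch of the same transformation applied to a hypothetical four-menu distribution with one clear favorite:

```python
import numpy as np

def scale_with_temperature(probs, temperature):
    """Rescale a probability distribution with a temperature parameter
    (same math as sample_with_temperature, minus the sampling step)."""
    logp = np.log(np.asarray(probs, dtype="float64") + 1e-8) / temperature
    expp = np.exp(logp)
    return expp / expp.sum()

# Hypothetical distribution over four menus: one clear favorite
base = np.array([0.6, 0.25, 0.1, 0.05])

for t in (0.3, 1.0, 1.5):
    print(t, np.round(scale_with_temperature(base, t), 3))
```

At temperature 0.3 the favorite's probability climbs well above 0.9 (the distribution sharpens); at 1.0 it is essentially unchanged; at 1.5 probability mass shifts toward the rarer menus (the distribution flattens).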
Generating a Week of Menus
```python
def generate_menu_sequence(model, seed_sequence, days=7, temperature=0.7):
    """Generate a sequence of daily menus starting from a seed week."""
    current_sequence = seed_sequence.copy()
    generated_menus = []
    for _ in range(days):
        # Take the last `window_size` days as input
        input_seq = np.array([current_sequence[-window_size:]])
        # Get prediction probabilities
        predictions = model.predict(input_seq, verbose=0)[0]
        # Sample with temperature
        next_menu_idx = sample_with_temperature(predictions, temperature)
        next_menu_id = idx_to_menu[next_menu_idx]
        generated_menus.append(next_menu_id)
        # Append to sequence for next prediction
        next_one_hot = to_categorical([next_menu_idx], num_classes=vocab_size)[0]
        current_sequence.append(next_one_hot)
    return generated_menus

# Generate next week's menus with different temperatures
seed = list(X[-1])  # Use the most recent week as seed

print("=== Conservative (temp=0.3) ===")
conservative = generate_menu_sequence(model, seed, days=7, temperature=0.3)
for i, menu_id in enumerate(conservative, 1):
    print(f"  Day {i}: Menu {menu_id}")

print("\n=== Balanced (temp=0.7) ===")
balanced = generate_menu_sequence(model, seed, days=7, temperature=0.7)
for i, menu_id in enumerate(balanced, 1):
    print(f"  Day {i}: Menu {menu_id}")

print("\n=== Adventurous (temp=1.5) ===")
adventurous = generate_menu_sequence(model, seed, days=7, temperature=1.5)
for i, menu_id in enumerate(adventurous, 1):
    print(f"  Day {i}: Menu {menu_id}")
```
At low temperature, the model tends to suggest meals that frequently appear in the training data — reliable choices but not exciting. At high temperature, it explores the long tail of the menu vocabulary, sometimes surfacing dishes that rarely appear but add welcome variety.
A temperature of around 0.7 hit the sweet spot: familiar enough to be practical, varied enough to keep things interesting across the week.
Results and Observations
What the LSTM Learned
The trained model captured several real patterns from the historical meal data:
- Weekly cycles: Certain types of meals (like fish-based dishes or stew-style cooking) tended to appear on specific days of the week, and the model picked up on this rhythm.
- Consecutive avoidance: The model naturally learned not to predict the same menu two days in a row, simply because that pattern was rare in the training data.
- Seasonal clusters: Menus from the same season tended to cluster together — lighter dishes in summer, heavier stews in winter.
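The consecutive-avoidance pattern is easy to quantify: measure how often the same menu appears two days in a row, in both the historical data and the generated plans. A minimal sketch, where the `menus` argument is a stand-in for either sequence of menu IDs:

```python
def repeat_rate(menus):
    """Fraction of day-to-day transitions where the menu repeats."""
    if len(menus) < 2:
        return 0.0
    repeats = sum(1 for a, b in zip(menus, menus[1:]) if a == b)
    return repeats / (len(menus) - 1)

# Hypothetical sequences for illustration
print(repeat_rate([3, 1, 4, 1, 5, 9, 2]))  # 0.0 — no back-to-back repeats
print(repeat_rate([3, 3, 4, 4, 4, 9, 2]))  # 0.5 — three repeats in six transitions
```

A well-trained model should produce generated weeks whose repeat rate is close to the (low) rate in the training history.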
Limitations
- Data hunger: The LSTM needed a substantial meal history to learn meaningful patterns. With only a few months of data, the predictions were essentially random.
- No nutritional awareness: Unlike the cosine similarity approach from Part 2, the LSTM had no concept of nutrition. It learned patterns purely from sequence data. A complete system would need to combine both approaches.
- Cold start problem: For a new user with no meal history, the model has nothing to work with. This is the classic cold-start problem in recommendation systems.
Combining the Two Approaches
The ideal meal planning system would combine both methods:
- LSTM predicts candidates based on temporal patterns (what makes sense given recent meals)
- Cosine similarity filters those candidates to ensure nutritional balance
This two-stage approach — diversity from the LSTM, nutritional soundness from cosine similarity — addresses the weaknesses of each method individually.
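A rough sketch of what that two-stage pipeline could look like. Everything here is illustrative: the `nutrition` matrix, `target` profile, and function names are hypothetical stand-ins, not code from Part 2 or the model above.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two nutrition vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def plan_next_day(predictions, nutrition_vectors, target_nutrition, top_k=5):
    """Stage 1: take the LSTM's top-k most probable menus (temporal variety).
    Stage 2: pick the candidate nutritionally closest to the target."""
    candidates = np.argsort(predictions)[::-1][:top_k]
    scores = [cosine_sim(nutrition_vectors[c], target_nutrition)
              for c in candidates]
    return int(candidates[int(np.argmax(scores))])

# Hypothetical data: 4 menus, 3 nutrition features (protein, carbs, fat)
predictions = np.array([0.05, 0.40, 0.35, 0.20])   # LSTM output for the next day
nutrition = np.array([[50.0, 10.0, 30.0],
                      [10.0, 80.0,  5.0],
                      [25.0, 45.0, 12.0],
                      [ 5.0, 20.0, 30.0]])
target = np.array([28.0, 42.0, 11.0])              # desired nutritional profile

print(plan_next_day(predictions, nutrition, target, top_k=2))
# → 2: menus 1 and 2 are the probable candidates, and menu 2 is closer to the target
```

Note the ordering matters: filtering by nutrition first and then asking the LSTM would throw away the temporal signal before it could be used.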
However, there was still one problem neither approach could solve: the recipes themselves were often too complex for everyday cooking. That is where, years later, a completely different technology entered the picture.
Previous: Part 2: Finding “Same Nutrition, Different Meal” with Cosine Similarity