Part 2: Recreating Economic Cycles with Real Japanese News
The Data Foundation
A game about investing is only as good as its economic data. Players need to feel that the events they encounter are real, that the stock price movements make sense, and that the economic cycles follow a believable pattern. Fake data produces a fake learning experience.
I decided to ground MarketQuest in actual Japanese economic history. The data pipeline I built processes three distinct sources — historical news events, macroeconomic indicators, and simulated stock prices — and merges them into a unified dataset that powers every turn of the game.
Step 1: Collecting Historical News Events
The first challenge was gathering a comprehensive timeline of historical events that affected the Japanese economy. I built a web scraper that systematically extracted events from Wikipedia pages for each year from 1953 to 2023.
import requests
from bs4 import BeautifulSoup
def fetch_events(year):
    """Scrape historical events from the Japanese Wikipedia page for a given year."""
    url = f"https://ja.wikipedia.org/wiki/{year}年"
    response = requests.get(url)
    response.encoding = response.apparent_encoding
    soup = BeautifulSoup(response.text, 'html.parser')
    events_section = soup.find('span', {'id': 'できごと'}).parent

    events = []
    months = events_section.find_all_next(['h3', 'ul'])
    for element in months:
        if element.name == 'h3':
            current_month = element.text.strip()
            # Stop once the headings are no longer months ("1月" .. "12月")
            if '月' not in current_month:
                break
        elif element.name == 'ul':
            for item in element.find_all('li'):
                text = item.get_text().strip()
                if ' - ' in text:
                    date, event = text.split(' - ', 1)
                    events.append({
                        'date': f"{year}年{date}",
                        'event': event
                    })
    return events
This script processed over 70 years of Japanese Wikipedia articles, producing a raw dataset of thousands of historical events. The output was stored as structured JSON:
[
  {
    "Date": "1953年1月15日",
    "できごと": "早川電機(現:シャープ)が、国産初のテレビ、TV3-14T 175000円を発売。"
  },
  {
    "Date": "1953年2月1日",
    "できごと": "NHKが日本で初のテレビジョン本放送を東京で開始。"
  },
  {
    "Date": "1953年3月5日",
    "できごと": "ソ連の指導者・スターリンが死去したことにより株価が暴落(スターリン暴落)。"
  }
]
Not every historical event is relevant to a children’s investment game. The next step was curating: selecting events that were economically significant, understandable to children, and mappable to specific economic phases (boom, recession, crisis). Each selected event was tagged with its impact on stock prices and gold prices.
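The curation pass itself was manual judgment plus filtering. As a rough illustration of the filtering half, here is a hypothetical sketch (the keyword list, function name, and tag scheme are mine, not the actual notebook code): keep only events whose text mentions economy-related terms, and carry a tag forward for later impact assignment.

```python
# Hypothetical curation sketch: keywords and structure are illustrative.
ECONOMY_KEYWORDS = {"株価": "stocks", "不況": "recession",
                    "好景気": "boom", "恐慌": "crisis"}

def curate(events):
    """Filter raw Wikipedia events down to economically relevant ones."""
    curated = []
    for e in events:
        hits = [k for k in ECONOMY_KEYWORDS if k in e["event"]]
        if hits:
            curated.append({**e, "tags": hits})
    return curated

raw = [
    {"date": "1953年3月5日",
     "event": "スターリンが死去したことにより株価が暴落(スターリン暴落)。"},
    {"date": "1953年2月1日",
     "event": "NHKが日本で初のテレビジョン本放送を東京で開始。"},
]
curated = curate(raw)  # only the 株価 (stock price) event survives
```

A keyword filter like this only narrows the candidate pool; deciding which events are understandable to children still requires a human pass.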
Step 2: Building Macro Economic Indicators
Raw news events alone do not create a believable economic simulation. Stock prices respond to macroeconomic forces — GDP growth, inflation, interest rates, employment, and money supply. I needed time-series data for all of these indicators.
The CreateMacro.ipynb notebook assembled monthly macroeconomic data from 1953 to 2022, covering 17 indicators. A sample record (showing a subset of the fields):
{
  "Date": "1953/1/1",
  "Inflation_Rate": "2.74",
  "CPI": "13.19",
  "Interest_Rate": "5.84",
  "FX": "360",
  "GDP_Growth": "11.73",
  "Employment": "4044",
  "Unemployment_Rate": "1.9",
  "Nikkei": "412.02",
  "Money_Supply": "83.26",
  "Population_index": "93.70",
  "Birth_rate": "2.06",
  "WTI_energy": "2.57"
}
The data came from official Japanese government statistics, Bank of Japan publications, and international economic databases. Quarterly series were expanded to monthly granularity by carrying each quarterly value forward across the three months of its quarter:
# Expand quarterly GDP data to monthly by repeating each value
for i in range(len(quarterly_df)):
    start_date = quarterly_df.iloc[i]['Date']
    for j in range(3):
        new_date = start_date + pd.DateOffset(months=j)
        expanded_df = pd.concat([
            expanded_df,
            pd.DataFrame({"Date": [new_date], "Value": [quarterly_df.iloc[i]['Value']]})
        ], ignore_index=True)
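Calling pd.concat inside a loop also grows the DataFrame one row at a time, which is quadratic. The same forward-fill expansion can be done in one call with pandas resampling; a minimal sketch with illustrative numbers:

```python
import pandas as pd

# Two quarterly GDP values (illustrative numbers, not the real dataset)
quarterly = pd.Series(
    [11.7, 12.1],
    index=pd.to_datetime(["1953-01-01", "1953-04-01"]),
)

# 'MS' = month-start frequency; ffill repeats each quarterly value
# across the months of its quarter, matching the loop above.
monthly = quarterly.resample("MS").ffill()
```

One caveat: resample stops at the last index date, so the final quarter only yields its first month unless you reindex past the end of the series.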
Before these indicators could feed into the stock simulation, they needed to be normalized. The Normaralize.ipynb notebook applied Min-Max scaling to bring all indicators into the 0-1 range:
def normalize_dataframe(df):
    """Apply Min-Max normalization to all numeric columns."""
    result = df.copy()
    for feature_name in df.columns:
        if feature_name != 'Date':
            max_value = df[feature_name].max()
            min_value = df[feature_name].min()
            result[feature_name] = (df[feature_name] - min_value) / (max_value - min_value)
    return result

normalized_macro = normalize_dataframe(df_macro)
This normalization was critical. Without it, indicators with large absolute values (like the Nikkei index at 38,000) would dominate the simulation while indicators with small values (like birth rate at 1.3) would have no effect.
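The effect is easy to see on two columns at opposite ends of the scale (values here are illustrative):

```python
def min_max(values):
    """Min-Max scale a list of numbers into [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

nikkei = [20000, 29000, 38000]  # large absolute values
birth_rate = [1.0, 1.5, 2.0]    # small absolute values

# After scaling, both series live in [0, 1], so a swing in either one
# moves the simulation by a comparable amount.
print(min_max(nikkei))      # [0.0, 0.5, 1.0]
print(min_max(birth_rate))  # [0.0, 0.5, 1.0]
```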
Step 3: Simulating Stock Prices
With curated news events and normalized macro indicators, I could simulate realistic stock price movements. The SimulateStock.ipynb notebook combined both data sources to generate monthly stock prices for six fictional companies that mirror real Japanese industry sectors.
The simulation uses a configurable impact factor system:
impact_factors = {
    'volatility': 1.5,
    'macro_factors': {
        'Inflation_Rate': 0.0001,
        'Interest_Rate': -0.0001,
        'GDP_Growth': 0.0002,
        'Money_Supply': 0.0001,
        'Employment': 0.0002,
        'Unemployment_Rate': -0.0002,
        'Population_index': 0.0002,
        'Birth_rate': 0.0001,
        'Salary_production': 0.0002,
        'WTI_energy': 0.0003
    }
}
Each macro factor has a coefficient that determines how strongly it influences stock prices. Notice the signs: GDP growth and employment push prices up, while unemployment and interest rates push them down. This mirrors real-world relationships that children can intuitively understand.
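In effect, each month's macro-driven drift is a dot product of the coefficients with that month's normalized indicator values. A hypothetical sketch (function name and indicator values are illustrative):

```python
# Subset of the coefficients from the impact_factors config above
MACRO_FACTORS = {
    'GDP_Growth': 0.0002,
    'Interest_Rate': -0.0001,
    'Unemployment_Rate': -0.0002,
}

def macro_drift(month_row, factors=MACRO_FACTORS):
    """Sum each normalized indicator times its signed coefficient."""
    return sum(coef * month_row[name] for name, coef in factors.items())

# A boom month: strong growth, low rates, low unemployment
boom = {'GDP_Growth': 0.9, 'Interest_Rate': 0.2, 'Unemployment_Rate': 0.1}
# A bust month: the reverse
bust = {'GDP_Growth': 0.1, 'Interest_Rate': 0.8, 'Unemployment_Rate': 0.9}
# macro_drift(boom) is positive, macro_drift(bust) is negative
```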
News events add discrete shocks on top of the macro trend:
news_impacts = {
    pd.Timestamp('1953-02-01'): 0.03,   # TV broadcasting begins
    pd.Timestamp('1953-03-01'): -0.05,  # Stalin's death crash
    pd.Timestamp('1953-07-01'): 0.04,   # Recovery
    pd.Timestamp('1958-10-01'): -0.02,  # Recession
    pd.Timestamp('1959-03-01'): 0.05    # Recovery
}
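Putting the pieces together, one plausible monthly update (my reading of the setup, not the notebook's exact formula) multiplies the price by the macro drift, any news shock for that month, and a volatility-scaled random term:

```python
import random

def step_price(price, drift, news_shock=0.0, volatility=1.5, rng=random):
    """One simulated month: macro drift + discrete news shock + noise."""
    noise = rng.gauss(0, 0.01) * volatility
    return price * (1 + drift + news_shock + noise)

rng = random.Random(42)  # seeded so a run is reproducible
price = 100.0
price = step_price(price, drift=0.002, news_shock=-0.05, rng=rng)  # crash month
```

With noise removed, a -5% news shock on top of +0.2% drift takes a 100-yen stock to 95.2; the volatility knob controls how much the random term can blur that.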
The simulation validates its output by comparing simulated price trajectories against the actual Nikkei 225 index. The goal is not exact replication — it is a game, not a financial model — but the general shape of booms and busts should feel right to anyone who knows Japanese economic history.
Step 4: Mapping Events to Economy Phases
The game uses a three-phase economy model: GOOD, NORMAL, and BAD. Each news event in the final game data is tagged with its phase, which in turn affects the probability distribution on Chance spaces.
In the production TypeScript code, this mapping is explicit:
// From game-core/events.ts
export const NEWS_EVENT_DATA: NewsEventData[] = [
  // 1985: Plaza Accord — yen strengthens, mixed impact
  { id: 'news_1985_plaza', year: 1985, phase: 'NORMAL',
    stockImpact: 0.05, goldImpact: 0.02, isHistorical: true },
  // 1986: Bubble begins — massive stock rally
  { id: 'news_1986_bubble', year: 1986, phase: 'GOOD',
    stockImpact: 0.20, goldImpact: -0.05, isHistorical: true },
  // 1990: Bubble bursts — catastrophic crash
  { id: 'news_1990_bubble_burst', year: 1990, phase: 'BAD',
    stockImpact: -0.30, goldImpact: 0.15, isHistorical: true },
  // 2008: Lehman Brothers collapse
  { id: 'news_2008_lehman', year: 2008, phase: 'BAD',
    stockImpact: -0.35, goldImpact: 0.20, isHistorical: true },
  // 2013: Abenomics begins — major rally
  { id: 'news_2013_abenomics', year: 2013, phase: 'GOOD',
    stockImpact: 0.20, goldImpact: -0.05, isHistorical: true },
];
Notice the inverse relationship between stockImpact and goldImpact. When stocks crash, gold rises — and vice versa. This is not arbitrary; it reflects the real-world flight-to-safety behavior where investors move money into gold during crises. Children playing the game naturally discover this pattern: “When bad news happens, my stocks go down but my gold goes up.”
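The phase tag is also what the Chance spaces key off: a plausible reading is that each phase skews the card draw toward good or bad outcomes. A hypothetical sketch (the weights and card names are mine, not the shipped values):

```python
import random

# Hypothetical card weights per economy phase
PHASE_WEIGHTS = {
    'GOOD':   {'windfall': 0.5, 'neutral': 0.4, 'loss': 0.1},
    'NORMAL': {'windfall': 0.3, 'neutral': 0.4, 'loss': 0.3},
    'BAD':    {'windfall': 0.1, 'neutral': 0.4, 'loss': 0.5},
}

def draw_chance_card(phase, rng=random):
    """Draw one card, weighted by the current economy phase."""
    weights = PHASE_WEIGHTS[phase]
    cards = list(weights)
    return rng.choices(cards, weights=[weights[c] for c in cards], k=1)[0]

card = draw_chance_card('BAD', rng=random.Random(1))
```

In a BAD phase a loss card is five times as likely as in a GOOD phase, which is how the three-phase model reaches the player even between news events.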
Step 5: Generating Child-Friendly Explanations
Raw historical data is useless if a seven-year-old cannot understand it. Every news event in the game has a summary field written in simple Japanese (or English, in the i18n version) that explains what happened and why it matters.
The text is served through a dictionary-based i18n system. The game-core engine only stores event IDs and numeric impacts. The actual text lives in locale files and is injected at runtime through a TextProvider interface:
// From game-core/events.ts
export interface TextProvider {
  getNewsText(id: string): { title: string; summary: string } | null;
  getBoomText(id: string): { title: string; story: string } | null;
  getCrashText(id: string): { title: string; story: string } | null;
  getStockName(type: StockType): string;
  getIndustryName(type: string): string;
  // ... more text retrieval methods
}
This separation means the game logic is entirely language-agnostic. Adding a new language requires only adding a new locale file — no changes to the simulation engine.
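The pattern is small enough to show in miniature. Python is used here purely for illustration (the real implementation is the TypeScript interface above), and the strings are invented examples, not the game's actual locale data:

```python
# Dictionary-based i18n in miniature: the engine stores only IDs;
# a locale table supplies the display text at runtime.
LOCALES = {
    'en': {'news_1990_bubble_burst': {'title': 'The Bubble Bursts',
                                      'summary': 'Stock prices fall sharply.'}},
    'ja': {'news_1990_bubble_burst': {'title': 'バブル崩壊',
                                      'summary': '株価が大きく下がります。'}},
}

def get_news_text(locale, event_id):
    """Look up display text for an event ID; None if untranslated."""
    return LOCALES.get(locale, {}).get(event_id)

# Adding a language = adding a LOCALES entry; the lookup never changes.
```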
The Final Data Flow
Here is the complete pipeline from raw sources to game-ready data:
Wikipedia articles (1953-2023)         Government statistics
            ↓                                    ↓
    correctNews.ipynb                    CreateMacro.ipynb
[Web scraping + curation]           [Assembly + interpolation]
            ↓                                    ↓
    news_YYYY_YYYY.json                 macro_YYYY_YYYY.json
            │                                    ↓
            │                           Normaralize.ipynb
            │                            [Min-Max scaling]
            │                                    ↓
            │                         macro_norm_YYYY_YYYY.csv
            └──────────────────┬─────────────────┘
                               ↓
                     SimulateStock.ipynb
                      [Price generation]
                               ↓
                   Stock price time series
                               ↓
                       import_data.sql
                      [Final game data]
                               ↓
              game-core TypeScript constants
Four Jupyter notebooks, three intermediate data files (two JSON, one CSV), and one SQL import script transform 70 years of Japanese economic history into a game that a child can play in 15 minutes.
What Makes This Data Pipeline Special
Most educational games use fabricated scenarios. MarketQuest uses real history. When a player lands on a News space in 1990 and reads about the bubble bursting, the stock prices in their portfolio actually crash by 30% — just as the real Nikkei 225 did. When they reach 2013 and Abenomics begins, they feel the rally.
This grounding in reality is what separates education from entertainment. The game is fun because it is a game. But it teaches because the data is real.
In Part 3, I will show how all of this data comes to life in the browser — the game-core architecture that keeps logic framework-agnostic, the Phaser 3 board that renders animations, and the Next.js shell that ties it all together.
Previous: Part 1: Teaching Kids to Invest — Why We Chose a Game
Next: Part 3: Building a Production Browser Game with Next.js and Phaser 3