Part 1: The Beginning and the Big Picture -- The Day ChatGPT Beat the Fund Managers

Introduction

“AI that predicts stock prices” — until recently, that sounded like something out of science fiction.

I am a software engineer who builds web services as an independent developer. I have always been interested in stock investing, but I had never formally studied technical analysis or fundamental analysis. I was, for all intents and purposes, an ordinary individual investor.

In this series, I want to share how I went from that starting point to fine-tuning an LLM (Large Language Model) for stock price prediction and launching it as a live web service.

I intend to be honest about everything — not just the successes, but the failures and detours as well. If you are an independent developer interested in trying LLM fine-tuning, or if you are curious about combining AI with financial data, I hope you will find something useful here.


ChatGPT Beat the Fund Managers

In 2023, Finder, a UK-based financial comparison site, ran an experiment that attracted a lot of attention.

They had ChatGPT pick 38 stocks and build a portfolio. Over a 63-day period, that portfolio returned +4.9%. During the same period, the top 10 most popular funds in the UK (HSBC, Fidelity, etc.) averaged -0.8% — ChatGPT had significantly outperformed professional fund managers.

Over a cumulative two-year period, the ChatGPT portfolio reached +41.97% in returns, while the popular funds averaged +27.63% — a gap of more than 14 percentage points.

Of course, Finder themselves cautioned that “this does not mean you should use ChatGPT for investing.” It was a limited experiment, and the results could have been a matter of luck. Different market conditions might have yielded entirely different outcomes.

Still, what the experiment demonstrated was significant:

LLMs can “understand” a company’s situation from natural language and potentially apply that understanding to investment decisions.

“Wait, really? Then couldn’t the same thing work with Japanese stock market news?”

That was the starting point of this project.


Why an LLM? — How It Differs from Traditional Approaches

When you hear “stock price prediction,” the first things that come to mind are probably time-series models like LSTM (Long Short-Term Memory) or ARIMA (AutoRegressive Integrated Moving Average). These methods find patterns in historical price data and use them to predict future movements.

They are great at handling numerical data, but they have one major limitation: they cannot understand the content of news articles.

For example, suppose a company announces earnings with a “51% upward revision to ordinary income.” An LSTM can learn patterns from historical price charts, but it cannot comprehend the meaning of that news headline and reason about how it might affect the stock price the next day.

An LLM, on the other hand, can understand natural language. It can grasp that a “51% upward revision” is positive news, and furthermore, it can combine that understanding with context like the company’s industry, market capitalization, and recent price trends to predict the next day’s price movement.

The ability to integrate natural language information from news with numerical data like stock prices and financial metrics into a single prediction: that was why I chose LLM fine-tuning as my approach.


What I Built — The Senrigan Service

What I ultimately built is an AI stock price prediction web service called Senrigan (meaning “clairvoyance” in Japanese).

Service URL: https://senrigan.tech/

Senrigan takes five types of data as input and predicts the next day’s stock price movement:

#  Data Type                 Details
1  Company information       Industry, market capitalization, company description, etc.
2  News article text         Earnings announcements, PR releases, equity-related disclosures, etc.
3  Stock price data          OHLCV (Open, High, Low, Close, Volume) for the last 5 trading days
4  Financial data            2 years of revenue, profit margins, EPS, ROA, ROE
5  Macroeconomic indicators  CPI, GDP, unemployment rate, policy interest rates, exchange rates
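Combined, these five inputs form one JSON document per prediction. As a rough sketch of what that payload assembly might look like (the field names and values here are my own placeholders, not Senrigan's actual schema):

```python
import json

# Hypothetical sketch of the five-part input payload.
# Field names are illustrative; Senrigan's real schema may differ.
def build_prediction_input(company, news, prices, financials, macro):
    """Assemble the five data types into one JSON document for the model."""
    payload = {
        "company": company,        # industry, market cap, description, ...
        "news": news,              # article text
        "prices": prices,          # OHLCV for the last 5 trading days
        "financials": financials,  # 2 years of revenue, margins, EPS, ROA, ROE
        "macro": macro,            # CPI, GDP, unemployment, rates, FX
    }
    return json.dumps(payload, ensure_ascii=False)

example = build_prediction_input(
    {"name": "Example Corp", "industry": "Services", "market_cap": 120_000_000_000},
    [{"title": "51% upward revision to ordinary income", "body": "..."}],
    [{"date": "2024-06-03", "open": 3010, "high": 3060, "low": 3000,
      "close": 3045, "volume": 1_200_000}],
    {"revenue": [50_000, 54_000], "eps": [110.2, 123.5]},
    {"cpi": 2.8, "policy_rate": 0.1, "usd_jpy": 155.3},
)
```

Keeping the payload as one flat JSON document makes it easy to log, replay, and later reuse the same structure as fine-tuning training data.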

It then outputs three prediction values in JSON format:

  • Close today -> Open tomorrow (overnight movement)
  • Close today -> Close tomorrow (full-day movement)
  • Open tomorrow -> Close tomorrow (intraday movement)

Here is what an actual prediction result looks like:

{
  "close_to_next_open":  {"price": 3045, "change_pct": 0.0,  "trend": "neutral"},
  "close_to_next_close": {"price": 3075, "change_pct": 0.98, "trend": "neutral"},
  "next_open_to_close":  {"price": 3075, "change_pct": 1.0,  "trend": "neutral"}
}
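Because this output must be machine-readable, it pays to validate the JSON before storing it. Here is a minimal Python sketch of such a check, assuming a trend vocabulary of up/down/neutral (the service's actual validation may differ):

```python
import json

EXPECTED_KEYS = {"close_to_next_open", "close_to_next_close", "next_open_to_close"}
VALID_TRENDS = {"up", "down", "neutral"}  # assumed trend vocabulary

def parse_prediction(raw: str) -> dict:
    """Parse the model's JSON output and reject malformed responses."""
    data = json.loads(raw)
    if set(data) != EXPECTED_KEYS:
        raise ValueError(f"unexpected keys: {set(data)}")
    for key, entry in data.items():
        if entry["trend"] not in VALID_TRENDS:
            raise ValueError(f"bad trend for {key}: {entry['trend']}")
        float(entry["price"])       # must be numeric
        float(entry["change_pct"])  # must be numeric
    return data

sample = '''{
  "close_to_next_open":  {"price": 3045, "change_pct": 0.0,  "trend": "neutral"},
  "close_to_next_close": {"price": 3075, "change_pct": 0.98, "trend": "neutral"},
  "next_open_to_close":  {"price": 3075, "change_pct": 1.0,  "trend": "neutral"}
}'''
parsed = parse_prediction(sample)
```

Rejecting malformed responses at this boundary keeps bad data out of the database, which matters when the pipeline runs unattended.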

When news is published, the system automatically collects data, the AI predicts the next day’s stock price movement, and the results are published on the website. It runs every day without human intervention.


Overall Architecture

The Senrigan service is composed of three projects:

+---------------------+     +--------------------+     +----------------+
|       meloik        |     |  assetai_firebase  |     |   stockSite    |
|  News collection    |     |  Firestore sync    |     |    Web UI      |
|  AI prediction      | --> |                    | --> |                |
|  Data generation    |     |                    |     |                |
+---------------------+     +--------------------+     +----------------+
    VPS (PHP/MySQL)           VPS (Python)              Vercel (Next.js)

meloik (Data Generation / PHP + MySQL)

This is where everything begins. It is a collection of PHP batch processes running on a VPS (virtual private server), responsible for:

  • News collection: Gathering publicly disclosed corporate information from sources like TDnet (Timely Disclosure network)
  • AI prediction: Sending prediction requests to the fine-tuned LLM
  • Translation: Translating news and prediction results into English (multilingual support)
  • Data generation: Collecting and formatting company information, stock prices, and financial data

The core prediction flow looks something like this:

// Fetch prediction data (company info + news + prices + financials + macro indicators)
$jsonData = Utility::getPredictionData($db, $company['code'], $start_date, $end_date);

// Send prediction request to OpenAI API
$response = callFineTunedModel($jsonData);

// Save prediction results to MySQL
savePrediction($db, $code, $target_date, $response);
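For reference, the request that callFineTunedModel sends might be shaped roughly like the following Python sketch. The model id, system prompt, and use of response_format are assumptions on my part; this only builds the request body and does not call the API:

```python
def build_chat_request(json_data: str,
                       model_id: str = "ft:gpt-4o-mini-2024-07-18:my-org::abc123"):
    """Build an OpenAI chat-completions request body for a fine-tuned model.

    model_id is a placeholder; real fine-tuned model ids use this "ft:" prefix form.
    """
    return {
        "model": model_id,
        "messages": [
            {"role": "system",
             "content": "You are a stock price prediction model. Respond in JSON."},
            {"role": "user", "content": json_data},  # the assembled five-part payload
        ],
        "response_format": {"type": "json_object"},  # ask the API for JSON output
    }

req = build_chat_request('{"code": "7203"}')
```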

assetai_firebase (Firestore Sync / Python)

This project exports data from meloik’s MySQL database to Firebase (Firestore).

Why Firestore? Because it allows the frontend (Next.js) to retrieve data directly in a serverless manner. Exposing MySQL directly would be a security risk, and standing up a separate API server adds cost. By placing Firestore in between, the frontend can safely retrieve data using the Firestore SDK.

# Fetch data from MySQL and write to Firestore
def save_to_firestore(collection_name, doc_id, data, force=False):
    doc_ref = db.collection(collection_name).document(doc_id)
    existing_doc = doc_ref.get()

    new_epoch = data.get("updated_at_epoch")

    if existing_doc.exists and not force:
        existing_data = existing_doc.to_dict()
        old_epoch = existing_data.get("updated_at_epoch")
        if old_epoch and new_epoch and old_epoch >= new_epoch:
            return False  # Skip if existing data is newer (cost reduction)

    doc_ref.set(data)
    return True

On weekdays, incremental sync runs every 15 minutes, ensuring the latest data is always reflected in Firestore.
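The incremental part of that sync boils down to a simple selection on the updated_at_epoch column. A simplified sketch (the row shape and column names are assumptions based on the code above):

```python
def select_rows_to_sync(rows, last_sync_epoch):
    """Return only rows modified since the last sync, oldest first."""
    changed = [r for r in rows if r["updated_at_epoch"] > last_sync_epoch]
    return sorted(changed, key=lambda r: r["updated_at_epoch"])

rows = [
    {"doc_id": "7203", "updated_at_epoch": 1700000100},
    {"doc_id": "6758", "updated_at_epoch": 1700000900},
    {"doc_id": "9984", "updated_at_epoch": 1699999000},  # older than checkpoint
]
to_sync = select_rows_to_sync(rows, last_sync_epoch=1700000000)
# -> only the two rows updated after the checkpoint, in epoch order
```

Filtering on the MySQL side before writing keeps Firestore write counts (and therefore costs) proportional to what actually changed.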

stockSite (Web UI / Next.js + Vercel)

This is the frontend that users actually see. It is built with Next.js and deployed on Vercel.

It fetches Firestore data using ISR (Incremental Static Regeneration) with a 5-minute cache interval. This keeps Firestore read costs down while displaying near-real-time information.


Data Flow — From News to Prediction Display

Here is the complete data flow of the Senrigan service, laid out chronologically:

1. News is published
   +-> meloik collects the news and stores it in MySQL

2. Prediction batch kicks off
   +-> Fetches company info, prices, financials, and macro indicators from the DB
   +-> Assembles the five data types into JSON
   +-> Sends the JSON to the fine-tuned LLM (OpenAI API)
   +-> Saves prediction results to MySQL

3. Translation batch kicks off
   +-> Translates news and prediction reasoning into English

4. Firestore sync
   +-> assetai_firebase exports incremental changes from MySQL to Firestore

5. Web display
   +-> stockSite fetches data from Firestore and renders it on screen

All of these batch processes are scheduled via crontab and run automatically during market hours on weekdays. No human intervention is required.


The Road to LLM Fine-Tuning

Now, here is the main topic. The “AI prediction” component I mentioned in the architecture overview — the process of building a fine-tuned LLM — is the central theme of this series.

To cut to the conclusion: I ultimately used OpenAI API’s fine-tuning feature to customize the gpt-4o-mini model. But getting there involved three major phases:

Phase 1: Local GPU (RTX 3060)
         -> Gave up due to insufficient VRAM

Phase 2: Fine-tuning on Google Colab
         -> Tried 3 models with much trial and error, could not achieve sufficient accuracy

Phase 3: OpenAI API Fine-tuning
         -> Training completed in about 8 minutes, adopted for production

Honestly, Phase 1 and Phase 2 are stories of “things that didn’t work out.” But it was precisely because of that trial and error that I gained a deep understanding of LLM fine-tuning, and I believe it enabled me to make the right decision in Phase 3.


The Models I Tried

Here is a list of all the models that appear throughout this series, encountered during the process of trial and error:

#  Model                       Parameters          Phase       Result
1  ELYZA Llama-3-JP-8B         8B (8 billion)      Phase 1, 2  Ran out of VRAM locally; training ran on Colab but accuracy was insufficient
2  llm-jp-3-7.2b-instruct3     7.2B (7.2 billion)  Phase 2     Implemented additional training pipeline but accuracy was insufficient
3  rinna/japanese-gpt2-medium  -                   Phase 2     Used for GGUF conversion practice
4  gpt-4o-mini                 -                   Phase 3     Adopted for production; stable JSON output and sufficient accuracy

I initially tried open-source models because I wanted to “have my own LLM running locally.” In the end, though, API-based fine-tuning turned out to be the practical solution for an independent developer.


Technologies and Methods Used

I used a variety of technologies throughout the fine-tuning process. Detailed explanations will come in later installments, but here is an overview:

Technology                  Overview                                                              Coverage in This Series
Quantization                Reducing the precision of model weights to save memory                Explained in Part 3
LoRA (Low-Rank Adaptation)  Fine-tuning by training only a small number of additional parameters  Explained in Part 3
SFTTrainer                  HuggingFace's supervised fine-tuning trainer                          Used in Part 4
GGUF conversion             Converting models for local inference                                 Used in Part 4
OpenAI Fine-tuning API      API-based fine-tuning                                                 Detailed in Part 5

Series Roadmap

This series is planned for a total of 8 installments. Here is a brief summary of each:

Part                   Title                                        Content
Part 1 (this article)  The Beginning and the Big Picture            Project motivation, Senrigan service overview, overall architecture
Part 2                 The Local GPU Challenge and Defeat           Taking on ELYZA 8B with an RTX 3060, and giving up due to VRAM limitations
Part 3                 LoRA and Quantization Explained              Illustrated explanation of the lightweight fine-tuning techniques
Part 4                 Stock Prediction on Colab                    Trial and error with 3 models, from the Mount Fuji experiment to real data
Part 5                 OpenAI API Fine-Tuning                       From the pivot in strategy to training completion in just 8 minutes
Part 6                 Training Data Design                         Integrating 5 data types, data cleaning, and creating ground-truth labels
Part 7                 Choosing a Translation LLM                   From DeepSeek to ChatGPT: how chasing low costs led to a painful lesson
Part 8                 MySQL to Firestore Migration and Production  The RDB-to-NoSQL challenges and cost optimization strategies

While the focus is on technical content, I also plan to share the decision-making process as an independent developer and the lessons learned from failures along the way.


The Context of Independent Development

There is one thing I want to emphasize.

This project is, through and through, independent development. I do not have access to abundant GPU clusters, nor do I have a team of data scientists. All I had was a laptop with an RTX 3060, Google Colab’s free tier, and some OpenAI API credits.

Within those constraints, “how to realistically leverage LLMs” is the consistent theme running through this entire series. Even without cutting-edge GPUs or large-scale infrastructure, with enough ingenuity, you can integrate LLMs into your own service. I would be happy if I can show you that path.

In the next installment, I will tell the story of my first challenge — attempting fine-tuning on a local GPU (RTX 3060) — and how it ended in spectacular defeat.


Next: Part 2 — “The Local GPU Challenge and Defeat: Taking on an 8B Model with 6 GB of VRAM on the RTX 3060”
