Time Series Are Not That Different for LLMs | by Henry Lai | Jul, 2024

6. Bundling all these takeaways creates an LTSM model (LTSM-Bundle) that outperforms all existing methods that re-program LLMs for time series, as well as transformer-based time series forecasting models.

Comparing the bundle with existing frameworks. Image by author.

Re-program an LTSM yourself!

Wanna try to re-program your own LTSM? Here is the tutorial for the LTSM-bundle: https://github.com/daochenzha/ltsm/blob/main/tutorial/README.md

Step 1: Create a virtual environment, then clone the repository and install it along with its requirements.

conda create -n ltsm python=3.8.0
conda activate ltsm
git clone https://github.com/daochenzha/ltsm.git
cd ltsm
pip3 install -e .
pip3 install -r requirements.txt

Step 2: Prepare your dataset. Make sure your local data folder looks like the following:

- ltsm/
    - datasets/
        DATA_1.csv
        DATA_2.csv
        DATA_3.csv
        ...
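Before generating prompts, it can help to confirm that the data folder really contains plain CSV files. The snippet below is a minimal sketch and not part of the LTSM repository: `list_dataset_csvs` is a hypothetical helper, and the demo builds a throwaway folder with placeholder file names so that it runs anywhere.

```python
import tempfile
from pathlib import Path


def list_dataset_csvs(data_dir):
    """Return the sorted names of regular .csv files directly inside data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_file() and p.suffix == ".csv")


# Demo on a throwaway folder (placeholder names mirroring the layout above).
with tempfile.TemporaryDirectory() as tmp:
    datasets = Path(tmp) / "datasets"
    datasets.mkdir()
    for name in ("DATA_1.csv", "DATA_2.csv", "DATA_3.csv"):
        (datasets / name).write_text("date,value\n")
    (datasets / "notes.txt").write_text("not a dataset\n")  # ignored by the check
    found = list_dataset_csvs(datasets)

print(found)
```

On a real checkout, you would call `list_dataset_csvs("datasets")` from the repository root instead of using the temporary folder.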

Step 3: Generate the time series prompts from the training, validation, and test datasets:

python3 prompt_generate_split.py

Step 4: Find the generated time series prompts in the ‘./prompt_data_split’ folder. Then run the following commands to finalize the prompts:

# normalizing the prompts
python3 prompt_normalization_split.py --mode fit

# export the prompts to the "./prompt_data_normalize_split" folder
python3 prompt_normalization_split.py --mode transform
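The two modes follow the usual fit-then-transform pattern: statistics are learned once, then applied to every split. The sketch below illustrates that pattern only. It assumes standardization (zero mean, unit variance per feature), which may differ from what `prompt_normalization_split.py` actually computes, and `fit_scaler` and `transform` are hypothetical names.

```python
from statistics import fmean, pstdev


def fit_scaler(train_rows):
    """Learn per-column mean and standard deviation from the training rows only."""
    columns = list(zip(*train_rows))
    means = [fmean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def transform(rows, means, stds):
    """Apply previously fitted statistics to any split (train, validation, or test)."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Toy prompt vectors (assumed data, two features per prompt).
train = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0]]
test = [[4.0, 10.0]]

means, stds = fit_scaler(train)            # analogous to "--mode fit"
train_norm = transform(train, means, stds)
test_norm = transform(test, means, stds)   # analogous to "--mode transform"
print(test_norm)
```

Fitting on the training split alone keeps information from the validation and test data out of the normalization statistics.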

Final Step: Train your own LTSM with Time Series Prompt and Linear Tokenization on gpt2-medium.

python3 main_ltsm.py \
--model LTSM \
--model_name_or_path gpt2-medium \
--train_epochs 500 \
--batch_size 10 \
--pred_len 96 \
--data_path "DATA_1.csv DATA_2.csv" \
--test_data_path_list "DATA_3.csv" \
--prompt_data_path "prompt_bank/prompt_data_normalize_split" \
--freeze 0 \
--learning_rate 1e-3 \
--downsample_rate 20 \
--output_dir [Your_Output_Path]
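For intuition about the "Linear Tokenization" named in this step, here is a rough pure-Python sketch of the general idea: a single linear layer maps chunks of raw values to embedding vectors. It is not the repository's implementation. Splitting the series into non-overlapping patches, the toy sizes, and the names `make_patches` and `linear_tokenize` are all assumptions made for illustration, and the weights are random here whereas a real model would learn them.

```python
import random


def make_patches(series, patch_len):
    """Split a series into consecutive, non-overlapping patches, dropping any remainder."""
    n = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n)]


def linear_tokenize(patches, weight, bias):
    """Map each patch (length P) to an embedding (length D) with one linear layer: W x + b."""
    return [
        [sum(w * x for w, x in zip(row, patch)) + b for row, b in zip(weight, bias)]
        for patch in patches
    ]


random.seed(0)
patch_len, embed_dim = 4, 3  # toy sizes chosen for readability
weight = [[random.uniform(-1, 1) for _ in range(patch_len)] for _ in range(embed_dim)]
bias = [0.0] * embed_dim

series = [float(t) for t in range(10)]
patches = make_patches(series, patch_len)  # 2 patches; the last 2 points are dropped
tokens = linear_tokenize(patches, weight, bias)
print(len(tokens), len(tokens[0]))
```

Each resulting vector plays the role of one input token for the language model backbone.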

Check out more details in our paper and GitHub repo:

Paper: https://arxiv.org/pdf/2406.14045
Code: https://github.com/daochenzha/ltsm/

