Examine the relationship between breaking news and exchange rates in Python.

I distrust almost everything about automated forex trading systems. A system built by a huge fund, or by a handful of individual traders who poured a lot of time into it, may well make money; I can’t say. But the other “automated forex trading systems” lying around on the internet are not something anyone can easily make money with. In other words, almost all of them are scams.

I also don’t believe in technical analysis, that is, reading stock or currency charts. I’m an amateur at investment and speculation, so I can’t speak with authority, but if “technical analysis is effective” were true, the chain would be: “it can be systemized” -> “automatic trading and compounding profits”. Is the world that naive? Some people say, “There are markets where technical analysis is not effective!” Others say, “Technical analysis is not always correct, it just increases the winning rate!” But if this is a matter of probability, you could systemize it, push the expected value above 1, and still “trade automatically and make a huge profit with compound interest”. I’m not convinced.

I am not a professional trader, so the above may be wrong, but that is not what I want to write about this time. As a programmer, I’m interested in seeing whether a system can help increase the expected value of a trade. And I’ll say this before I even start: I probably can’t create a system that automatically makes a profit. I’ll explain why later. Also, please note that what I’m about to try is something someone else probably did years ago.

Does economic news change the exchange rate?

As mentioned above, I am skeptical about technical analysis. Therefore, here is one hypothesis.
Hypothesis: After the release of economic information, the price will change to reflect the information.

I would like to assume that economic information, news, and breaking indicator releases are what move the exchange rate, though not as strictly as the efficient market hypothesis does. More precisely, it is the actual state, conditions, and policies of the world’s economies that should move the rate, but traders around the world only become aware of them once they are announced as news, and that is when the price moves. It would be great to recognize the real situation before it is published as news, but I can’t do that. It also seems impossible for a system to recognize and collect the actual state of the world economy.
So the first step is to investigate how rates have moved after economic news was released.

Use Rxxters information to get exchange rate statistics.

I have Rxxters' currency-related information (2016~2021). Yes, that famous Rxxters. Okay? I’ll say it again: it’s Rxxters information. Rxxters seems to be among the fastest news sources in the world, which makes it a good fit for this verification. In total there are about 25,000 items of forex-related economic information, news, and breaking news, covering about five years. I’m focusing on the last five years because the current behavior of exchange rates is probably very different from that of 20 years ago.

Next, get 1-minute data for USD/JPY, EUR/JPY, and GBP/JPY. You can easily find it in CSV format by Googling. The data looks like this.

USDJPY,20190212,221000,110.46,110.46,110.46,110.46,4
USDJPY,20190212,221100,110.46,110.46,110.46,110.46,4
USDJPY,20190212,221200,110.46,110.46,110.46,110.46,4
USDJPY,20190212,221300,110.46,110.46,110.46,110.46,4
USDJPY,20190212,221400,110.46,110.46,110.46,110.46,4
USDJPY,20190212,221500,110.46,110.46,110.46,110.46,4

These rows are the USD/JPY rate for each minute from 22:10 to 22:15 on February 12, 2019. I prepared this data from 2016 to around the end of 2021, which comes to about 2.2 million lines of CSV, and I prepared one such file for each of the three currency pairs.
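As a minimal sketch, one such row can be parsed as follows. The file has no header, so the column order used here (ticker, date, time, open, high, low, close, volume) is an assumption, and `parse_row` is a name made up for this example.

```python
from datetime import datetime

def parse_row(line):
    """Parse one 1-minute row. The column order is an assumption."""
    ticker, date, time, open_, high, low, close, volume = line.strip().split(",")
    return {
        "ticker": ticker,
        # Date (YYYYMMDD) and time (HHMMSS) are joined into one timestamp.
        "time": datetime.strptime(date + time, "%Y%m%d%H%M%S"),
        "open": float(open_),
        "high": float(high),
        "low": float(low),
        "close": float(close),
        "volume": int(volume),
    }

bar = parse_row("USDJPY,20190212,221000,110.46,110.46,110.46,110.46,4")
print(bar["time"], bar["open"])
```

The scripts in this article only use the date, the time, and the fourth column, so the other fields are parsed here just for completeness.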

The next step is to put the Rxxters information into SQLite for easy handling. I’ll also split the text into space-separated words for later use in machine learning (deep learning).

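import mojimoji
import MeCab
import sqlite3

# Create the tagger once; building it on every call is slow.
mecab = MeCab.Tagger('-d /usr/lib/x86_64-linux-gnu/mecab/dic/mecab-ipadic-neologd')
mecab.parse('')  # kept from the original: an empty parse before the real one

def modify_text(text):
    '''
    Convert to full-width text, remove line breaks and spaces,
    and use MeCab to split the text into space-separated words.
    '''
    text = mojimoji.han_to_zen(text)
    text = text.strip()
    # Remove line breaks, tabs, full-width spaces, and half-width spaces.
    for ch in ('\n', '\r', '\t', '　', ' '):
        text = text.replace(ch, '')
    ret = []
    for row in mecab.parse(text).split("\n"):
        word = row.split("\t")[0]
        # Skip the end-of-sentence marker and the empty last line.
        if word == "EOS" or word == "":
            continue
        ret.append(word)
    return " ".join(ret)

if __name__ == '__main__':
    # news_list is not defined in this snippet. It is assumed to be a list of
    # dicts with "title", "body", and "datetime" keys, loaded beforehand.
    dbname = 'fx.db'
    conn = sqlite3.connect(dbname)
    cur = conn.cursor()
    # Assumed schema, inferred from the SELECT used later in this article.
    cur.execute('CREATE TABLE IF NOT EXISTS corpus('
                'id INTEGER PRIMARY KEY AUTOINCREMENT, title TEXT, body TEXT, date TEXT)')
    for news in news_list:
        title = modify_text(news["title"])
        body = modify_text(news["body"])
        # Placeholders keep the SQL valid even when the text contains quotes.
        cur.execute('INSERT INTO corpus(title, body, date) VALUES (?, ?, ?)',
                    (title, body, news["datetime"]))
    conn.commit()
    cur.close()
    conn.close()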

Define rate fluctuation patterns

Now that we have five years of Rxxters information and the exchange rate CSVs, we need to define the rate fluctuation patterns, as follows.

Pattern 0: Rate does not move by more than 0.4% in either direction.

Pattern 1: Rate rises by more than 0.4%.

Pattern 2: Rate falls by more than 0.4%.

Pattern 3: Rate both rises and falls by more than 0.4%.

The word “up” here means that the yen has weakened, since these are yen exchange rates. The percentage is measured against the rate at the start. The reason for 0.4% is that leverage in Japanese FX is limited to 25 times, so a 0.4% move in the rate is a 10% change in the amount invested, which is quite large. (You can change it later.)
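The arithmetic behind the threshold can be written out as a small sketch. It assumes the position is held at the full 25x leverage and ignores spreads and fees; `margin_change` is a name made up for this example.

```python
LEVERAGE = 25      # maximum leverage mentioned in the text
THRESHOLD = 0.004  # a 0.4% move in the rate

def margin_change(rate_move, leverage=LEVERAGE):
    """Fractional gain or loss on the amount invested for a fractional rate move."""
    return rate_move * leverage

# A 0.4% move at 25x leverage, printed as a percentage of the amount invested.
print(f"{margin_change(THRESHOLD):.0%}")
```

Changing `THRESHOLD` or `LEVERAGE` shows how a different cutoff would translate into gains or losses on the margin.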
Next, we need to assume how long it takes for the information to be reflected in the rate after it is released. In this case, let’s assume the following.

Reflection time: within 3 hours of information release

The assumption is that the information will be fully incorporated within three hours of its release. Of course, the price may reflect the information in far less than three hours, and it is also quite common for a trend to continue for more than three hours. However, we have to draw the line somewhere, so we’ll assume 3 hours.

To sum up, the question is: “Which pattern (0 to 3) will occur within 3 hours of the information being released?” And if you can figure out which pattern it will be at the moment the information is released, well… I’ll be a billionaire?

Let’s look at the data from 2016 to 2021.

Count how often each pattern follows an information release with the following code.

import sqlite3
import csv

def get_csv_data():
    # Read every 1-minute row of the USD/JPY file into a list.
    with open('USDJPY.txt') as f:
        return list(csv.reader(f))

def get_db_data():
    dbname = 'fx.db'
    conn = sqlite3.connect(dbname)
    cur = conn.cursor()
    cur.execute('SELECT id, title, body, date FROM corpus ORDER BY datetime(date)')
    ret = cur.fetchall()
    cur.close()
    conn.close()
    dic = {}
    for d in ret:
        # Strip separators and zero the seconds so the key matches the CSV
        # date + time columns, e.g. "20190212221000".
        t = d[3].replace(" ", "").replace("-", "").replace(":", "")[:-2] + "00"
        dic[t] = d[1] + ' ' + d[2]
    return dic

def get_type(rows, i):
    start = float(rows[i][3])
    # Values for the next 3 hours (180 one-minute rows, starting at row i).
    values = [float(r[3]) for r in rows[i:i + 180]]
    max_val = max(values)
    min_val = min(values)
    # Whether or not there was a change of more than 0.4%.
    rate = 0.004
    if max_val > (start * (1 + rate)) and min_val < (start * (1 - rate)):
        return 3
    elif max_val > (start * (1 + rate)):
        return 1
    elif min_val < (start * (1 - rate)):
        return 2
    else:
        return 0

if __name__ == '__main__':
    data = get_db_data()
    rows = get_csv_data()  # not named csv, which would shadow the csv module
    types = {}
    for i, c in enumerate(rows):
        # Only rows whose minute matches a news release are classified.
        db_data = data.get(c[1] + c[2])
        if db_data is None:
            continue
        t = str(get_type(rows, i))
        types[t] = types.get(t, 0) + 1
    print(types)

The source code is pretty messy, but it is disposable code, so please forgive me.
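The step that ties a news item to a CSV row is the key built from the timestamp. Here is a minimal sketch of that matching, assuming the database stores dates as "YYYY-MM-DD HH:MM:SS" (inferred from the string replacements in get_db_data, not confirmed); `news_key` and `csv_key` are names made up for this example.

```python
from datetime import datetime

def news_key(db_date):
    """Round a 'YYYY-MM-DD HH:MM:SS' timestamp down to the minute, in CSV style."""
    ts = datetime.strptime(db_date, "%Y-%m-%d %H:%M:%S")
    return ts.strftime("%Y%m%d%H%M") + "00"

def csv_key(row):
    """Join the date and time columns of a CSV row."""
    return row[1] + row[2]

row = "USDJPY,20190212,221000,110.46,110.46,110.46,110.46,4".split(",")
# A news item released at 22:10:35 maps to the 22:10 row.
print(news_key("2019-02-12 22:10:35") == csv_key(row))
```

One consequence of this keying is that news items released in the same minute share a key, so the dictionary in the script keeps only the last of them.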

Output of USD/JPY

{'0': 24822, '1': 567, '2': 615, '3': 9}

Output of EUR/JPY

{'0': 24050, '1': 753, '2': 767, '3': 15}

Output of GBP/JPY

{'0': 23019, '1': 1357, '2': 1562, '3': 49}

As expected, volatility ranks pound > euro > dollar. It also turns out that most of the information falls into the “no change” category. Of course, the Rxxters information prepared here is not filtered per currency pair; it includes everything, so it is easy to imagine that the market does not react at all to the majority of it.
Also, perhaps because 2016~2021 was a period with no major swings, the counts for “rise of more than 0.4%” and “fall of more than 0.4%” are almost equal. It’s as if the rate moves on a coin toss.
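The coin toss impression can be checked roughly with a z-score of the up count against a fair coin. This is only a sketch: it uses the pattern 1 and pattern 2 counts printed above, ignores patterns 0 and 3, and `coin_toss_z` is a name made up for this example.

```python
import math

def coin_toss_z(ups, downs):
    """z-score of the up count under a fair-coin (p = 0.5) assumption."""
    n = ups + downs
    return (ups - n / 2) / math.sqrt(n * 0.25)

# (pattern 1 count, pattern 2 count) taken from the outputs above.
counts = {
    "USD/JPY": (567, 615),
    "EUR/JPY": (753, 767),
    "GBP/JPY": (1357, 1562),
}
for pair, (ups, downs) in counts.items():
    print(pair, round(coin_toss_z(ups, downs), 2))
```

An absolute value below roughly 2 is what a fair coin would typically produce, and a larger one suggests the split for that pair is less even than it looks. News items released close together share much of the same 3-hour window, so the counts are not independent and this check should be read loosely.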

Does insider information work in forex?

Using this data set, we can also see how the pattern counts would change if we acted one hour before every news release.

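# rows is the list returned by get_csv_data(), data the dict from get_db_data()
for i, c in enumerate(rows):
    db_data = data.get(c[1] + c[2])
    if db_data is None: continue
    if i < 60: continue  # no row exists an hour before the start of the data
    t = get_type(rows, i - 60)  # Know the information an hour in advance and hold the position.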

Output of USD/JPY

{'0': 24781, '1': 546, '2': 682, '3': 4}

Output of EUR/JPY

{'0': 24069, '1': 721, '2': 777, '3': 18}

Output of GBP/JPY

{'0': 23074, '1': 1353, '2': 1509, '3': 51}

Contrary to expectations, acting an hour ahead of time doesn’t change the distribution significantly. Does this mean that the impact of breaking economic news on rate volatility is quite small? If so, even getting the information an hour earlier than Rxxters might not change the outcome much. But the data obviously includes a majority of news that the market doesn’t react to, so we can’t really tell.
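To compare the two runs on one scale, the counts can be turned into the share of news items followed by a move of more than 0.4%. This is a small sketch using only the outputs printed in this article; `moved_share` is a name made up for this example.

```python
# Counts copied from the outputs above: at release time and one hour earlier.
on_release = {
    "USD/JPY": {"0": 24822, "1": 567, "2": 615, "3": 9},
    "EUR/JPY": {"0": 24050, "1": 753, "2": 767, "3": 15},
    "GBP/JPY": {"0": 23019, "1": 1357, "2": 1562, "3": 49},
}
hour_before = {
    "USD/JPY": {"0": 24781, "1": 546, "2": 682, "3": 4},
    "EUR/JPY": {"0": 24069, "1": 721, "2": 777, "3": 18},
    "GBP/JPY": {"0": 23074, "1": 1353, "2": 1509, "3": 51},
}

def moved_share(counts):
    """Share of news items followed by a move of more than 0.4% (patterns 1 to 3)."""
    total = sum(counts.values())
    return (total - counts["0"]) / total

for pair in on_release:
    print(pair,
          f"{moved_share(on_release[pair]):.2%}",
          f"{moved_share(hour_before[pair]):.2%}")
```

For each pair the two shares come out within a fraction of a percentage point of each other, which is the same picture the raw counts give.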

AI to analyze news and trade forex

So what if AI (though I don’t like calling machine learning AI) could analyze the information and take positions mechanically? By AI, I simply mean something that reads the news and classifies it into one of the four patterns above by comparison with past data.
In recent years, it has become possible to classify sentences with BERT and to reply automatically to inquiries and chats. The system computes features from the text, classifies the inquiry based on those features, and sends the reply text prepared in advance for each category.
In actual trading, the idea is to get information from Rxxters in real time, run it through machine learning to classify it into patterns 0 to 3, and take a position if the pattern is 1 or 2. For the rest of the article, see Implementing with Machine Learning.
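The decision rule described above can be sketched as follows. The classifier itself is not shown here, so `classify` is a hypothetical callable that returns a pattern from 0 to 3. The direction of each trade (buy on pattern 1, sell on pattern 2) is my reading of the pattern definitions, not something the data has confirmed.

```python
def decide_position(pattern):
    """Map a predicted pattern (0 to 3) to an action on the currency pair."""
    if pattern == 1:
        return "buy"   # rate expected to rise by more than 0.4%
    if pattern == 2:
        return "sell"  # rate expected to fall by more than 0.4%
    return None        # patterns 0 and 3: stay out

def handle_news(text, classify):
    """classify is a hypothetical callable: news text -> pattern 0 to 3."""
    return decide_position(classify(text))

# Stand-in classifier for illustration only; it always predicts pattern 0.
print(handle_news("dummy headline", lambda text: 0))
```

Keeping the decision separate from the classifier means any model can be plugged in later without touching the trading rule.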