Bitcoin/BTC 4750%+, Ethereum/ETH 11,270%+ profit in 1023 days using Neural Networks, Algorithmic Trading Vs/+ Machine Learning Models Vs/+ Deep Learning Model Part 4 (TCN, LSTM, Transformer with Ensemble Method)

Puranam Pradeep Picasso - ImbueDesk Profile

42 min read · Mar 12, 2024

Unleashing the power of Neural Networks for creating Trading Bot for maximum profits.

Introduction:

In this article, we delve deeper into the realm of algorithmic trading and machine learning by exploring the effectiveness of neural networks, specifically TCN (Temporal Convolutional Network), LSTM (Long Short-Term Memory), Transformer, and Ensemble Techniques. Building upon the successes of our previous endeavors, we aim to achieve remarkable profits with Bitcoin/BTC and Ethereum/ETH, showcasing the power of these advanced models in financial markets.

We used Bitcoin prices in USDT on a 15-minute candle time frame from January 1st 2021 to October 22nd 2023: a total of 1022 days of data, with more than 97,000 rows and 190+ features, to predict long and short positions with a neural network model.

Our story is one of relentless innovation, fueled by a burning desire to unlock the full potential of Deep Learning in the pursuit of profit. In this article, we invite you to join us as we unravel the exciting tale of our transformation from humble beginnings to groundbreaking success.

Source — Google Search, Neural Networks using LSTM for time series data with 1 output

Our Algorithmic Trading Vs/+ Machine Learning Vs/+ Deep Learning Journey so far?

Stage 1:

We developed a crypto algorithmic strategy that gave us large profits when run on multiple crypto assets (138+), with a profit of 8787%+ over a span of almost 3 years.

“The 8787%+ ROI Algo Strategy Unveiled for Crypto Futures! Revolutionized With Famous RSI, MACD, Bollinger Bands, ADX, EMA” — Link

We ran the same strategy live in dry-run mode for 7 days; the details are shared in another article.

“Freqtrade Revealed: 7-Day Journey in Algorithmic Trading for Crypto Futures Market” — Link

After successful backtest results and forward testing (live trading in dry-run mode), we planned to improve the odds of making more profit with the same strategy (lowering stop-losses, increasing the odds of winning, reducing the risk factor, and other important things).

Stage 2:

We worked on developing a strategy on its own, without the freqtrade setup (avoiding the trailing stop loss, parallel running of multiple assets, and higher risk-management setups that freqtrade provides for free, as it is a free open-source platform). We then tested it in the market, optimized it using hyperparameters, and got some positive profits from the strategy.

“How I achieved 3000+% Profit in Backtesting for Various Algorithmic Trading Bots and how you can do the same for your Trading Strategies — Using Python Code” — Link

Stage 3:

As we had tested our strategy on only 1 asset, i.e. BTC/USDT in the crypto market, we wanted to know whether we could segregate the whole collection of assets (which we used earlier for developing the Freqtrade strategy) into different clusters based on their volatility. Trading then becomes easier for certain volatile assets, and we avoid hitting huge stop-losses on others, when the implementation is based on coin volatility.

We used K-nearest Neighbors (KNN Means) to identify different clusters of assets out of 138 crypto assets we use in our freqtrade strategy, which gave us 8000+% profits during backtest.

“Hyper Optimized Algorithmic Strategy Vs/+ Machine Learning Models Part -1 (K-Nearest Neighbors)” — Link

Stage 4:

Next, we introduced an unsupervised machine learning model, the Hidden Markov Model (HMM), to identify trends in the market, trade only during profitable trends, and avoid sudden pumps, dumps, and negative trends in the market. The article below explains this.

“Hyper Optimized Algorithmic Strategy Vs/+ Machine Learning Models Part -2 (Hidden Markov Model — HMM)” — Link

Stage 5:

I worked on using XGBoost Classifier to identify long and short trades using our old signal. Before using it, we ensured that the signal algorithm we had previously developed was hyper-optimized. Additionally, we introduced different stop-loss and take-profit parameters for this setup, causing the target values to change accordingly. We also adjusted the parameters used for obtaining profitable trades based on the stop-loss and take-profit values. Later, we tested the basic XGBClassifier setup and then enhanced the results by adding re-sampling methods. Our target classes, which include 0’s (neutral), 1’s (for long trades), and 2’s (for short trades), were imbalanced due to the trade execution timing. To address this imbalance, we employed re-sampling methods and performed hyper-optimization of the classifier model. Subsequently, we evaluated if the model performed better with other classifier models such as SVC, CatBoost, and LightGBM, in combination with LSTM and XGBoost. Finally, we concluded by analyzing the results and determining feature importance parameters to identify the most productive features.

“Hyper Optimized Algorithmic Strategy Vs/+ Machine Learning Models Part -3 (XGBoost Classifier , LGBM Classifier, CatBoost Classifier, SVC, LSTM with XGB and Multi level Hyper-optimization)” — Link

Stage 6:

In that stage, I utilized the CatBoostClassifier along with resampling and sample weights. I incorporated multiple time frame indicators such as volume, momentum, trend, and volatility into my model. After running the model, I performed ensembling techniques to enhance its overall performance. The results of my analysis showed a significant increase in profit from 54% to over 4600% during backtesting. Additionally, I highlighted the impressive performance metrics including recall, precision, accuracy, and F1 score, all exceeding 80% for each of the three trading classes (0 for neutral, 1 for long, and 2 for short trades).

“From 54% to a Staggering 4648%: Catapulting Cryptocurrency Trading with CatBoost Classifier, Machine Learning Model at Its Best” — Link

get entire code and profitable algos @ https://patreon.com/pppicasso

The code Explanation:

Data pre-processing and feature engineering are the same as in my previous article. I provide the article link below so you can follow the steps; I don’t want to increase the length of this article by repeating the same material. Sorry for any inconvenience caused.

Link for Pre-Processing and Feature Engineering (it is as same as mentioned in this article) — Link

Scaling and splitting the dataframe for training and testing:

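import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

scaler = MinMaxScaler(feature_range=(0,1))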

df_model = df.copy()
# Split into Learning (X) and Target (y) Data
X = df_model.iloc[:, : -1]
y = df_model.iloc[:, -1]

X_scaled = scaler.fit_transform(X)

# Define a function to reshape the data
def reshape_data(data, time_steps):
    samples = len(data) - time_steps + 1
    reshaped_data = np.zeros((samples, time_steps, data.shape[1]))
    for i in range(samples):
        reshaped_data[i] = data[i:i + time_steps]
    return reshaped_data

# Reshape the scaled X data
time_steps = 1  # Adjust the number of time steps as needed
X_reshaped = reshape_data(X_scaled, time_steps)

# Now X_reshaped has the desired three-dimensional shape: (samples, time_steps, features)
# Each sample contains scaled data for a specific time window

# Align y with X_reshaped by discarding excess target values
y_aligned = y[time_steps - 1:]  # Discard the first (time_steps - 1) target values

X = X_reshaped
y = y_aligned

print(len(X),len(y))

# Check if X_reshaped and y have the same length
if len(X) == len(y):
    print("X_reshaped and y have the same length.")
else:
    print("X_reshaped and y have different lengths.")

# Split data into train and test sets (considering time series data)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, shuffle=False)

In the provided code, the MinMaxScaler from scikit-learn is used to scale the input features to a specified range, typically between 0 and 1. This is done to ensure that all features have a similar scale, which can help improve the performance of machine learning algorithms.
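As a small illustration of what the scaler computes, the sketch below applies the min-max formula by hand with NumPy. It also shows one possible variation, which is an assumption and not what the code above does: taking the minimum and maximum from the training rows only, so that the test period does not influence the scaling. The array values, the helper names (min_max_fit, min_max_transform) and the cut after three rows are made up for illustration.

```python
import numpy as np

# Toy feature matrix: 5 rows (time steps) x 2 features; values are made up.
X_toy = np.array([[10.0, 1.0],
                  [20.0, 3.0],
                  [30.0, 5.0],
                  [40.0, 2.0],
                  [50.0, 4.0]])

def min_max_fit(data):
    """Return per-column minimum and range, the statistics min-max scaling needs."""
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero for constant columns
    return col_min, col_range

def min_max_transform(data, col_min, col_range):
    """Apply (x - min) / (max - min) column by column."""
    return (data - col_min) / col_range

# Variant 1: statistics from the whole dataset (what fit_transform(X) does).
full_scaled = min_max_transform(X_toy, *min_max_fit(X_toy))

# Variant 2 (assumption): statistics from the first 3 rows only, reused on the rest.
split = 3
train_min, train_range = min_max_fit(X_toy[:split])
train_scaled = min_max_transform(X_toy[:split], train_min, train_range)
test_scaled = min_max_transform(X_toy[split:], train_min, train_range)

print(full_scaled[:, 0])
print(test_scaled[:, 0])
```

In the second variant, test values can fall outside the 0 to 1 range, because the later rows contain values the training rows never saw.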

A copy of the original DataFrame df is created, named df_model, to avoid modifying the original data. The DataFrame is then split into two parts: the input features (X) and the target variable (y).

Next, the input features (X) are scaled using the MinMaxScaler, and the scaled data is stored in X_scaled.

A function reshape_data is defined to reshape the input data into a three-dimensional array, suitable for training recurrent neural networks (RNNs) or other models that require input in the form of sequences. This function takes the scaled data and a parameter time_steps, which determines the number of time steps used to create each sample.

The scaled input data (X_scaled) is reshaped using the reshape_data function to create X_reshaped, which has a three-dimensional shape (samples, time_steps, features).

To align the target variable (y) with the reshaped input data (X_reshaped), excess target values corresponding to the discarded time steps are removed. The aligned target variable is stored in y_aligned.

The length of X_reshaped and y_aligned is checked to ensure they have the same length, which is necessary for further processing.

Finally, the data is split into training and testing sets using the train_test_split function from scikit-learn. Since the data is time series data, it is split without shuffling to maintain the temporal order.
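To make the reshaping step more concrete, here is a minimal sketch on toy data with time_steps = 3 (the article itself uses time_steps = 1). The toy arrays and the simple 70% cut are assumptions for illustration; the cut only approximates what train_test_split does with shuffle=False.

```python
import numpy as np

def reshape_data(data, time_steps):
    """Stack overlapping windows of `time_steps` consecutive rows."""
    samples = len(data) - time_steps + 1
    out = np.zeros((samples, time_steps, data.shape[1]))
    for i in range(samples):
        out[i] = data[i:i + time_steps]
    return out

# Toy data: 6 rows, 2 features, plus one label per row (values are made up).
X_toy = np.arange(12, dtype=float).reshape(6, 2)
y_toy = np.array([0, 1, 2, 0, 1, 2])

time_steps = 3
X_win = reshape_data(X_toy, time_steps)
y_win = y_toy[time_steps - 1:]   # label belonging to the last row of each window

print(X_win.shape, y_win.shape)  # (4, 3, 2) (4,)

# Chronological cut: earlier windows for training, later windows for testing.
cut = int(len(X_win) * 0.7)
X_tr, X_te = X_win[:cut], X_win[cut:]
y_tr, y_te = y_win[:cut], y_win[cut:]
```

Each window ends at the row whose label it carries, so the model only sees the current and earlier candles when predicting that label.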

Transformer model for Neural Networks with manual optimization:

# Only the imports this cell actually uses are kept
from keras.layers import Input, Dense, Dropout
from keras.models import Model
from keras.optimizers import Adam
from keras.utils import to_categorical
from tensorflow.keras.layers import MultiHeadAttention

class_weights = {0: 1, 1: 3, 2: 3}  # Adjust weights as needed

# Define Transformer-based model with multiple hidden layers
def build_transformer_model(input_shape, units=193, dropout=0.2, lr=0.0001):
    inputs = Input(shape=input_shape)
    attention = MultiHeadAttention(num_heads=8, key_dim=64)(inputs, inputs)
    hidden = Dense(units, activation='relu')(attention)
    dropout_layer = Dropout(dropout)(hidden)
    
    # First hidden layer
    dense_layer_1 = Dense(units=96, activation='relu')(dropout_layer)
    dropout_layer_1 = Dropout(dropout)(dense_layer_1)
    
    # Second hidden layer
    dense_layer_2 = Dense(units=96, activation='relu')(dropout_layer_1)
    dropout_layer_2 = Dropout(dropout)(dense_layer_2)
    
    # Third hidden layer
    dense_layer_3 = Dense(units=48, activation='relu')(dropout_layer_2)
    dropout_layer_3 = Dropout(dropout)(dense_layer_3)
    
    # Output layer
    outputs = Dense(3, activation='softmax')(dropout_layer_3)
    
    model = Model(inputs=inputs, outputs=outputs)
    optimizer = Adam(learning_rate=lr)
    model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# y_train already holds integer class labels (0, 1, 2), which is what
# 'sparse_categorical_crossentropy' expects, so no one-hot encoding is needed

# Instantiate the model
model_transformer = build_transformer_model(input_shape=(X_train.shape[1], X_train.shape[2]))

# Fit the model to the training data
model_transformer.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.2, verbose=1, class_weight=class_weights)

In the provided code, a Transformer-based model is implemented using Keras to classify each Bitcoin candle as a long, short, or neutral position. Here’s a breakdown of the code:

Model Architecture: The model architecture consists of multiple hidden layers with dense connections. At the beginning of the model, a MultiHeadAttention layer is used to capture long-range dependencies in the input time series data. This layer helps the model attend to different parts of the input sequence simultaneously, which can be beneficial for capturing complex patterns in time series data.
Dense Layers: After the MultiHeadAttention layer, several dense layers with ReLU activation functions are added. ReLU (Rectified Linear Unit) activation is chosen because it introduces non-linearity to the model and helps mitigate the vanishing gradient problem. The dense layers allow the model to learn higher-level representations of the input features extracted by the attention mechanism.
Dropout Layers: Dropout layers are added after each dense layer to prevent overfitting by randomly dropping a fraction of input units during training. This regularization technique helps improve the generalization performance of the model.
Output Layer: The final layer of the model is a dense layer with a softmax activation function, which produces probability distributions over the three classes (long, short, neutral). This layer enables the model to output probabilities for each class, facilitating classification.
Training: The model is trained using the Adam optimizer with a specified learning rate (lr) and a sparse categorical cross-entropy loss function. Additionally, class weights are incorporated to address class imbalance in the training data.
Alternative Approaches:

Alternative Attention Mechanisms: Besides MultiHeadAttention, other attention mechanisms such as Self-Attention or SeqSelfAttention could be used.
Activation Functions: Apart from ReLU, other activation functions like Leaky ReLU, ELU (Exponential Linear Unit), or SELU (Scaled Exponential Linear Unit) could be explored based on the specific characteristics of the data and the model’s performance.
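To give a feel for what an attention layer computes, here is a simplified sketch of single-head scaled dot-product self-attention in NumPy. It is an illustration under stated assumptions, not the internals of Keras's MultiHeadAttention: there are no learned projection weights and only one head, and the inputs are random numbers. It also shows that with a single time step, as with time_steps = 1 in this article, the attention weight is trivially 1, so longer windows are what let the layer mix information across candles.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(seq):
    """Single-head scaled dot-product self-attention on a (steps, features) array.
    Simplification: queries, keys and values are the raw inputs (no learned weights)."""
    d = seq.shape[-1]
    scores = seq @ seq.T / np.sqrt(d)          # (steps, steps) similarity matrix
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ seq, weights

rng = np.random.default_rng(0)

# One time step: the only key is the query itself, so its weight is exactly 1.
one_step = rng.normal(size=(1, 4))
out1, w1 = self_attention(one_step)

# Four time steps: each step can now blend information from the others.
four_steps = rng.normal(size=(4, 4))
out4, w4 = self_attention(four_steps)

print(w1)          # [[1.]]
print(w4.shape)    # (4, 4)
```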

The model.compile function is used to configure the training process for a neural network model before it's trained. Let's break down the arguments passed to model.compile:

Optimizer: The optimizer argument specifies the optimization algorithm to be used during training. In this case, the Adam optimizer is passed as the optimizer. Adam is a popular optimization algorithm that adapts the learning rate during training, making it well-suited for a wide range of deep learning tasks. The optimizer variable is typically initialized earlier in the code with specific parameters, such as the learning rate.
Loss Function: The loss argument defines the loss function that the optimizer will minimize during training. Here, 'sparse_categorical_crossentropy' is specified as the loss function. This loss function is commonly used for multi-class classification problems where the target labels are integers (e.g., 0, 1, 2) and the model predicts class probabilities. The categorical cross-entropy loss measures the discrepancy between the true class labels and the predicted probabilities, penalizing incorrect predictions.
Metrics: The metrics argument specifies the evaluation metrics to be computed during training and validation. In this case, ['accuracy'] is passed as the metric. Accuracy is a commonly used metric for classification tasks, which measures the proportion of correctly classified samples out of the total number of samples. During training, the accuracy metric will be computed and displayed to monitor the model's performance.

The model.compile function sets up the neural network model for training by specifying the optimizer, loss function, and evaluation metric(s) to be used during the training process. These choices are crucial for guiding the optimization process and assessing the performance of the model during training and evaluation.
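As a worked example of the loss described above, the sketch below computes softmax probabilities and the sparse categorical cross-entropy for a single sample in plain Python, then scales the loss with the same class weights used in this article. The raw scores are made up, and the helper functions are written here for illustration rather than taken from Keras.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(true_class, probs):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probs[true_class])

class_weights = {0: 1, 1: 3, 2: 3}   # same weights as in the article

# Made-up raw scores for one candle; classes are 0 = neutral, 1 = long, 2 = short.
probs = softmax([2.0, 1.0, 0.1])

for true_class in (0, 1, 2):
    loss = sparse_categorical_crossentropy(true_class, probs)
    weighted = class_weights[true_class] * loss   # class weight scales the sample's loss
    print(true_class, round(loss, 3), round(weighted, 3))
```

Because long and short samples carry a weight of 3, a mistake on those classes costs three times as much as the same mistake on a neutral sample, which pushes the model to pay more attention to the rarer trade signals.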

Overall, the Transformer-based model with attention mechanisms and dense layers offers a powerful approach for modeling temporal dependencies in time series data, which can be particularly useful for predicting Bitcoin price movements and making trading decisions.

after 20 epochs of data being optimized based on accuracy metrics

Plotting Confusion-matrix and classification report for the transformer neural network model:

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report


# Perform prediction on the original shape data
y_pred = model_transformer.predict(X_test)


# The model outputs class probabilities with shape (samples, time_steps, 3);
# argmax over the last axis gives the predicted class, and ravel() flattens
# the result to one label per sample (valid here because time_steps = 1)

y_pred_classes = np.argmax(y_pred, axis=2).ravel()

# y_test already holds integer class labels, so it is used as-is
y_test_classes = y_test

# Plot confusion matrix for test data
conf_matrix_test = confusion_matrix(y_test_classes, y_pred_classes)

# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix_test, annot=True, cmap='Blues', fmt='g', cbar=False)
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.title('Confusion Matrix - Test Data')
plt.show()

# Generate classification report for test data
class_report = classification_report(y_test, y_pred_classes)

# Print classification report
print("Classification Report - Test Data:\n", class_report)

In the provided code snippet, the model’s performance is evaluated and visualized using a confusion matrix and a classification report. Here’s a breakdown of the steps:

Perform Prediction: The trained model (model_transformer) is used to predict the classes for the test data (X_test). These predictions are stored in y_pred.
Post-processing Predictions: If the model outputs probabilities for each class, the argmax function is applied to convert these probabilities into class labels. This step ensures that each prediction corresponds to a specific class.
Convert Target Labels: The true class labels for the test data (y_test) are already integer class labels rather than one-hot vectors, so they are simply assigned to y_test_classes. This keeps the true labels in the same format as the predicted labels for comparison.
Plot Confusion Matrix: A confusion matrix is computed based on the true labels (y_test_classes) and the predicted labels (y_pred_classes). The confusion matrix provides insights into the performance of the classifier by visualizing the counts of true positive, false positive, true negative, and false negative predictions for each class.
Visualize Confusion Matrix: The confusion matrix is plotted using matplotlib and seaborn libraries to create a heatmap. Each cell in the heatmap represents the count of predictions for a specific combination of true and predicted class labels.
Generate Classification Report: A classification report is generated using the classification_report function from sklearn.metrics. This report provides comprehensive metrics such as precision, recall, F1-score, and support for each class, along with the overall accuracy.
Print Classification Report: The classification report for the test data is printed to the console, summarizing the model’s performance across different metrics for each class.

Overall, these steps allow for a thorough evaluation of the model’s performance on the test data, providing insights into its strengths and weaknesses in classifying different classes.
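The following sketch shows, in plain Python, how a confusion matrix and the per-class precision and recall in a classification report are derived. The labels are made up, and the helper functions are written here for illustration; they follow the same layout as scikit-learn (rows are true classes, columns are predicted classes) but are not its implementation.

```python
def confusion_matrix(y_true, y_pred, n_classes=3):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

def precision_recall(matrix, cls):
    """Precision = TP / all predicted as cls; recall = TP / all truly cls."""
    tp = matrix[cls][cls]
    predicted = sum(row[cls] for row in matrix)   # column total
    actual = sum(matrix[cls])                     # row total
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall

# Made-up labels: 0 = neutral, 1 = long, 2 = short.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 0, 1, 0, 2, 2]

cm = confusion_matrix(y_true, y_pred)
accuracy = sum(cm[i][i] for i in range(3)) / len(y_true)

print(cm)        # [[3, 1, 0], [1, 1, 0], [0, 0, 2]]
print(accuracy)  # 0.75
for cls in range(3):
    print(cls, precision_recall(cm, cls))
```

For a trading model, precision on the long and short classes matters most, since every false long or short signal becomes a real trade.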

confusion matrix and classification report for transformer neural network model

Backtest with test data for the transformer neural network model:

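import pandas as pd
from backtesting import Backtest, Strategy  # backtesting.py package

df_ens_test = df.copy() 

# Keep only the test-period rows; .copy() avoids pandas' SettingWithCopyWarning
df_ens = df_ens_test[len(X_train):].copy()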

df_ens['transformer_neural_scaled'] =  np.argmax(model_transformer.predict(X_test), axis=2)

df_ens['trns'] = df_ens['transformer_neural_scaled'].shift(1).dropna().astype(int)

df_ens = df_ens.dropna()

df_ens['trns']

# df_ens = df.copy() 

# # df_ens = df_ens_test[len(X_train):]

# df_ens['transformer_neural_scaled'] =  np.argmax(model_transformer.predict(X), axis=2)

# df_ens['trns'] = df_ens['transformer_neural_scaled'].shift(-1).dropna().astype(int)

# df_ens = df_ens.dropna()

# df_ens['trns']


df_ens = df_ens.reset_index(inplace=False)
df_ens['Date'] = pd.to_datetime(df_ens['Date'])
df_ens.set_index('Date', inplace=True)

def SIGNAL_1(df_ens):
    return df_ens['trns']

class MyCandlesStrat_1(Strategy):  
    def init(self):
        super().init()
        self.signal1_1 = self.I(SIGNAL_1, self.data)
    
    def next(self):
        super().next() 
        # Read the latest signal value explicitly
        if self.signal1_1[-1] == 1:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 - sl_pct)
            tp_price = self.data.Close[-1] * (1 + tp_pct)
            self.buy(sl=sl_price, tp=tp_price)
        elif self.signal1_1[-1] == 2:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 + sl_pct)
            tp_price = self.data.Close[-1] * (1 - tp_pct)
            self.sell(sl=sl_price, tp=tp_price)

            
bt_1 = Backtest(df_ens, MyCandlesStrat_1, cash=100000, commission=.001)
stat_1 = bt_1.run()
stat_1

The provided code snippet performs the following actions:

Create DataFrame for Ensemble Testing: A copy of the original DataFrame (df) is made and stored in df_ens_test.
Extract Test Data for Ensemble Testing: Data for ensemble testing is extracted from df_ens_test by selecting the portion of data not used in training (len(X_train):). This portion is stored in df_ens.
Perform Prediction with Transformer Neural Network Model: The trained transformer neural network model (model_transformer) is used to predict classes for the test data (X_test). The predictions are converted into class labels using argmax, and the resulting labels are stored in a new column named 'transformer_neural_scaled' in df_ens.
Shift and Clean Data: The values in the 'transformer_neural_scaled' column are shifted by one time step and stored in a new column named 'trns'. Rows with missing values are dropped from df_ens.
Reset Index and Convert Date Column: The index of df_ens is reset, and the 'Date' column is converted to datetime format and set as the index.
Define Signal Function: A function named SIGNAL_1 is defined to return the values in the 'trns' column.
Define Strategy Class: A custom strategy class named MyCandlesStrat_1 is defined, inheriting from the Strategy class. In the init method, the signal function SIGNAL_1 is initialized to obtain signals for trading. In the next method, trading decisions are made based on the signals obtained. If the signal is 1, a buy order is executed with specified stop-loss and take-profit percentages. If the signal is 2, a sell order is executed with corresponding stop-loss and take-profit percentages.
Backtest the Strategy: A backtest is performed using the Backtest class from the backtesting.py library. The backtest is run using the data from df_ens and the custom strategy MyCandlesStrat_1, with an initial cash amount of 100,000 and a commission rate of 0.001.
Retrieve Backtest Statistics: The backtest statistics are stored in stat_1, containing information such as final equity, total return, number of trades, and various performance metrics.

Overall, this code segment demonstrates the process of preparing data, defining a trading strategy, and conducting a backtest to evaluate the performance of the strategy based on signals generated by a transformer neural network model.
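The three ideas in the strategy above (delaying the signal by one bar, placing stop-loss and take-profit prices around the entry, and the asymmetric 10% / 2.5% levels) can be sketched in plain Python. The function names and the example price are hypothetical. The break-even figure rests on a simplifying assumption: every trade exits exactly at its stop-loss or take-profit, and commission is ignored.

```python
def shift_signals(predictions):
    """Delay each prediction by one bar, so a bar trades on the previous bar's signal.
    The first bar has no earlier prediction, so it gets None."""
    return [None] + list(predictions[:-1])

def bracket_prices(close, side, sl_pct=0.10, tp_pct=0.025):
    """Return (stop_loss_price, take_profit_price) for an entry at `close`."""
    if side == "long":
        return close * (1 - sl_pct), close * (1 + tp_pct)
    if side == "short":
        return close * (1 + sl_pct), close * (1 - tp_pct)
    raise ValueError("side must be 'long' or 'short'")

def breakeven_win_rate(sl_pct=0.10, tp_pct=0.025):
    """Win rate p at which p * tp_pct equals (1 - p) * sl_pct."""
    return sl_pct / (sl_pct + tp_pct)

print(shift_signals([1, 0, 2, 1]))        # [None, 1, 0, 2]
print(bracket_prices(20000.0, "long"))
print(bracket_prices(20000.0, "short"))
print(breakeven_win_rate())
```

Under these assumptions, a 10% stop-loss paired with a 2.5% take-profit needs a win rate of about 80% just to break even, which is worth keeping in mind when reading the win rate reported in the backtest.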

Backtest Result for Transformer Neural network model which is built using manual tuning

The provided result outlines various performance metrics and statistics obtained from running a backtest on a trading strategy. Here’s an explanation of the key metrics:

Duration: The duration of the backtest period, spanning from December 19, 2022, to October 22, 2023, for a total of 306 days and 22 hours.
Exposure Time [%]: The percentage of time the strategy was exposed to the market, indicating that the strategy was active for almost the entire duration of the backtest.
Equity Final [$]: The final equity or account balance after executing the trading strategy, amounting to $146,262.499.
Return [%]: The percentage return on investment (ROI) generated by the strategy over the backtest period, which is approximately 46.26%.
Sharpe Ratio: A measure of the risk-adjusted return of the strategy, indicating how well the strategy performed relative to its risk. A Sharpe Ratio of 0.706 suggests a moderate level of risk-adjusted performance.
Max. Drawdown [%]: The maximum percentage decline in equity from a peak value to a trough during the backtest period, amounting to -33.81%. This indicates the largest loss experienced by the strategy at any point.
# Trades: The total number of trades executed by the strategy during the backtest period, which is 104.
Win Rate [%]: The percentage of trades that resulted in a profit, indicating a win rate of approximately 79.81%.
Profit Factor: The ratio of total profit to total loss generated by the strategy, suggesting a profit factor of 1.12, indicating that for every unit lost, the strategy gained approximately 1.12 units.
Strategy: The name of the strategy used for the backtest, in this case, “MyCandlesStrat_1”.

Overall, these metrics provide insights into the performance, risk, and efficiency of the trading strategy during the specified backtest period.
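For readers who want to see how such figures arise, the sketch below computes return, maximum drawdown, win rate and profit factor from a toy equity curve and a toy list of trade results. The numbers are made up, and the formulas are the common textbook definitions; the backtesting library's exact calculations may differ in detail.

```python
def total_return_pct(equity):
    """Percentage change from the first to the last equity value."""
    return (equity[-1] / equity[0] - 1) * 100

def max_drawdown_pct(equity):
    """Largest peak-to-trough decline, as a negative percentage."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, (value - peak) / peak * 100)
    return worst

def win_rate_pct(trade_pnls):
    """Share of trades that closed with a profit."""
    wins = sum(1 for pnl in trade_pnls if pnl > 0)
    return wins / len(trade_pnls) * 100

def profit_factor(trade_pnls):
    """Gross profit divided by gross loss."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return gross_profit / gross_loss if gross_loss else float("inf")

# Made-up equity curve and the per-trade profit/loss that produces it.
equity = [100000, 110000, 99000, 120000, 108000, 125000]
trades = [10000, -11000, 21000, -12000, 17000]

print(total_return_pct(equity))
print(max_drawdown_pct(equity))
print(win_rate_pct(trades))
print(profit_factor(trades))
```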

Partial Conclusion:

The above results did not beat the benchmark return of 80% for the same time period. Next, we apply hyper-optimization to the above model and check whether it can outperform buy & hold returns for the same period.
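The benchmark comparison can be written down in a few lines. In this sketch the closing prices are made up and chosen only so that buy & hold returns 80%, matching the benchmark figure quoted above; the function names are hypothetical.

```python
def buy_and_hold_return_pct(closes):
    """Return from buying at the first close and holding to the last close."""
    return (closes[-1] / closes[0] - 1) * 100

def beats_benchmark(strategy_return_pct, closes):
    """True if the strategy's return exceeds simply holding the asset."""
    return strategy_return_pct > buy_and_hold_return_pct(closes)

# Made-up closes; only the first and last values matter for buy & hold.
closes = [16000.0, 21000.0, 26000.0, 28800.0]

print(buy_and_hold_return_pct(closes))
print(beats_benchmark(46.26, closes))   # False
```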

Hyper-Optimization, Finding confusion-matrix, classification report, then backtesting the results and save the model of Transformer Neural Network Model for Time-Series Data:

import keras_tuner as kt  # older releases used the package name "kerastuner"
from keras.layers import Input, Dense, Dropout
from keras.models import Model
from keras.optimizers import Adam
from tensorflow.keras.layers import MultiHeadAttention

class_weights = {0: 1, 1: 3, 2: 3}  # Adjust weights as needed

# Define Transformer-based model with multiple hidden layers
def build_transformer_model(hp):
    inputs = Input(shape=(X_train.shape[1], X_train.shape[2]))
    attention = MultiHeadAttention(num_heads=8, key_dim=64)(inputs, inputs)
    
    # Define hyperparameters for each dense layer
    units = hp.Int('units', min_value=32, max_value=256, step=32)
    dropout_rate = hp.Float('dropout', min_value=0.1, max_value=0.5, step=0.1)
    hidden = Dense(units, activation='relu')(attention)
    dropout_layer = Dropout(dropout_rate)(hidden)
    
    for i in range(2):
        units = hp.Int(f'units_{i}', min_value=32, max_value=256, step=32)
        dropout_rate = hp.Float(f'dropout_{i}', min_value=0.1, max_value=0.5, step=0.1)
        dense_layer = Dense(units=units, activation='relu')(dropout_layer)
        dropout_layer = Dropout(dropout_rate)(dense_layer)
    
    # Output layer
    outputs = Dense(3, activation='softmax')(dropout_layer)
    
    model = Model(inputs=inputs, outputs=outputs)
    optimizer = Adam(learning_rate=hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4]))
    model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# Define hyperband tuner
tuner = kt.Hyperband(
    build_transformer_model,
    objective='val_accuracy',
    max_epochs=20,
    factor=3,
    directory='my_dir',
    project_name='transformer_hyperopt'
)

# Search for the best hyperparameters
tuner.search(X_train, y_train, epochs=20, batch_size=32, validation_split=0.2, verbose=1, class_weight=class_weights)

# Get the best hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

# Build the model with the best hyperparameters and train it
best_model = tuner.hypermodel.build(best_hps)
best_model.fit(X_train, y_train, epochs=20, batch_size=32, validation_split=0.2, verbose=1, class_weight=class_weights)

The provided code outlines the process of hyperparameter tuning for a Transformer-based neural network model using Keras Tuner. Here’s an explanation of the code:

Importing Libraries: The code imports necessary libraries such as Keras Tuner, Keras layers, and TensorFlow.
Class Weights: Defines class weights to handle class imbalance in the dataset.
Define Transformer Model: The build_transformer_model function defines the architecture of the Transformer-based neural network. It takes hyperparameters (hp) as input, including the number of units, dropout rate, and learning rate. The model architecture consists of multiple hidden layers, with attention layers, dense layers, and dropout layers.
Hyperband Tuner Setup: Initializes a Hyperband tuner (kt.Hyperband) to search for the best hyperparameters. Hyperband is an algorithm for hyperparameter optimization that adapts the resource allocation for each configuration dynamically.
Hyperparameter Search: The tuner searches for the best hyperparameters (tuner.search) by evaluating the model's performance on the training data (X_train, y_train) for a specified number of epochs and batch size. It also utilizes a validation split of the data for validation during training.
Get Best Hyperparameters: Retrieves the best hyperparameters (best_hps) found during the search process.
Build and Train Best Model: Constructs the final model using the best hyperparameters (tuner.hypermodel.build(best_hps)) and trains it (best_model.fit) on the training data (X_train, y_train). The training process includes a validation split and utilizes class weights to handle class imbalance.

Overall, this code demonstrates an automated approach to finding the optimal hyperparameters for a Transformer-based neural network model, which can improve the model’s performance and generalization ability.
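The resource-allocation idea behind Hyperband can be illustrated with successive halving, the building block it repeats across several brackets. The sketch below is a toy version in plain Python and not Keras Tuner's implementation: the scoring function is a made-up stand-in for validation accuracy, and the function names are hypothetical.

```python
def successive_halving(configs, score_fn, min_epochs=1, max_epochs=20, factor=3):
    """Train all configs briefly, keep the best 1/factor, give survivors more epochs."""
    survivors = list(configs)
    epochs = min_epochs
    history = []                                  # (epochs, number of configs) per round
    while True:
        scored = sorted(survivors, key=lambda c: score_fn(c, epochs), reverse=True)
        history.append((epochs, len(scored)))
        if len(scored) == 1 or epochs >= max_epochs:
            return scored[0], history
        keep = max(1, len(scored) // factor)
        survivors = scored[:keep]
        epochs = min(max_epochs, epochs * factor)

def toy_score(config, epochs):
    """Stand-in for validation accuracy: best at units=128, improves with more epochs."""
    closeness = 1 - abs(config["units"] - 128) / 256
    return closeness * (1 - 1 / (epochs + 1))

configs = [{"units": u} for u in range(32, 257, 32)]   # 32, 64, ..., 256
best, history = successive_halving(configs, toy_score)

print(best)      # {'units': 128}
print(history)   # [(1, 8), (3, 2), (9, 1)]
```

Most configurations are discarded after only a few epochs, so the bulk of the training budget goes to the candidates that look promising early on.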

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report

# Perform prediction on the original shape data
y_pred = best_model.predict(X_test)


# The model outputs class probabilities with shape (samples, time_steps, 3);
# argmax over the last axis gives the predicted class, and ravel() flattens
# the result to one label per sample (valid here because time_steps = 1)

y_pred_classes = np.argmax(y_pred, axis=2).ravel()

# y_test already holds integer class labels, so it is used as-is
y_test_classes = y_test

# Plot confusion matrix for test data
conf_matrix_test = confusion_matrix(y_test_classes, y_pred_classes)

# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix_test, annot=True, cmap='Blues', fmt='g', cbar=False)
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.title('Confusion Matrix - Test Data')
plt.show()

# Generate classification report for test data
class_report = classification_report(y_test, y_pred_classes)

# Print classification report
print("Classification Report - Test Data:\n", class_report)

confusion matrix and classification report for hyper optimized transformer neural network model

df_ens_test = df.copy() 

# Keep only the test-period rows; .copy() avoids pandas' SettingWithCopyWarning
df_ens = df_ens_test[len(X_train):].copy()

df_ens['Hyperopt_transformer_neural_scaled'] =  np.argmax(best_model.predict(X_test), axis=2)

df_ens['htrns'] = df_ens['Hyperopt_transformer_neural_scaled'].shift(1).dropna().astype(int)

df_ens = df_ens.dropna()

df_ens['htrns']

# df_ens = df.copy() 

# # df_ens = df_ens_test[len(X_train):]

# df_ens['Hyperopt_transformer_neural_scaled'] =  np.argmax(best_model.predict(X), axis=2)

# df_ens['htrns'] = df_ens['Hyperopt_transformer_neural_scaled'].shift(1).dropna().astype(int)

# df_ens = df_ens.dropna()

# df_ens['htrns']

df_ens = df_ens.reset_index(inplace=False)
df_ens['Date'] = pd.to_datetime(df_ens['Date'])
df_ens.set_index('Date', inplace=True)

def SIGNAL_11(df_ens):
    return df_ens['htrns']

class MyCandlesStrat_11(Strategy):  
    def init(self):
        super().init()
        self.signal1_1 = self.I(SIGNAL_11, self.data)
    
    def next(self):
        super().next() 
        # Read the latest signal value explicitly
        if self.signal1_1[-1] == 1:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 - sl_pct)
            tp_price = self.data.Close[-1] * (1 + tp_pct)
            self.buy(sl=sl_price, tp=tp_price)
        elif self.signal1_1[-1] == 2:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 + sl_pct)
            tp_price = self.data.Close[-1] * (1 - tp_pct)
            self.sell(sl=sl_price, tp=tp_price)

            
bt_11 = Backtest(df_ens, MyCandlesStrat_11, cash=100000, commission=.001)
stat_11 = bt_11.run()
stat_11

backtest results for Transformer Neural Network model with hyper optimization done

Partial conclusion from the above backtest results:

In the provided backtest results, the trading strategy based on the Transformer neural network model exhibited promising performance. Here’s a comparison with the previously shared backtest result and an explanation of the significance of hyperparameter optimization:

Performance Comparison: Compared to the previous backtest result, this one showed improvement in several key metrics. The return percentage increased significantly from 46.26% to 75.26%, indicating higher profitability. The Sharpe ratio also improved from 0.707 to 1.026, suggesting better risk-adjusted returns. Additionally, the Sortino ratio increased from 2.178 to 3.857, indicating superior performance in generating positive returns relative to downside risk.
Hyperparameter Optimization: The enhanced performance in the backtest results can be attributed to the use of hyperparameter optimization. By fine-tuning the hyperparameters of the Transformer neural network model, such as the number of units, dropout rate, and learning rate, through techniques like the Hyperband tuner, the model’s performance was optimized. Hyperparameter optimization helps in finding the most suitable configuration for the model, leading to improved accuracy, robustness, and generalization ability.
Importance of Hyperparameter Optimization: Hyperparameter optimization is crucial in machine learning model development as it allows the model to adapt to the characteristics of the data and the problem at hand. It helps in overcoming issues such as overfitting, underfitting, and model instability by finding the optimal values for hyperparameters. Through hyperparameter optimization, the model can achieve better convergence, enhanced performance, and increased efficiency, ultimately leading to improved trading strategy outcomes as observed in the backtest results.

Overall, the backtest results demonstrate the effectiveness of the Transformer neural network model in generating profitable trading signals, with hyperparameter optimization playing a crucial role in maximizing its performance and profitability.
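
The Hyperband tuner mentioned above builds on successive halving: many configurations get a small training budget, and only the best fraction is re-trained with a larger one. The sketch below shows that core loop in plain Python. The `toy_score` function and the search ranges are hypothetical stand-ins for real model training, not the search space used in this article.

```python
import random

def toy_score(config, epochs):
    """Hypothetical stand-in for 'train for `epochs` and return validation accuracy'."""
    # Peaks near dropout=0.2 and lr=1e-4, and improves slowly with more epochs.
    penalty = abs(config["dropout"] - 0.2) + abs(config["lr"] - 1e-4) * 1000
    return 1.0 - penalty - 1.0 / (epochs + 1)

def successive_halving(configs, min_epochs=1, eta=3):
    """Keep the top 1/eta of the configurations each round; survivors get eta times more epochs."""
    epochs = min_epochs
    survivors = list(configs)
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: toy_score(c, epochs), reverse=True)
        survivors = ranked[:max(1, len(ranked) // eta)]
        epochs *= eta
    return survivors[0]

rng = random.Random(42)
candidates = [
    {"dropout": rng.uniform(0.0, 0.5), "lr": 10 ** rng.uniform(-5, -2)}
    for _ in range(27)
]
best_config = successive_halving(candidates)
print(best_config)
```

With 27 candidates and `eta=3`, the pool shrinks from 27 to 9 to 3 to 1, so most of the compute goes to the configurations that looked promising early on.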

Save the model:

from keras.models import save_model

# Save the hyper-optimized transformer model
best_model.save('best_model.h5')

Manual Optimization, Finding Confusion-matrix, Classification Report, and then Backtesting the results of TCN (Temporal Convolutional Network) for Time-Series Data:

from keras.layers import Input, Dense, Dropout
from keras.models import Model
from keras.optimizers import Adam
from keras.metrics import Precision, Recall
from tcn import TCN
from keras.utils import to_categorical

class_weights = {0: 1, 1: 3, 2: 3}  # Adjust weights as needed

# Define TCN model with different architecture
# Note: the `units` argument is accepted but not used by the layers below
def build_tcn_model(input_shape, units=193, num_layers=4, dropout=0.2, lr=0.0001):
    inputs = Input(shape=input_shape)
    x = inputs
    for _ in range(num_layers):
        x = TCN(return_sequences=True)(x)  # Stacked TCN layers that return the full sequence
    x = TCN(return_sequences=False)(x)  # Final TCN layer (num_layers + 1 in total) returns only the last time step
    x = Dense(units=48, activation='relu')(x)
    x = Dropout(dropout)(x)
    outputs = Dense(3, activation='softmax')(x)
    model = Model(inputs=inputs, outputs=outputs)
    optimizer = Adam(learning_rate=lr)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy', Precision(), Recall()])
    return model

# Convert y_train to one-hot encoded format
y_train_one_hot = to_categorical(y_train, num_classes=3)

# Instantiate the model
model_tcn = build_tcn_model(input_shape=(X_train.shape[1], X_train.shape[2]))

# Fit the model to the training data
model_tcn.fit(X_train, y_train_one_hot, epochs=20, batch_size=32, validation_split=0.2, verbose=1, class_weight=class_weights)

Here’s a line-by-line explanation of the provided code, along with reasons for using specific features:

Import Statements:

from keras.layers import Input, Dense, Dropout: These statements import the necessary layers and functionalities from the Keras library to build the neural network model.
from keras.models import Model: This statement imports the Model class from Keras, which is used to define and compile the neural network architecture.
from keras.optimizers import Adam: It imports the Adam optimizer, which is commonly used for training neural networks due to its adaptive learning rate capabilities.
from keras.metrics import Precision, Recall: These statements import the Precision and Recall metrics from Keras, which are useful for evaluating the performance of classification models.
from tcn import TCN: This statement imports the Temporal Convolutional Network (TCN) layer from the specified package, which is used as a building block for the model architecture.
from keras.utils import to_categorical: This statement imports a utility function from Keras to convert class vectors to one-hot encoded format, which is often used for multi-class classification tasks.

Class Weights:

class_weights = {0: 1, 1: 3, 2: 3}: This dictionary defines the class weights for the loss function. It assigns higher weights to classes 1 and 2 compared to class 0, indicating that misclassifications for classes 1 and 2 are penalized more.
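
To make the effect of these weights concrete, the sketch below computes a class-weighted cross-entropy by hand for three made-up predictions. It assumes the weight of each sample's true class simply multiplies that sample's loss; the probabilities are illustrative only and do not come from the trained model.

```python
import math

class_weights = {0: 1, 1: 3, 2: 3}

# (true class, predicted probabilities for classes 0, 1, 2) -- illustrative values
samples = [
    (0, [0.7, 0.2, 0.1]),
    (1, [0.7, 0.2, 0.1]),
    (2, [0.1, 0.2, 0.7]),
]

def weighted_losses(samples, weights):
    """Per-sample cross-entropy, scaled by the weight of the true class."""
    losses = []
    for true_class, probs in samples:
        cross_entropy = -math.log(probs[true_class])
        losses.append(weights[true_class] * cross_entropy)
    return losses

for (true_class, _), loss in zip(samples, weighted_losses(samples, class_weights)):
    print(f"true class {true_class}: weighted loss = {loss:.3f}")
```

A mistake on a class 1 sample (second row) is penalized three times as heavily as the same mistake would be with a weight of 1, which pushes the model to pay more attention to the long and short signals.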

Model Building Function (build_tcn_model):

This function constructs the TCN model with specified architecture and parameters.
inputs = Input(shape=input_shape): It defines the input layer with the specified input shape, which corresponds to the dimensions of the input data.
for _ in range(num_layers): x = TCN(return_sequences=True)(x): This loop creates multiple TCN layers with return sequences set to True, indicating that each layer returns the entire sequence of outputs.
x = TCN(return_sequences=False)(x): This statement adds a final TCN layer with return sequences set to False, indicating that only the last output of the sequence is returned.
x = Dense(units=48, activation='relu')(x): It adds a dense layer with rectified linear unit (ReLU) activation function to introduce non-linearity into the model.
x = Dropout(dropout)(x): This dropout layer helps prevent overfitting by randomly setting a fraction of input units to zero during training.
outputs = Dense(3, activation='softmax')(x): This dense layer with softmax activation function outputs the probability distribution over the three classes.
optimizer = Adam(learning_rate=lr): It initializes the Adam optimizer with the specified learning rate.
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy', Precision(), Recall()]): This statement compiles the model, specifying the optimizer, loss function (categorical cross-entropy), and evaluation metrics (accuracy, precision, and recall).
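
The softmax in the last layer turns three raw scores into probabilities that sum to one, and the predicted class is the index of the largest one. A minimal sketch with made-up scores (the values are not taken from the trained model):

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

raw_scores = [0.5, 2.0, -1.0]   # hypothetical outputs for classes 0, 1, 2
probs = softmax(raw_scores)
predicted_class = probs.index(max(probs))

print([round(p, 3) for p in probs], predicted_class)
```

This is the same step that `np.argmax(y_pred, axis=1)` performs on the model's predictions further down.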

One-Hot Encoding:

y_train_one_hot = to_categorical(y_train, num_classes=3): This line converts the class labels (y_train) into one-hot encoded format, which is necessary for multi-class classification tasks.

Model Instantiation and Training:

model_tcn = build_tcn_model(input_shape=(X_train.shape[1], X_train.shape[2])): It instantiates the TCN model using the defined architecture and parameters.
model_tcn.fit(X_train, y_train_one_hot, epochs=20, batch_size=32, validation_split=0.2, verbose=1, class_weight=class_weights): This line trains the model on the training data (X_train) and corresponding labels (y_train_one_hot) using mini-batch gradient descent, with specified training parameters such as number of epochs, batch size, validation split, verbosity, and class weights.

based on accuracy, precision, recall metrics, tuning of TCN neural network model for time series data is done

from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

# # Reshape X_train and X_test back to their original shapes
# X_train_original_shape = X_train.reshape(X_train.shape[0], -1)
# X_test_original_shape = X_test.reshape(X_test.shape[0], -1)

# X_test_reshaped = X_test_original_shape.reshape(-1, 1, X_test_original_shape.shape[1])


# Now X_train_original_shape and X_test_original_shape have their original shapes

# Perform prediction on the original shape data
# y_pred = model.predict(X_test_reshaped)
y_pred = model_tcn.predict(X_test)


# Perform any necessary post-processing on y_pred if needed
# For example, if your model outputs probabilities, you might convert them to class labels using argmax:

y_pred_classes = np.argmax(y_pred, axis=1)

# y_test already holds integer class labels, so it is used as-is
y_test_classes = y_test

# Plot confusion matrix for test data
conf_matrix_test = confusion_matrix(y_test_classes, y_pred_classes)

# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix_test, annot=True, cmap='Blues', fmt='g', cbar=False)
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.title('Confusion Matrix - Test Data')
plt.show()


from sklearn.metrics import classification_report

# Generate classification report for test data
class_report = classification_report(y_test, y_pred_classes)

# Print classification report
print("Classification Report - Test Data:\n", class_report)

confusion-matrix and classification report for TCN neural network model

df_ens_test = df.copy() 

df_ens = df_ens_test[len(X_train):]

df_ens['tcn_neural_scaled'] =  np.argmax(model_tcn.predict(X_test), axis=1)

df_ens['tns'] = df_ens['tcn_neural_scaled'].shift(1).dropna().astype(int)

df_ens = df_ens.dropna()

df_ens['tns']

# df_ens = df.copy() 

# # df_ens = df_ens_test[len(X_train):]

# df_ens['tcn_neural_scaled'] =  np.argmax(model_tcn.predict(X), axis=1)

# df_ens['tns'] = df_ens['tcn_neural_scaled'].shift(-1).dropna().astype(int)

# df_ens = df_ens.dropna()

# df_ens['tns']

df_ens = df_ens.reset_index(inplace=False)
df_ens['Date'] = pd.to_datetime(df_ens['Date'])
df_ens.set_index('Date', inplace=True)

def SIGNAL_2(df_ens):
    return df_ens['tns']

class MyCandlesStrat_2(Strategy):  
    def init(self):
        super().init()
        self.signal1_1 = self.I(SIGNAL_2, self.data)
    
    def next(self):
        super().next() 
        if self.signal1_1 == 1:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 - sl_pct)
            tp_price = self.data.Close[-1] * (1 + tp_pct)
            self.buy(sl=sl_price, tp=tp_price)
        elif self.signal1_1 == 2:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 + sl_pct)
            tp_price = self.data.Close[-1] * (1 - tp_pct)
            self.sell(sl=sl_price, tp=tp_price)

            
bt_2 = Backtest(df_ens, MyCandlesStrat_2, cash=100000, commission=.001)
stat_2 = bt_2.run()
stat_2

backtest result for TCN neural network model

get entire code and profitable algos @ https://patreon.com/pppicasso

The provided backtest results showcase the performance of a trading strategy based on the TCN (Temporal Convolutional Network) neural network model, manually tuned for optimal performance. Here’s a detailed breakdown of the results and their significance:

Duration and Exposure Time: The trading strategy was executed over a period of 306 days and was actively engaged in the market for approximately 99.99% of the time. This high exposure indicates consistent and frequent trading activity.
Equity Growth: The final equity reached $186,325.343, showcasing a significant return of 86.33%. This growth outperformed the buy and hold strategy, which yielded a return of 80.06%. The strategy capitalized on market movements to generate substantial profits.
Risk-Adjusted Returns: The annualized return stood at 106.90%, indicating the strategy’s robust performance over a yearly period. Moreover, the Sharpe ratio of 1.08 and Sortino ratio of 4.47 highlight the strategy’s superior risk-adjusted returns, suggesting efficient risk management practices.
Drawdown Analysis: Despite the impressive returns, the strategy experienced a maximum drawdown of -27.05%, which is relatively moderate considering the high returns generated. The average drawdown duration was short, indicating a swift recovery from drawdown periods.
Trading Performance: The strategy executed a total of 109 trades, with a notable win rate of 83.49%. This high win rate, coupled with an average trade return of 0.59%, indicates the effectiveness of the strategy in identifying profitable trading opportunities.
Profit Factor and SQN: The profit factor of 1.52 indicates that the strategy generated more profits than losses, further affirming its effectiveness. Additionally, the System Quality Number (SQN) of 1.53 suggests a favorable performance, considering both the strategy’s profitability and risk.
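
One way to read the win rate is against the stop-loss and take-profit levels in the strategy code. With a 10% stop-loss and a 2.5% take-profit, a losing trade costs four times what a winning trade earns, so a high win rate is needed just to break even. The sketch below works this out under a simplifying assumption: every trade closes exactly at its stop-loss or take-profit and commission is ignored, which is not how every trade in the backtest actually ended.

```python
sl_pct = 0.10    # stop-loss distance used in the strategy
tp_pct = 0.025   # take-profit distance used in the strategy

# Break-even: p * tp_pct = (1 - p) * sl_pct  ->  p = sl_pct / (sl_pct + tp_pct)
break_even_win_rate = sl_pct / (sl_pct + tp_pct)

def expected_return_per_trade(win_rate):
    """Expected return per trade if every trade ends at TP or SL (simplifying assumption)."""
    return win_rate * tp_pct - (1 - win_rate) * sl_pct

print(f"break-even win rate: {break_even_win_rate:.0%}")
print(f"expected return at 83.49% wins: {expected_return_per_trade(0.8349):.4%}")
```

Under this assumption the break-even win rate is 80%, so the reported 83.49% win rate leaves a fairly thin margin, and a small drop in accuracy would turn the strategy unprofitable.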

Overall, the backtest results demonstrate the superiority of the TCN neural network model-based trading strategy, manually tuned for optimal performance.

By outperforming the buy and hold strategy and surpassing the previously hyper-optimized transformer model’s results, the TCN-based strategy showcases its effectiveness in generating consistent profits and managing risk in dynamic market conditions.

For hyperparameter optimization of the TCN neural network model, I have provided the entire code setup on my Patreon page. I have not run the optimization on my own machine (for TCN) but have tested it on someone else's system. Because it takes a lot of computation time, I plan to continue with manual tuning and make sure the results outperform buy & hold.

Manual Optimization, Finding Confusion-matrix, Classification Report, and then Backtesting the results of LSTM (Long Short-Term Memory) for Time-Series Data:

from keras.layers import Input, Dense, Dropout
from keras.models import Model
from keras.optimizers import Adam
from keras.layers import LSTM
from keras.metrics import Precision, Recall
from keras.utils import to_categorical

class_weights = {0: 1, 1: 3, 2: 3}  # Adjust weights as needed

# Define LSTM model
def build_lstm_model(input_shape, units=193, dropout=0.2, lr=0.0001):
    inputs = Input(shape=input_shape)
    lstm_layer = LSTM(units=units, return_sequences=True, dropout=0.2, recurrent_dropout=0.2)(inputs)
    lstm_layer_2 = LSTM(units=96, return_sequences=True, dropout=0.2, recurrent_dropout=0.2)(lstm_layer)
    lstm_layer_3 = LSTM(units=96, return_sequences=True, dropout=0.2, recurrent_dropout=0.2)(lstm_layer_2)
    lstm_layer_4 = LSTM(units=96, dropout=0.2, recurrent_dropout=0.2)(lstm_layer_3)
    dense_layer = Dense(units=48, activation='relu')(lstm_layer_4)
    dropout_layer = Dropout(dropout)(dense_layer)
    outputs = Dense(3, activation='softmax')(dropout_layer)
    model = Model(inputs=inputs, outputs=outputs)
    optimizer = Adam(learning_rate=lr)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=[Precision(), 'accuracy', Recall()])
    return model

# Convert y_train to one-hot encoded format
y_train_one_hot = to_categorical(y_train, num_classes=3)

# Instantiate the model
model_lstm = build_lstm_model(input_shape=(X_train.shape[1], X_train.shape[2]))

# Fit the model to the training data
model_lstm.fit(X_train, y_train_one_hot, epochs=20, batch_size=32, validation_split=0.2, verbose=1, class_weight=class_weights)

The code begins by importing necessary modules and libraries from Keras for building and training a neural network model. These include layers like Input, Dense, LSTM, and Dropout, along with the Model class and Adam optimizer from keras.models and keras.optimizers respectively. Metrics like Precision and Recall are also imported from keras.metrics.

A dictionary named class_weights is defined to handle class imbalances in the dataset, assigning higher weights to minority classes.

Next, a function named build_lstm_model is defined to construct an LSTM neural network model. The function takes the input shape, the number of units in the first LSTM layer, the dropout rate for the final Dropout layer, and the learning rate for the optimizer. Inside this function, an input layer is created using the Input class. Four LSTM layers are then stacked: the first uses the units argument (193 by default) and the remaining three use 96 units each, all with dropout and recurrent dropout fixed at 0.2. The last LSTM layer feeds a 48-unit Dense layer with ReLU activation, a Dropout layer, and finally a 3-unit Dense layer with softmax activation for the output. The model is compiled with the Adam optimizer and the categorical cross-entropy loss function, along with precision, accuracy, and recall as evaluation metrics.

Following the model definition, the target labels (y_train) are converted to one-hot encoded format using the to_categorical function from keras.utils.

The build_lstm_model function is called to instantiate the LSTM model, passing the input shape derived from the training data.

Finally, the model is trained using the fit method. The training data (X_train and the one-hot encoded y_train_one_hot) is used for model training, with additional parameters such as the number of epochs, batch size, validation split, verbosity level, and class weights for handling class imbalance.

A Few Important Observations:

Dropout vs. Recurrent Dropout: In a Keras LSTM layer, the dropout argument randomly zeroes a fraction of the layer's inputs during training, which helps prevent overfitting. The recurrent_dropout argument does the same for the recurrent state, that is, the hidden state carried from one time step to the next, which discourages the model from memorizing a sequence too closely.
Why NO ReLU Activation in LSTM Layers: ReLU (Rectified Linear Unit) activation is not commonly used inside LSTM layers because its output is unbounded, so values fed back through the recurrent connections can grow without limit over many time steps and destabilize training. LSTM cells instead use sigmoid activations for the gates, which squash values into the range 0 to 1 and control the flow of information, and hyperbolic tangent (tanh) for the candidate and output states, which keeps them between -1 and 1.
Sparse Categorical Cross-Entropy vs. Categorical Cross-Entropy: Sparse categorical cross-entropy is used when the target labels are integers, representing the class indices. It does not require one-hot encoding of the target labels. On the other hand, categorical cross-entropy expects the target labels to be one-hot encoded vectors. It computes the cross-entropy loss between the true distribution and the predicted distribution.
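
The two loss functions compute the same number; only the label format differs. A small hand-computed sketch with illustrative probabilities:

```python
import math

probs = [0.1, 0.7, 0.2]      # predicted probabilities for classes 0, 1, 2 (illustrative)

# Sparse form: the label is the class index
label_index = 1
sparse_loss = -math.log(probs[label_index])

# Categorical form: the label is a one-hot vector
label_one_hot = [0, 1, 0]
categorical_loss = -sum(t * math.log(p) for t, p in zip(label_one_hot, probs))

print(sparse_loss, categorical_loss)
```

The model code above one-hot encodes `y_train` with `to_categorical`, so it is paired with `categorical_crossentropy`.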

Overall, this code segment demonstrates the process of building, compiling, and training an LSTM neural network model for a classification task, specifically designed to handle temporal sequences of data.

based on accuracy, precision, recall metrics, LSTM neural network model optimization done

from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import seaborn as sns

# # Reshape X_train and X_test back to their original shapes
# X_train_original_shape = X_train.reshape(X_train.shape[0], -1)
# X_test_original_shape = X_test.reshape(X_test.shape[0], -1)

# X_test_reshaped = X_test_original_shape.reshape(-1, 1, X_test_original_shape.shape[1])


# Now X_train_original_shape and X_test_original_shape have their original shapes

# Perform prediction on the original shape data
# y_pred = model.predict(X_test_reshaped)
y_pred = model_lstm.predict(X_test)


# Perform any necessary post-processing on y_pred if needed
# For example, if your model outputs probabilities, you might convert them to class labels using argmax:

y_pred_classes = np.argmax(y_pred, axis=1)

# y_test already holds integer class labels, so it is used as-is
y_test_classes = y_test

# Plot confusion matrix for test data
conf_matrix_test = confusion_matrix(y_test_classes, y_pred_classes)

# Plot confusion matrix
plt.figure(figsize=(8, 6))
sns.heatmap(conf_matrix_test, annot=True, cmap='Blues', fmt='g', cbar=False)
plt.xlabel('Predicted labels')
plt.ylabel('True labels')
plt.title('Confusion Matrix - Test Data')
plt.show()


from sklearn.metrics import classification_report

# Generate classification report for test data
class_report = classification_report(y_test, y_pred_classes)

# Print classification report
print("Classification Report - Test Data:\n", class_report)

confusion matrix and classification report for LSTM neural network model

df_ens_test = df.copy() 

df_ens = df_ens_test[len(X_train):]

df_ens['lstm_neural_scaled'] =  np.argmax(model_lstm.predict(X_test), axis=1)

df_ens['lns'] = df_ens['lstm_neural_scaled'].shift(1).dropna().astype(int)

df_ens = df_ens.dropna()

df_ens['lns']

# df_ens = df.copy() 

# # df_ens = df_ens_test[len(X_train):]

# df_ens['lstm_neural_scaled'] =  np.argmax(model_lstm.predict(X), axis=1)

# df_ens['lns'] = df_ens['lstm_neural_scaled'].shift(-1).dropna().astype(int)

# df_ens = df_ens.dropna()

# df_ens['lns']

df_ens = df_ens.reset_index(inplace=False)
df_ens['Date'] = pd.to_datetime(df_ens['Date'])
df_ens.set_index('Date', inplace=True)

def SIGNAL_3(df_ens):
    return df_ens['lns']

class MyCandlesStrat_3(Strategy):  
    def init(self):
        super().init()
        self.signal1_1 = self.I(SIGNAL_3, self.data)
    
    def next(self):
        super().next() 
        if self.signal1_1 == 1:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 - sl_pct)
            tp_price = self.data.Close[-1] * (1 + tp_pct)
            self.buy(sl=sl_price, tp=tp_price)
        elif self.signal1_1 == 2:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 + sl_pct)
            tp_price = self.data.Close[-1] * (1 - tp_pct)
            self.sell(sl=sl_price, tp=tp_price)

            
bt_3 = Backtest(df_ens, MyCandlesStrat_3, cash=100000, commission=.001)
stat_3 = bt_3.run()
stat_3

backtesting results for LSTM neural network model


The provided backtest results showcase the performance of an LSTM neural network model with manual tuning over a specific trading period.

Start and End Date: The backtest began on December 19, 2022, and concluded on October 22, 2023, spanning a duration of 306 days and 22.5 hours.
Equity Metrics: The final equity at the end of the backtest period reached $190,000.539, with the peak equity touching $192,280.539. This translates to a return of 90.000539%, outperforming the buy and hold strategy.
Return Analysis: The annualized return stands at 111.743778%, indicating the growth rate of the investment over a one-year period.
Risk Metrics: The volatility, measured as the annualized standard deviation of returns, is 87.68232%. The Sharpe Ratio, a measure of risk-adjusted returns, is 1.274416, indicating favorable returns per unit of risk. The Sortino Ratio, which considers downside risk, is 5.290248, suggesting a robust risk-adjusted performance. The Calmar Ratio, evaluating returns against maximum drawdown, is 5.735269, portraying strong risk-adjusted returns relative to drawdown.
Drawdown Analysis: The maximum drawdown observed during the backtest was -19.483616%, occurring over a duration of 98 days and 13 hours. On average, drawdowns lasted approximately 4 days and 2 hours.
Trading Activity: A total of 110 trades were executed during the backtest period, with a win rate of 84.545455%. The best trade yielded a return of 2.419369%, while the worst trade resulted in a loss of -10.114549%. On average, each trade generated a return of 0.541986%. The maximum duration of a single trade was 34 days and 19 hours, with an average trade duration of 3 days and 3 hours.
Performance Metrics: The profit factor, calculated as the ratio of gross profits to gross losses, stands at 1.473492. The SQN (System Quality Number), a metric assessing the quality of a trading system based on the distribution of trade returns, is 1.749569, suggesting a robust and efficient trading strategy.
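
The reported ratios can be cross-checked from the other figures in the list. Assuming a zero risk-free rate, dividing the annualized return by the annualized volatility gives the Sharpe ratio, and dividing it by the absolute maximum drawdown gives the Calmar ratio. The sketch uses only the numbers quoted above:

```python
annual_return_pct = 111.743778
annual_volatility_pct = 87.68232
max_drawdown_pct = -19.483616

sharpe = annual_return_pct / annual_volatility_pct       # risk-free rate assumed to be 0
calmar = annual_return_pct / abs(max_drawdown_pct)

print(f"Sharpe: {sharpe:.4f}")
print(f"Calmar: {calmar:.4f}")
```

Both values agree with the reported 1.274416 and 5.735269 to four decimal places.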

The provided backtest results demonstrate the effectiveness of the LSTM neural network model, showcasing superior performance compared to the TCN neural network model, buy and hold strategy, and a previously hyper-optimized transformer model. This signifies the value of manual tuning in enhancing the predictive capabilities of the LSTM model, resulting in strong risk-adjusted returns and consistent trading outcomes.

For hyperparameter optimization of the LSTM neural network model, I have provided the entire code setup on my Patreon page. I have not run the optimization on my own machine (for LSTM) but have tested it on someone else's system. Because it takes a lot of computation time, I plan to continue with manual tuning and make sure the results outperform buy & hold.

Saving multiple Models to local disk and Re-use the models for predicting other Assets in future or use for live Trading:

from keras.models import save_model, load_model

# Define file paths for saving the models
lstm_model_path = 'lstm_model.h5'
tcn_model_path = 'tcn_model.h5'
transformer_model_path = 'transformer_model.h5'

# Save the LSTM model
model_lstm.save(lstm_model_path)

# Save the TCN model
model_tcn.save(tcn_model_path)

# Save the Transformer model
model_transformer.save(transformer_model_path)


from keras.models import save_model, load_model
from tcn import TCN

# Load the LSTM model
loaded_model_lstm = load_model(lstm_model_path)

# Define a dictionary to specify custom objects
custom_objects = {'TCN': TCN}

# Load the TCN model with custom objects
loaded_model_tcn = load_model(tcn_model_path, custom_objects=custom_objects)


# Load the Transformer model
loaded_model_transformer = load_model(transformer_model_path)

Ensemble Method for Multiple Models:

Ensembling multiple models with weighted averages can really boost prediction accuracy by leveraging the strengths of each model while minimizing their weaknesses. Essentially, it’s like combining the best features of different models to create a more reliable and robust predictor.

In my case, I had three models: TCN (Temporal Convolutional Network), LSTM (Long Short-Term Memory), and a transformer model. However, due to system limitations, I couldn’t include the transformer model in the ensemble. So, I decided to demonstrate the ensembling process with just the TCN and LSTM models.

# Assuming model_tcn and model_lstm are already trained, saved, and reloaded
# Use each model to predict a class label (0, 1, or 2) for each sample.
# argmax turns the predicted probabilities into labels, so despite their
# names these arrays hold class labels, not probabilities.
# probs_transformer =  np.argmax(model_transformer.predict(X_test), axis=2)
probs_tcn =  np.argmax(loaded_model_tcn.predict(X_test), axis=1)
probs_lstm =  np.argmax(loaded_model_lstm.predict(X_test), axis=1)

# Define weights for each model (you can adjust these based on performance)
weights = {
    # "transformer": 0.2,
    "tcn": 0.5,
    "lstm": 0.5
}

# Combine the predicted class labels of the models with custom weights
ensemble_probs = (
    # probs_transformer * weights["transformer"] +
                  probs_tcn * weights["tcn"] +
                  probs_lstm * weights["lstm"])

# Truncate the weighted average of the labels to an integer class
# (e.g. labels 1 and 2 average to 1.5, which becomes 1)
ensemble_predictions = ensemble_probs

ensemble_predictions = ensemble_predictions.astype(int)


from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

# Save ensemble predictions
np.savetxt('ensemble_predictions.csv', ensemble_predictions, delimiter=',')

# Plot confusion matrix
def plot_confusion_matrix(y_true, y_pred, classes):
    cm = confusion_matrix(y_true, y_pred)
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=classes, yticklabels=classes)
    plt.xlabel('Predicted Label')
    plt.ylabel('True Label')
    plt.title('Confusion Matrix')
    plt.show()

# Define class labels (assuming 0, 1, 2 for classes)
class_labels = ['Class 0', 'Class 1', 'Class 2']

# Plot confusion matrix
plot_confusion_matrix(y_test, ensemble_predictions, class_labels)

# Generate classification report
print(classification_report(y_test, ensemble_predictions, target_names=class_labels))

Here’s how I did it:

Prediction Generation: I first used the predict methods of the TCN and LSTM models on the test dataset and applied argmax to turn each model's predicted probabilities into a class label (0, 1, or 2) for every sample.
Weight Definition: I assigned equal weights of 0.5 to both the TCN and LSTM models. These weights determine how much influence each model has on the final ensemble prediction. I could adjust these weights based on each model's performance or my confidence in their predictions.
Combination of Predictions: I then combined the class labels from the TCN and LSTM models using the defined weights, producing a weighted average of the two labels for each sample.
Selection of Final Prediction: Finally, I converted the weighted average to an integer class by truncation. When both models agree, the ensemble keeps their label; when they disagree, the average is rounded down, so labels 0 and 1 give 0, labels 1 and 2 give 1, and labels 0 and 2 give 1.
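
Weighted averaging at the probability level (often called soft voting) keeps each model's confidence in play until the final argmax. The sketch below uses small made-up probability arrays in place of the real `predict` outputs; the array values are assumptions for illustration only.

```python
import numpy as np

# Hypothetical per-class probabilities for 4 samples (rows) and 3 classes (columns),
# standing in for the arrays returned by each model's predict()
probs_tcn = np.array([[0.6, 0.3, 0.1],
                      [0.2, 0.5, 0.3],
                      [0.1, 0.3, 0.6],
                      [0.4, 0.4, 0.2]])
probs_lstm = np.array([[0.5, 0.4, 0.1],
                       [0.1, 0.3, 0.6],
                       [0.2, 0.2, 0.6],
                       [0.2, 0.5, 0.3]])

weights = {"tcn": 0.5, "lstm": 0.5}

# Weighted average of the probability vectors, then pick the most likely class
ensemble_probs = weights["tcn"] * probs_tcn + weights["lstm"] * probs_lstm
ensemble_predictions = np.argmax(ensemble_probs, axis=1)

print(ensemble_predictions)
```

In the second row the two models disagree (TCN favors class 1, LSTM favors class 2), and the more confident LSTM prediction decides the outcome.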

After obtaining the ensemble predictions, I proceeded to analyze the model’s performance further. I plotted a confusion matrix to visually assess how well the model classified the test data and generated a classification report to delve deeper into its precision, recall, and other key metrics across different classes. These steps helped me evaluate the effectiveness of the ensemble model in accurately classifying the test data.
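
The precision and recall values in a classification report follow directly from the confusion matrix. A minimal sketch with an illustrative 3x3 matrix (the counts are made up, not the article's results):

```python
# Illustrative 3x3 confusion matrix: rows are true classes, columns are predicted classes
conf_matrix = [
    [50, 3, 2],
    [4, 20, 1],
    [6, 2, 12],
]

def per_class_metrics(matrix):
    """Return (precision, recall) for each class of a square confusion matrix."""
    n = len(matrix)
    metrics = []
    for k in range(n):
        true_positives = matrix[k][k]
        predicted_k = sum(matrix[row][k] for row in range(n))   # column sum
        actual_k = sum(matrix[k])                                # row sum
        precision = true_positives / predicted_k if predicted_k else 0.0
        recall = true_positives / actual_k if actual_k else 0.0
        metrics.append((precision, recall))
    return metrics

for k, (precision, recall) in enumerate(per_class_metrics(conf_matrix)):
    print(f"class {k}: precision={precision:.2f}, recall={recall:.2f}")
```

For a trading signal, precision on classes 1 and 2 matters most: it is the share of predicted long or short signals that were actually correct.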

confusion matrix and classification report for Ensemble using TCN, LSTM Neural network models

df_ens_test = df.copy() 

df_ens = df_ens_test[len(X_train):]

df_ens['ensemble_neural_scaled'] =  ensemble_predictions

df_ens['ens'] = df_ens['ensemble_neural_scaled'].shift(1).dropna().astype(int)

df_ens = df_ens.dropna()

df_ens['ens']

df_ens = df_ens.reset_index(inplace=False)
df_ens['Date'] = pd.to_datetime(df_ens['Date'])
df_ens.set_index('Date', inplace=True)

def SIGNAL_0(df_ens):
    return df_ens['ens']

class MyCandlesStrat_0(Strategy):  
    def init(self):
        super().init()
        self.signal1_1 = self.I(SIGNAL_0, self.data)
    
    def next(self):
        super().next() 
        if self.signal1_1 == 1:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 - sl_pct)
            tp_price = self.data.Close[-1] * (1 + tp_pct)
            self.buy(sl=sl_price, tp=tp_price)
        elif self.signal1_1 == 2:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 + sl_pct)
            tp_price = self.data.Close[-1] * (1 - tp_pct)
            self.sell(sl=sl_price, tp=tp_price)

            
bt_0 = Backtest(df_ens, MyCandlesStrat_0, cash=100000, commission=.001)
stat_0 = bt_0.run()
stat_0

backtest results for ensemble of TCN, LSTM neural network models


The backtest results for the ensemble method utilizing TCN and LSTM neural networks are quite impressive. Over a period of 306 days and 22 hours, the strategy demonstrated remarkable performance.

Exposure Time [%]: The strategy held a position for 99.98% of the test period, so it was almost always in the market.
Equity Final [$]: The final equity reached $208,771.66, up from the initial capital of $100,000.
Return [%]: The strategy achieved an impressive return of 108.77% over the testing period.
Return (Ann.) [%]: On an annualized basis, the return was 136.76%.
Volatility (Ann.) [%]: The annualized volatility stood at 114.56%, which shows how widely the strategy’s returns fluctuated.
Sharpe Ratio: With a Sharpe ratio of 1.19, the strategy delivered solid returns relative to the total volatility it took on.
Sortino Ratio: The Sortino ratio, which measures return relative to downside volatility only, was notably high at 5.66, suggesting that most of the volatility came from upward moves.
Calmar Ratio: The Calmar ratio, the annualized return divided by the maximum drawdown, was 5.33.
Max. Drawdown [%]: The maximum drawdown was -25.65%, the largest peak-to-trough fall in equity during the test period.
# Trades: The strategy executed a total of 117 trades over the testing period.
Win Rate [%]: With a win rate of 78.63%, most of the strategy’s trades were profitable.
Profit Factor: The profit factor, the ratio of gross profits to gross losses, was 1.44, so the strategy earned $1.44 for every $1 it lost.
SQN: The System Quality Number (SQN), a measure of the quality of a trading system’s performance, was 2.04.
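The Sharpe and Calmar figures can be checked by hand from the other numbers in the list. This is a small sketch, assuming a risk-free rate of zero (my assumption; it is the setting under which the reported values line up) and using hypothetical helper names.

```python
def sharpe_ratio(annual_return_pct, annual_volatility_pct, risk_free_pct=0.0):
    """Annualized excess return per unit of annualized volatility."""
    return (annual_return_pct - risk_free_pct) / annual_volatility_pct


def calmar_ratio(annual_return_pct, max_drawdown_pct):
    """Annualized return divided by the size of the maximum drawdown."""
    return annual_return_pct / abs(max_drawdown_pct)


# Figures reported above for the test-period backtest
sharpe = sharpe_ratio(136.76, 114.56)
calmar = calmar_ratio(136.76, -25.65)

print(round(sharpe, 2), round(calmar, 2))  # 1.19 5.33
```

Both values match the backtest report, which is a useful sanity check when reading these tables.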

Overall, the backtest results suggest that the ensemble method utilizing TCN and LSTM neural networks has outperformed all previous individual models as well as the buy and hold strategy. The strategy exhibited robust performance, generating significant returns while effectively managing risk, making it a promising approach for trading in financial markets.

Running the Best Neural Network Model on the Entire Dataset of Bitcoin (BTC) and Ethereum (ETH) markets:


# Predicted class label (0, 1 or 2) from each model: argmax over the model output
probs_tcn = np.argmax(loaded_model_tcn.predict(X), axis=1)
probs_lstm = np.argmax(loaded_model_lstm.predict(X), axis=1)

# Define weights for each model (you can adjust these based on performance)
weights = {
    # "transformer": 0.2,
    "tcn": 0.5,
    "lstm": 0.5
}

# Weighted average of the two models' class labels.
# Note: despite the variable names, these are labels, not probabilities.
ensemble_probs = (
    # probs_transformer * weights["transformer"] +
    probs_tcn * weights["tcn"] +
    probs_lstm * weights["lstm"])

# Truncate the averaged label to an integer class (for example, 1.5 becomes 1)
ensemble_predictions = ensemble_probs.astype(int)

from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

# Save ensemble predictions
np.savetxt('ensemble_predictions.csv', ensemble_predictions, delimiter=',')

# Plot confusion matrix
def plot_confusion_matrix(y_true, y_pred, classes):
    cm = confusion_matrix(y_true, y_pred)
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=classes, yticklabels=classes)
    plt.xlabel('Predicted Label')
    plt.ylabel('True Label')
    plt.title('Confusion Matrix')
    plt.show()

# Define class labels (assuming 0, 1, 2 for classes)
class_labels = ['Class 0', 'Class 1', 'Class 2']

# Plot confusion matrix
plot_confusion_matrix(y, ensemble_predictions, class_labels)

# Generate classification report
print(classification_report(y, ensemble_predictions, target_names=class_labels))


# Use the entire dataset this time, not only the test slice
df_ens = df.copy()

df_ens['ensemble_neural_scaled'] = ensemble_predictions

# Shift by one candle so the prediction made on candle t is only traded on candle t+1
df_ens['ens'] = df_ens['ensemble_neural_scaled'].shift(1)

# Drop rows with missing values (including the first row created by the shift), then restore integer labels
df_ens = df_ens.dropna()
df_ens['ens'] = df_ens['ens'].astype(int)

df_ens = df_ens.reset_index()
df_ens['Date'] = pd.to_datetime(df_ens['Date'])
df_ens.set_index('Date', inplace=True)

def SIGNAL_0(df_ens):
    return df_ens['ens']

class MyCandlesStrat_0(Strategy):  
    def init(self):
        super().init()
        self.signal1_1 = self.I(SIGNAL_0, self.data)
    
    def next(self):
        super().next() 
        if self.signal1_1 == 1:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 - sl_pct)
            tp_price = self.data.Close[-1] * (1 + tp_pct)
            self.buy(sl=sl_price, tp=tp_price)
        elif self.signal1_1 == 2:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 + sl_pct)
            tp_price = self.data.Close[-1] * (1 - tp_pct)
            self.sell(sl=sl_price, tp=tp_price)

            
bt_0 = Backtest(df_ens, MyCandlesStrat_0, cash=100000, commission=.001)
stat_0 = bt_0.run()
stat_0

confusion matrix and classification report for entire bitcoin dataset using ensemble method

backtest results for entire dataset of bitcoin using ensemble method

The backtest results for the ensemble method applied to the entire Bitcoin dataset are exceptionally impressive. Over a duration of 1023 days and 3 hours, the strategy achieved remarkable performance.

Exposure Time [%]: The strategy held a position for 99.99% of the period, so it was virtually always in the market.
Equity Final [$]: The final equity reached a staggering $4,854,305.68, up from the initial capital of $100,000.
Return [%]: The strategy delivered an extraordinary return of 4754.31% over the period.
Return (Ann.) [%]: On an annualized basis, the return was 289.41%.
Volatility (Ann.) [%]: The annualized volatility stood at 336.61%, which shows how widely the strategy’s returns fluctuated.
Sharpe Ratio: Because of the high volatility, the Sharpe ratio was 0.86, lower than the 1.19 of the test-only run.
Sortino Ratio: The Sortino ratio, which measures return relative to downside volatility only, was exceptionally high at 7.07, suggesting that most of the volatility came from upward moves.
Calmar Ratio: The Calmar ratio, the annualized return divided by the maximum drawdown, was 7.77.
Max. Drawdown [%]: The maximum drawdown was -37.27%, the largest peak-to-trough fall in equity during the period.
# Trades: The strategy executed a total of 1154 trades over the period.
Win Rate [%]: With a win rate of 81.46%, most of the strategy’s trades were profitable.
Profit Factor: The profit factor, the ratio of gross profits to gross losses, was 1.23, so the strategy earned $1.23 for every $1 it lost.
SQN: The System Quality Number (SQN), a measure of the quality of a trading system’s performance, was 1.98.
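The total return follows directly from the starting cash and the final equity, and a simple compounding formula gives a rough annualized figure. The sketch below uses hypothetical helper names and a plain 365-day year, which is my assumption. The backtest report shows 289.41% annualized, so the library evidently uses a different annualization convention than this simple formula; treat the second number as an approximation only.

```python
def total_return_pct(initial_cash, final_equity):
    """Percentage gain from initial cash to final equity."""
    return (final_equity / initial_cash - 1.0) * 100.0


def annualized_return_pct(total_pct, days, days_per_year=365.0):
    """Compound a total return over `days` into an equivalent yearly rate."""
    growth = 1.0 + total_pct / 100.0
    return (growth ** (days_per_year / days) - 1.0) * 100.0


# Figures reported above: $100,000 start, 1023 days and 3 hours of data
total = total_return_pct(100_000, 4_854_305.679)
approx_annual = annualized_return_pct(total, days=1023.125)

print(round(total, 2))  # 4754.31
print(round(approx_annual, 1))
```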

Overall, the backtest on the entire Bitcoin dataset delivered far higher profits than the test-only run. Keep in mind that this run includes the rows the models were trained on, so part of the result is in-sample, and the test-only backtest above is the better guide to performance on unseen data. Even so, the results make the ensemble a promising approach for trading in the volatile cryptocurrency markets.

Import the Models into a New Dataset and Run the Backtest (Ethereum/ETH):

from keras.models import save_model, load_model
from tcn import TCN

# Define file paths for saving the models
lstm_model_path = 'lstm_model.h5'
tcn_model_path = 'tcn_model.h5'
transformer_model_path = 'transformer_model.h5'

# Load the LSTM model
loaded_model_lstm = load_model(lstm_model_path)

# Define a dictionary to specify custom objects
custom_objects = {'TCN': TCN}

# Load the TCN model with custom objects
loaded_model_tcn = load_model(tcn_model_path, custom_objects=custom_objects)


# Load the Transformer model
loaded_model_transformer = load_model(transformer_model_path)

# X, y and df are assumed to hold the Ethereum features, labels and price data at this point.
# The loaded TCN and LSTM models each predict a class label (0, 1 or 2) per sample:
# argmax over the model output. The Transformer is loaded above but left out of the ensemble.
# probs_transformer =  np.argmax(model_transformer.predict(X_test), axis=2)
probs_tcn = np.argmax(loaded_model_tcn.predict(X), axis=1)
probs_lstm = np.argmax(loaded_model_lstm.predict(X), axis=1)

# Define weights for each model (you can adjust these based on performance)
weights = {
    # "transformer": 0.2,
    "tcn": 0.5,
    "lstm": 0.5
}

# Weighted average of the two models' class labels.
# Note: despite the variable names, these are labels, not probabilities.
ensemble_probs = (
    # probs_transformer * weights["transformer"] +
    probs_tcn * weights["tcn"] +
    probs_lstm * weights["lstm"])

# Truncate the averaged label to an integer class (for example, 1.5 becomes 1)
ensemble_predictions = ensemble_probs.astype(int)

from sklearn.metrics import confusion_matrix, classification_report
import matplotlib.pyplot as plt
import seaborn as sns

# Save ensemble predictions
np.savetxt('ensemble_predictions.csv', ensemble_predictions, delimiter=',')

# Plot confusion matrix
def plot_confusion_matrix(y_true, y_pred, classes):
    cm = confusion_matrix(y_true, y_pred)
    plt.figure(figsize=(8, 6))
    sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=classes, yticklabels=classes)
    plt.xlabel('Predicted Label')
    plt.ylabel('True Label')
    plt.title('Confusion Matrix')
    plt.show()

# Define class labels (assuming 0, 1, 2 for classes)
class_labels = ['Class 0', 'Class 1', 'Class 2']

# Plot confusion matrix
plot_confusion_matrix(y, ensemble_predictions, class_labels)

# Generate classification report
print(classification_report(y, ensemble_predictions, target_names=class_labels))


# Use the entire Ethereum dataset
df_ens = df.copy()

df_ens['ensemble_neural_scaled'] = ensemble_predictions

# Shift by one candle so the prediction made on candle t is only traded on candle t+1
df_ens['ens'] = df_ens['ensemble_neural_scaled'].shift(1)

# Drop rows with missing values (including the first row created by the shift), then restore integer labels
df_ens = df_ens.dropna()
df_ens['ens'] = df_ens['ens'].astype(int)

df_ens = df_ens.reset_index()
df_ens['Date'] = pd.to_datetime(df_ens['Date'])
df_ens.set_index('Date', inplace=True)

def SIGNAL_0(df_ens):
    return df_ens['ens']

class MyCandlesStrat_0(Strategy):  
    def init(self):
        super().init()
        self.signal1_1 = self.I(SIGNAL_0, self.data)
    
    def next(self):
        super().next() 
        if self.signal1_1 == 1:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 - sl_pct)
            tp_price = self.data.Close[-1] * (1 + tp_pct)
            self.buy(sl=sl_price, tp=tp_price)
        elif self.signal1_1 == 2:
            sl_pct = 0.1  # 10% stop-loss
            tp_pct = 0.025  # 2.5% take-profit
            sl_price = self.data.Close[-1] * (1 + sl_pct)
            tp_price = self.data.Close[-1] * (1 - tp_pct)
            self.sell(sl=sl_price, tp=tp_price)

            
bt_0 = Backtest(df_ens, MyCandlesStrat_0, cash=100000, commission=.001)
stat_0 = bt_0.run()
stat_0

The code above evaluates the pre-trained neural network models on a new dataset, Ethereum price data. The LSTM, TCN, and Transformer models are all loaded, but only the LSTM and TCN models feed the ensemble. Here’s a breakdown of the steps taken:

Loading Pre-Trained Models: The code loads the saved LSTM, TCN, and Transformer models using the load_model function from Keras. The TCN model uses a custom layer, so the TCN class is passed in through custom_objects.
Prediction Using Pre-Trained Models: The TCN and LSTM models each predict class scores for every sample in the new dataset, and np.argmax converts them into a class label (0, 1, or 2).
Ensemble Method: The class labels from the LSTM and TCN models are combined as a weighted average, using the weights in the weights dictionary (0.5 each), and the result is truncated to an integer.
Evaluation Metrics: The ensemble predictions are evaluated with a confusion matrix and a classification report to assess the model’s performance on the new dataset.
Visualization: The confusion matrix is plotted to visualize the model’s classification performance.
Strategy Implementation: The ensemble predictions drive the trading strategy. Class 1 opens a long position and class 2 opens a short position, each with a 10% stop-loss and a 2.5% take-profit.
Backtesting: The trading strategy is backtested on the new dataset representing Ethereum price data.
Result Analysis: The backtest results are analyzed to evaluate the performance of the ensemble trading strategy on the Ethereum dataset.
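The stop-loss and take-profit levels in the strategy step are simple percentages of the entry price. The sketch below reproduces that arithmetic outside the backtesting library and also works out the break-even win rate for a 10% stop and a 2.5% target. The helper names and the 40,000 entry price are hypothetical, and the break-even figure assumes every trade closes exactly at one of the two levels with fees ignored, which is a simplification.

```python
def bracket_prices(close, side, sl_pct=0.10, tp_pct=0.025):
    """Return (stop_loss, take_profit) for a 'long' or 'short' entry at `close`."""
    if side == "long":
        return close * (1 - sl_pct), close * (1 + tp_pct)
    if side == "short":
        return close * (1 + sl_pct), close * (1 - tp_pct)
    raise ValueError("side must be 'long' or 'short'")


def break_even_win_rate(sl_pct=0.10, tp_pct=0.025):
    """Win rate at which expected profit is zero if trades only exit at SL or TP."""
    return sl_pct / (sl_pct + tp_pct)


# Hypothetical entry price, for illustration only
long_sl, long_tp = bracket_prices(40_000.0, "long")
short_sl, short_tp = bracket_prices(40_000.0, "short")
threshold = break_even_win_rate()

print(f"long:  SL={long_sl:.2f} TP={long_tp:.2f}")
print(f"short: SL={short_sl:.2f} TP={short_tp:.2f}")
print(f"break-even win rate: {threshold:.0%}")
```

With a loss four times the size of a win, roughly four wins are needed to cover one loss, which is where the 80% threshold comes from. The win rates reported in this article (78.63%, 81.46%, and 80.55%) are close to that threshold; because real trades can also close at other prices, the actual profit factor depends on the full distribution of trades.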

By following these steps, the code aims to demonstrate how an ensemble of multiple models can be leveraged to improve prediction accuracy and formulate effective trading strategies for cryptocurrency price data, specifically focusing on Ethereum.
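One detail of the ensemble step is worth illustrating. Averaging class labels can produce a class that neither model predicted: labels 0 and 2 average to 1. A common alternative is soft voting, where the per-class probabilities are averaged first and the class is picked afterwards. The sketch below contrasts the two on made-up probabilities. Soft voting is not what the code in this article does, the function names are hypothetical, and I have not re-run the backtests with it.

```python
def label_average(label_a, label_b, w_a=0.5, w_b=0.5):
    """Mirror of the combination used above: weighted mean of labels, truncated to int."""
    return int(label_a * w_a + label_b * w_b)


def soft_vote(probs_a, probs_b, w_a=0.5, w_b=0.5):
    """Weighted mean of per-class probabilities, then pick the most probable class."""
    blended = [w_a * pa + w_b * pb for pa, pb in zip(probs_a, probs_b)]
    return max(range(len(blended)), key=blended.__getitem__)


# Hypothetical model outputs for one candle: [P(class 0), P(class 1), P(class 2)]
tcn_probs = [0.70, 0.10, 0.20]   # argmax is class 0
lstm_probs = [0.05, 0.15, 0.80]  # argmax is class 2

averaged_label = label_average(0, 2)
voted_label = soft_vote(tcn_probs, lstm_probs)

print(averaged_label)  # 1: a class neither model predicted
print(voted_label)     # 2
```

Since class 1 triggers a buy in the strategy above, this difference can matter whenever the two models disagree between class 0 and class 2.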

confusion matrix and classification report for Ethereum dataset using ensemble method with LSTM, TCN neural network models

backtest results for Ethereum dataset using ensemble method with LSTM, TCN neural network models

The backtest results for the Ethereum dataset showcase remarkable performance, even though the models never saw this data during training. Here’s a breakdown of the key metrics:

Duration and Exposure: The backtest covers a duration of 1023 days, and the strategy held a position for approximately 99.81% of that time.
Returns: The strategy returned a staggering 11272.93%, far above the buy and hold return of 120.14%.
Annualized Return: The annualized return stands at an impressive 431.42%, the equivalent yearly growth rate if the gains were compounded.
Volatility: The annualized volatility of the strategy’s returns is very high at 795.97%.
Risk-Adjusted Measures: The Sharpe ratio, a measure of risk-adjusted return, is 0.542, lower than in the Bitcoin backtests because the high returns came with even higher volatility. The Sortino ratio, which counts only downside volatility, is notably high at 7.25.
Drawdowns: The maximum drawdown, representing the largest drop from a peak to a trough, is 53.38%, with an average drawdown of 4.89%. The longest drawdown lasted for 346 days.
Trading Activity: The backtest executed a total of 2288 trades, with a win rate of 80.55%, so the majority of trades were profitable.
Trade Performance: The best trade yielded a return of 2.53%, while the worst trade resulted in a loss of 10.52%, in line with the 2.5% take-profit and 10% stop-loss. On average, each trade generated a return of 0.25%.
Profit Factor and SQN: The profit factor, the ratio of gross profits to gross losses, is 1.23, so the strategy was profitable overall. The SQN (System Quality Number) is 1.49, lower than the 1.98 and 2.04 recorded in the Bitcoin backtests.
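Win rate, profit factor, and SQN can all be derived from the list of per-trade profits. This is a minimal sketch on made-up trade results, with a hypothetical helper name. The SQN here is the square root of the trade count times the mean profit divided by the sample standard deviation; that the backtesting library uses exactly this variant is my assumption.

```python
import math
import statistics


def trade_stats(pnl):
    """Return (win_rate_pct, profit_factor, sqn) for a list of per-trade profits."""
    wins = [p for p in pnl if p > 0]
    losses = [p for p in pnl if p < 0]
    win_rate = 100.0 * len(wins) / len(pnl)
    gross_loss = abs(sum(losses))
    profit_factor = sum(wins) / gross_loss if gross_loss else float("inf")
    sqn = math.sqrt(len(pnl)) * statistics.mean(pnl) / statistics.stdev(pnl)
    return win_rate, profit_factor, sqn


# Made-up trades: nine small wins and one loss four times as large
pnl = [250.0, 250.0, 250.0, 250.0, -1000.0, 250.0, 250.0, 250.0, 250.0, 250.0]
win_rate, profit_factor, sqn = trade_stats(pnl)

print(f"win rate={win_rate:.1f}% profit factor={profit_factor:.2f} SQN={sqn:.2f}")
```

The toy numbers show why a high win rate alone says little: a single loss four times the size of a win pulls the SQN down sharply even at a 90% win rate.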

Overall, the backtest results demonstrate the effectiveness of the model’s trading strategy on the Ethereum dataset, showcasing remarkable returns, a high win rate, and favorable risk-adjusted measures despite the inherent volatility of the cryptocurrency market.

Conclusion:

In conclusion, the ensemble method combining TCN and LSTM neural network models has demonstrated exceptional performance across various datasets, outperforming individual models and even surpassing buy and hold strategies. This underscores the effectiveness of ensemble learning in improving prediction accuracy and robustness.

Moving forward, there are several enhancements that can be implemented to further improve results:

Model Architecture: Experiment with different architectures, hyperparameters, and activation functions to enhance the models’ learning capabilities.
Feature Engineering: Explore additional features or engineered features that could provide valuable insights and improve prediction accuracy.
Ensemble Techniques: Implement more sophisticated ensemble techniques such as stacking, blending, or bagging to leverage the strengths of different models.
Advanced Optimization: Employ advanced optimization algorithms and techniques to fine-tune model parameters and improve convergence speed.
Regularization: Apply regularization techniques such as dropout, L2 regularization, or early stopping to prevent overfitting and improve generalization.
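As one example from the list above, early stopping halts training once the validation loss stops improving for a set number of epochs. The sketch below shows only the stopping rule on a made-up loss curve, with a hypothetical function name; in practice Keras offers an EarlyStopping callback that applies the same idea during training.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the epoch index where training would stop, or None if it never triggers."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss      # new best validation loss: reset the counter
            waited = 0
        else:
            waited += 1      # no improvement this epoch
            if waited >= patience:
                return epoch
    return None


# Made-up validation losses: the best value is at epoch 2, then three worse epochs
val_losses = [0.92, 0.81, 0.77, 0.78, 0.79, 0.80, 0.76]
stop_at = early_stopping_epoch(val_losses, patience=3)

print(stop_at)  # 5
```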

For readers interested in enhancing their skills in machine learning, artificial intelligence, and deep learning, here are some resources:

Books:

“Deep Learning” by Ian Goodfellow, Yoshua Bengio, and Aaron Courville.
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow” by Aurélien Géron.
“Pattern Recognition and Machine Learning” by Christopher M. Bishop.

Online Courses:

Coursera: Machine Learning by Andrew Ng.
Udacity: Deep Learning Nanodegree program.
edX: Artificial Intelligence courses offered by various universities.

Video Tutorials:

YouTube channels like “3Blue1Brown” for intuitive explanations of deep learning concepts.
TensorFlow and PyTorch official channels for tutorials and guides on using these frameworks.

By continuously learning and experimenting with new techniques and methodologies, one can stay at the forefront of machine learning and AI advancements.

These resources will provide a comprehensive foundation for understanding the technical aspects of algo trading and the application of Python in finance. Additionally, participating in online forums and communities such as Stack Overflow, GitHub, and Reddit’s r/algotrading can offer practical insights and peer support.

Thank you, Readers.

I hope you have found this article on Algorithmic strategy to be informative and helpful. As a creator, I am dedicated to providing valuable insights and analysis on cryptocurrency, stock market and other assets management.
If you have enjoyed this article and would like to support my ongoing efforts, I would be honored to have you as a member of my Patreon community. As a member, you will have access to exclusive content, early access to new analysis, and the opportunity to be a part of shaping the direction of my research.
Membership starts at just $10, and you can choose to contribute on a monthly basis. Your support will help me to continue to produce high-quality content and bring you the latest insights on financial analytics.
Patreon — https://patreon.com/pppicasso

Regards,

Puranam Pradeep Picasso

Linkedin — https://www.linkedin.com/in/puranampradeeppicasso/

Patreon — https://patreon.com/pppicasso

Facebook — https://www.facebook.com/puranam.p.picasso/

Twitter — https://twitter.com/picasso_999
