Categorical Embedding
This section provides examples for using categorical embeddings with pytorch-tabnet2.
Categorical Embedding Example: Classification
This guide demonstrates how to use categorical features with embeddings in TabNet for a classification task.
import numpy as np
from pytorch_tabnet.tab_model import TabNetClassifier
# Generate dummy data
X_train = np.random.randint(0, 5, size=(100, 3)) # 3 categorical features with 5 categories each
X_train = np.concatenate([
    X_train,
    np.random.rand(100, 7)  # 7 continuous features
], axis=1).astype(np.float32)
y_train = np.random.randint(0, 2, size=(100,))
# Specify categorical feature indices and their dimensions
cat_idxs = [0, 1, 2] # indices of categorical columns
cat_dims = [5, 5, 5] # number of unique values for each categorical column
model = TabNetClassifier(cat_idxs=cat_idxs, cat_dims=cat_dims)
model.fit(X_train, y_train)
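Because `cat_idxs` and `cat_dims` are written by hand, it can help to check them against the data before calling `fit`. The following is a minimal sketch, not part of the library: `check_categorical_columns` is a hypothetical helper, and it assumes that each categorical code is used as an index into an embedding table with `cat_dims[i]` rows, so every code must lie in the range 0 to `cat_dims[i] - 1`.

```python
import numpy as np


def check_categorical_columns(X, cat_idxs, cat_dims):
    """Hypothetical helper: raise ValueError if a categorical column
    is not integer-coded in the range [0, cat_dim)."""
    if len(cat_idxs) != len(cat_dims):
        raise ValueError("cat_idxs and cat_dims must have the same length")
    for idx, dim in zip(cat_idxs, cat_dims):
        col = X[:, idx]
        # Codes must be whole numbers, even when X is stored as float32.
        if not np.all(col == np.floor(col)):
            raise ValueError(f"column {idx} contains non-integer values")
        # Assumption: codes index a table with `dim` rows.
        if col.min() < 0 or col.max() >= dim:
            raise ValueError(f"column {idx} has codes outside [0, {dim})")


# Same layout as the example above: 3 categorical + 7 continuous columns.
rng = np.random.default_rng(0)
X = np.concatenate(
    [rng.integers(0, 5, size=(100, 3)), rng.random((100, 7))], axis=1
).astype(np.float32)

check_categorical_columns(X, cat_idxs=[0, 1, 2], cat_dims=[5, 5, 5])
```

The call returns silently when the columns are consistent and raises a `ValueError` naming the offending column otherwise.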
Categorical Embedding Example: Regression
This guide demonstrates how to use categorical features with embeddings in TabNet for a regression task.
import numpy as np
from pytorch_tabnet.tab_model import TabNetRegressor
# Generate dummy data
X_train = np.random.randint(0, 4, size=(100, 2)) # 2 categorical features with 4 categories each
X_train = np.concatenate([
    X_train,
    np.random.rand(100, 8)  # 8 continuous features
], axis=1).astype(np.float32)
y_train = np.random.rand(100)
# Specify categorical feature indices and their dimensions
cat_idxs = [0, 1] # indices of categorical columns
cat_dims = [4, 4] # number of unique values for each categorical column
# Reshape y_train to 2D as required by TabNetRegressor
y_train = y_train.reshape(-1, 1)
model = TabNetRegressor(cat_idxs=cat_idxs, cat_dims=cat_dims)
model.fit(X_train, y_train)
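The reshape step in the regression example can be wrapped so that it is safe to apply whether the target is already 2D or not. This is a small sketch under the assumption that the regressor expects targets of shape `(n_samples, n_targets)`; `as_2d_target` is a hypothetical name, not a library function.

```python
import numpy as np


def as_2d_target(y):
    """Hypothetical helper: return y with shape (n_samples, n_targets).
    A 1D array becomes a single column; a 2D array is returned unchanged."""
    y = np.asarray(y)
    if y.ndim == 1:
        return y.reshape(-1, 1)
    if y.ndim == 2:
        return y
    raise ValueError(f"expected a 1D or 2D target, got {y.ndim} dimensions")


y_train = as_2d_target(np.random.rand(100))
print(y_train.shape)  # (100, 1)
```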
Note
When using categorical features, ensure that each categorical column is integer-encoded with values from 0 to N-1, where N is the number of categories given for that column in cat_dims.
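Raw data often stores categories as strings rather than integers. One way to produce the 0 to N-1 encoding is `np.unique` with `return_inverse=True`; the sketch below uses a hypothetical helper, `encode_categoricals`, and made-up example columns.

```python
import numpy as np


def encode_categoricals(raw_columns):
    """Hypothetical helper: map each column of raw labels to integer
    codes 0..N-1 and return the codes together with cat_dims."""
    encoded, cat_dims = [], []
    for column in raw_columns:
        # np.unique returns the sorted unique labels and, for each entry,
        # the position of its label in that sorted array.
        labels, codes = np.unique(np.asarray(column), return_inverse=True)
        encoded.append(codes.reshape(-1))
        cat_dims.append(len(labels))
    return np.stack(encoded, axis=1), cat_dims


colors = ["red", "green", "red", "blue"]  # illustrative data
sizes = ["S", "M", "M", "L"]              # illustrative data

X_cat, cat_dims = encode_categoricals([colors, sizes])
print(X_cat.tolist())  # [[2, 2], [1, 1], [2, 1], [0, 0]]
print(cat_dims)        # [3, 3]
```

Codes follow the sorted order of the labels, so "blue" maps to 0, "green" to 1, and "red" to 2. The same label-to-code mapping has to be reused for validation and test data; otherwise the same category would receive different codes in different splits.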
More categorical embedding guides coming soon!