# Quick-Start

## Installation Guide

### CPU version

Install the deepctr package with pip: `pip install deepctr`



## Getting started: 4 steps to DeepCTR

### Step 1: Import model

import pandas as pd
from sklearn.preprocessing import LabelEncoder, MinMaxScaler
from sklearn.model_selection import train_test_split
from deepctr.models import DeepFM
from deepctr import SingleFeat

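sparse_features = ['C' + str(i) for i in range(1, 27)]
dense_features = ['I' + str(i) for i in range(1, 14)]

# `data` is assumed to be a pandas DataFrame that already holds these columns
# plus a 'label' column (for example, loaded with pd.read_csv).
data[sparse_features] = data[sparse_features].fillna('-1')
data[dense_features] = data[dense_features].fillna(0)
target = ['label']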


### Step 2: Simple preprocessing

There are two simple ways to encode sparse categorical features for embedding:

• Label Encoding: map each feature value to an integer from 0 to len(#unique) - 1
• Hash Encoding: map each feature value into a fixed range, such as 0 to 9999
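The difference between the two encodings can be illustrated with a minimal pure-Python sketch. The helper names `label_encode` and `hash_encode` are hypothetical and are not part of DeepCTR or scikit-learn. The choice of md5 is only an assumption made here to get a hash that is stable across runs.

```python
import hashlib


def label_encode(values):
    """Map each distinct value to an integer in 0 .. n_unique - 1."""
    # Sort the distinct values so the mapping is deterministic.
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def hash_encode(values, num_buckets=10000):
    """Map each value into the fixed range 0 .. num_buckets - 1."""
    # md5 is stable across runs, unlike the built-in hash() for strings.
    return [
        int(hashlib.md5(str(v).encode("utf-8")).hexdigest(), 16) % num_buckets
        for v in values
    ]


colors = ["red", "green", "red", "blue"]
label_ids, mapping = label_encode(colors)
hash_ids = hash_encode(colors, num_buckets=10000)
print(label_ids)  # [2, 1, 2, 0]
print(hash_ids)   # four integers below 10000; equal inputs give equal ids
```

Label encoding needs a pass over the data to build the vocabulary, but it never maps two values to the same id. Hash encoding needs no vocabulary, but distinct values may collide in the same bucket.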

Dense numerical features are usually discretized into buckets; here we normalize them instead.

for feat in sparse_features:
    lbe = LabelEncoder()  # or hash encoding
    data[feat] = lbe.fit_transform(data[feat])
mms = MinMaxScaler(feature_range=(0, 1))
data[dense_features] = mms.fit_transform(data[dense_features])
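To make the two dense-feature options concrete, the following sketch contrasts min-max normalization with bucket discretization in plain Python. The helpers `min_max_scale` and `bucketize` and the bucket boundaries are illustrative assumptions, not DeepCTR or scikit-learn functions. The scaling follows the usual (x - min) / (max - min) formula for a (0, 1) range.

```python
from bisect import bisect_right


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bucketize(values, boundaries):
    """Return, for each value, the index of the bucket it falls into."""
    # boundaries must be sorted; a value equal to a boundary goes to the upper bucket.
    return [bisect_right(boundaries, v) for v in values]


ages = [18, 25, 40, 62]
scaled = min_max_scale(ages)
buckets = bucketize(ages, boundaries=[20, 30, 50])
print(scaled)
print(buckets)  # [0, 1, 2, 3]
```

Bucketed values can be treated as another sparse feature with its own embedding, whereas normalized values stay continuous.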


### Step 3: Generate feature config dict

Sparse features are transformed into dense vectors by embedding. For dense numerical features, we add a dummy index, as in LIBFM; that is, all dense features under the same field share the same embedding vector. Some implementations instead concatenate the dense features to the input embedding vectors of the deep network; you can modify the code to do this yourself.

sparse_feature_list = [SingleFeat(feat, data[feat].nunique())
                       for feat in sparse_features]
dense_feature_list = [SingleFeat(feat, 0)
                      for feat in dense_features]
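As a rough picture of what these two lists contain, the sketch below builds the same structure on toy data with a stand-in record type. `FeatSpec` is hypothetical and only mimics `SingleFeat`: the `name` attribute matches how this guide uses `feat.name`, while the field name `dimension` is an assumption made for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for SingleFeat. The field name `dimension` is an
# assumption; only `.name` is used elsewhere in this guide.
FeatSpec = namedtuple("FeatSpec", ["name", "dimension"])

# Toy columns that have already been label-encoded / normalized.
columns = {
    "C1": [0, 1, 0, 2],
    "C2": [1, 1, 0, 1],
    "I1": [0.1, 0.5, 0.0, 1.0],
}
sparse_names = ["C1", "C2"]
dense_names = ["I1"]

# Sparse features record their vocabulary size (number of distinct values).
sparse_specs = [FeatSpec(n, len(set(columns[n]))) for n in sparse_names]
# Dense features use 0, since they have no vocabulary.
dense_specs = [FeatSpec(n, 0) for n in dense_names]

print(sparse_specs)
print(dense_specs)
```

The vocabulary size tells the model how many rows each embedding table needs, which is why it is computed from the data rather than hard-coded.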


### Step 4: Generate the training samples and train the model

There are two rules that we must follow here:

• The sparse features are placed in front of the dense features.
• The order of the features fed into the model must be consistent with the order of the feature config list.

train, test = train_test_split(data, test_size=0.2)
train_model_input = [train[feat.name].values for feat in sparse_feature_list] + \
    [train[feat.name].values for feat in dense_feature_list]
test_model_input = [test[feat.name].values for feat in sparse_feature_list] + \
    [test[feat.name].values for feat in dense_feature_list]

model = DeepFM({"sparse": sparse_feature_list,
                "dense": dense_feature_list}, final_activation='sigmoid')
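The two ordering rules can be checked with a small helper before the inputs reach the model. The sketch below is a pure-Python illustration on toy data. `build_model_input` is a hypothetical name, not a DeepCTR function, and plain lists stand in for the DataFrame columns.

```python
def build_model_input(columns, sparse_names, dense_names):
    """Return feature names and their value lists, sparse features first."""
    # Rule 1: sparse features come before dense features.
    ordered_names = list(sparse_names) + list(dense_names)
    missing = [n for n in ordered_names if n not in columns]
    if missing:
        raise KeyError(f"missing feature columns: {missing}")
    # Rule 2: values follow exactly the order of the feature config lists.
    return ordered_names, [columns[n] for n in ordered_names]


# The dict order is deliberately different from the required input order.
columns = {"I1": [0.2, 0.9], "C2": [1, 0], "C1": [0, 2]}
names, model_input = build_model_input(columns, ["C1", "C2"], ["I1"])
print(names)        # ['C1', 'C2', 'I1']
print(model_input)  # [[0, 2], [1, 0], [0.2, 0.9]]
```

Deriving the input list from the same name lists that produced the feature config keeps the two in step, so reordering columns in the DataFrame cannot silently misalign the inputs.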