ANN problem: I am building an ANN model to predict the profit of a new startup based on certain features

The image of the dataset

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

Loading the dataset with pandas as a DataFrame

import pandas as pd
df = pd.read_csv(r"E:_Startups.csv")
df.drop(['State'],axis = 1, inplace = True)

from sklearn.preprocessing import MinMaxScaler
mm = MinMaxScaler()
df.iloc[:,:] = mm.fit_transform(df.iloc[:,:])
info = df.describe()

x = df.iloc[:,:-1].values
y = df.iloc[:,-1].values

from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split( x,y, test_size=0.2, random_state=42)

Initializing the model

model = Sequential()
model.add(Dense(40,input_dim =3,activation="relu",kernel_initializer='he_normal'))
model.add(Dense(30,activation="relu"))
model.add(Dense(1))
model.compile(loss="mean_squared_error",optimizer="adam",metrics=["accuracy"])

Fitting the model on the training data

model.fit(x=x_train,y=y_train,epochs=150, batch_size=32,verbose=1)

Evaluating the model on the test data

eval_score_test = model.evaluate(x_test,y_test,verbose = 1)

I am getting an accuracy of zero.

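The problem is that accuracy is a metric for discrete values (classification). A regression model predicts continuous values, which almost never match the target exactly, so accuracy comes out as zero.

You should use the R² score, MAPE, or SMAPE instead.

For example: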

model.compile(loss="mean_squared_error",optimizer="adam",metrics=["mean_absolute_percentage_error"])
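
To make the difference concrete, here is a minimal NumPy sketch of the three metrics next to an exact-match "accuracy". The sample values are made up for illustration, and the SMAPE formula shown is only one of several definitions in use.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (undefined if y_true has zeros)."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def smape(y_true, y_pred):
    """Symmetric MAPE, in percent (one common definition)."""
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    return 100.0 * np.mean(np.abs(y_true - y_pred) / denom)

# Hypothetical targets and predictions, not taken from the Startups dataset
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 330.0, 360.0])

# Exact-match accuracy: continuous predictions never equal the targets here
exact_match = float(np.mean(y_true == y_pred))

print("exact-match accuracy:", exact_match)
print("R2:", r2_score(y_true, y_pred))
print("MAPE (%):", mape(y_true, y_pred))
print("SMAPE (%):", smape(y_true, y_pred))
```

The predictions are reasonably close to the targets, yet the exact-match accuracy is still zero, which is the behaviour described in the question.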

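Adding to @GuintherKovalski's answer: accuracy is not meant for regression, but if you still want to use it, you can combine it with a threshold using the following steps:

  1. Set a threshold. A prediction counts as correct if the absolute difference between the predicted and actual values is less than or equal to the threshold, and as incorrect otherwise.
  2. For example, predicted values = [0.3, 0.7, 0.8, 0.2], original values = [0.2, 0.8, 0.5, 0.4]. The absolute differences are [0.1, 0.1, 0.3, 0.2]; take a threshold of 0.2. With this threshold, correct -> [1, 1, 0, 1], and the accuracy is correct.sum()/len(correct), i.e. 3/4 -> 0.75.

This can be implemented in TensorFlow like this: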

import numpy as np
import tensorflow as tf
from sklearn.datasets import make_regression

data = make_regression(10000)

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(100,))])

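def custom_metric(y_true, y_pred):
    threshold = 1  # Choose according to the scale of the target
    abs_diff = tf.abs(y_pred - y_true)
    # A prediction is correct when it lies within the threshold of the target
    correct = abs_diff <= threshold
    correct = tf.cast(correct, dtype=tf.float32)
    return tf.math.reduce_mean(correct)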

model.compile('adam', 'mae', metrics=[custom_metric])
model.fit(data[0], data[1], epochs=30, batch_size=32)
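
The worked example from the steps above can be checked outside of a model with plain NumPy. This is only a sketch; the helper name `threshold_accuracy` is made up for illustration.

```python
import numpy as np

def threshold_accuracy(y_true, y_pred, threshold):
    """Fraction of predictions within `threshold` of the true value."""
    abs_diff = np.abs(np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float))
    correct = abs_diff <= threshold
    return float(correct.mean())

# Values from the worked example
predicted = [0.3, 0.7, 0.8, 0.2]
original = [0.2, 0.8, 0.5, 0.4]

acc = threshold_accuracy(original, predicted, threshold=0.2)
print(acc)  # 0.75
```

Because differences that land exactly on the threshold depend on floating-point rounding, it is probably safer to pick a threshold that is not a value the differences are likely to hit exactly.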

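I just want to say thank you to everyone who took their valuable time to help me. I am posting this code because it worked for me, and I hope it helps anyone who is stuck somewhere looking for an answer. I arrived at this code after discussing it with a friend.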

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
import pandas as pd
from sklearn.model_selection import train_test_split

# Loading the data set using pandas as data frame format 
startups = pd.read_csv(r"E:[=10=]Assignments\DL_assign_Startups.csv")
startups = startups.drop("State", axis =1)

train, test = train_test_split(startups, test_size = 0.2)

x_train = train.iloc[:,0:3].values.astype("float32")
x_test = test.iloc[:,0:3].values.astype("float32")
y_train = train.Profit.values.astype("float32")
y_test = test.Profit.values.astype("float32")

def norm_func(i):
     x = ((i-i.min())/(i.max()-i.min()))
     return (x)

x_train = norm_func(x_train)
x_test = norm_func(x_test)
y_train = norm_func(y_train)
y_test = norm_func(y_test)

# one hot encoding outputs for both train and test data sets 
y_train = np_utils.to_categorical(y_train)
y_test = np_utils.to_categorical(y_test)

# Storing the number of classes into the variable num_of_classes 
num_of_classes = y_test.shape[1]
x_train.shape
y_train.shape
x_test.shape
y_test.shape

# Creating a user-defined function that returns the model
# used to train the ANN
def design_mlp():
    # Initializing the model 
    model = Sequential()
    model.add(Dense(500,input_dim =3,activation="relu"))
    model.add(Dense(200,activation="tanh"))
    model.add(Dense(100,activation="tanh"))
    model.add(Dense(50,activation="tanh"))
    model.add(Dense(num_of_classes,activation="linear"))
    model.compile(loss="mean_squared_error",optimizer="adam",metrics=["accuracy"])
    return model

# building the MLP model defined above (trained on the train set, evaluated on the test set below)
model = design_mlp()

# fitting model on train data
model.fit(x=x_train,y=y_train,batch_size=100,epochs=10)

# Evaluating the model on test data
eval_score_test = model.evaluate(x_test,y_test,verbose = 1)
print ("Test accuracy: %.3f%%" %(eval_score_test[1]*100))

# accuracy score on train data
eval_score_train = model.evaluate(x_train,y_train,verbose=0)
print ("Train accuracy: %.3f%%" %(eval_score_train[1]*100))
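
One point that may be worth checking in the code above: as far as I know, `to_categorical` converts its input to integers before one-hot encoding. If that holds, a min-max normalized target collapses to class 0 for every row except the maximum, and the reported accuracy describes a two-class problem rather than the quality of the profit prediction. A minimal NumPy sketch with made-up profit values, where `astype(int)` stands in for that cast (an assumption, no Keras involved):

```python
import numpy as np

# Hypothetical profit values standing in for the Profit column
profit = np.array([192261.83, 149759.96, 105733.54, 64926.08, 14681.40],
                  dtype="float32")

# Same min-max normalization as norm_func
normalized = (profit - profit.min()) / (profit.max() - profit.min())

# Assumption: mimic the integer cast applied before one-hot encoding;
# values in [0, 1) truncate to 0 and only the maximum becomes 1
labels = normalized.astype(int)
num_classes = int(labels.max()) + 1

# One-hot encode the integer labels
one_hot = np.eye(num_classes, dtype="float32")[labels]

print(labels)
print(one_hot.shape)
```

If the goal is to judge how close the predicted profits are, keeping the target continuous and reporting a regression metric such as MAE or MAPE is likely the more informative choice.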