.sig_test.split_test
in Boston Housing price regression dataset
[1]:
import numpy as np
import tensorflow.keras as keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras import backend as K
import time
from sklearn.model_selection import train_test_split
from tensorflow.keras.optimizers import Adam, SGD
from dnn_inference.sig_test import split_test
import tensorflow as tf
Boston house prices dataset
Data Set Characteristics:
Number of Instances: 506
Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.
Attribute Information (in order)
- CRIM per capita crime rate by town
- ZN proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS proportion of non-retail business acres per town
- CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- NOX nitric oxides concentration (parts per 10 million)
- RM average number of rooms per dwelling
- AGE proportion of owner-occupied units built prior to 1940
- DIS weighted distances to five Boston employment centres
- RAD index of accessibility to radial highways
- TAX full-value property-tax rate per USD10,000
- PTRATIO pupil-teacher ratio by town
- B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
- LSTAT % lower status of the population
- MEDV Median value of owner-occupied homes in $1000's
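The tests further down select features by column index, so it can help to map indices back to attribute names. The sketch below assumes the columns of `x_train` follow the attribute order listed above; `FEATURE_NAMES` and `names_for` are illustrative helpers, not part of `dnn_inference`.

```python
import numpy as np

# Attribute order as listed above (assumed to match the column order of x_train).
FEATURE_NAMES = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

def names_for(index_group):
    """Return the attribute names for an iterable of column indices."""
    return [FEATURE_NAMES[i] for i in index_group]

# The two index groups tested later in this notebook.
inf_feats = [np.arange(3), np.arange(5, 11)]
for k, group in enumerate(inf_feats):
    print(k, names_for(group))
```

Under that assumption, the 0-th hypothesis concerns CRIM, ZN and INDUS, and the 1-th hypothesis concerns RM through PTRATIO.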
[2]:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.boston_housing.load_data(
    path="boston_housing.npz", test_split=0.1)
# Reshape the targets to column vectors of shape (n, 1)
y_train, y_test = y_train[:, np.newaxis], y_test[:, np.newaxis]
from sklearn import preprocessing
# Rescale every feature to [0, 1] based on the training data
scaler = preprocessing.MinMaxScaler()
x_train = scaler.fit_transform(x_train)
[3]:
n, d = x_train.shape
print('num of samples: %d, dim: %d' %(n, d))
num of samples: 455, dim: 13
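For readers unfamiliar with min-max scaling, the following is a minimal NumPy sketch of the transformation applied to the features above. The helper `min_max_scale` and the toy array are illustrative only; the notebook itself relies on `MinMaxScaler`.

```python
import numpy as np

def min_max_scale(x):
    """Rescale each column of x to [0, 1]; constant columns map to 0."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid division by zero for constant columns
    return (x - col_min) / col_range

# Toy data: two features on very different scales.
toy = np.array([[1.0, 10.0],
                [2.0, 30.0],
                [4.0, 20.0]])
scaled = min_max_scale(toy)
print(scaled)
```

After scaling, each column has minimum 0 and maximum 1, which keeps features such as TAX and NOX on comparable scales.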
[4]:
from tensorflow import keras
from tensorflow.keras import layers
def build_model():
    model = keras.Sequential([
        layers.Dense(8, activation='relu', input_shape=[d]),
        layers.BatchNormalization(),
        layers.Dense(1, activation='relu')
    ])
    optimizer = tf.keras.optimizers.Adam(1e-3)
    model.compile(loss='mse',
                  optimizer=optimizer,
                  metrics=['mae', 'mse'])
    return model
[5]:
model_null, model_alter = build_model(), build_model()
2022-06-30 12:53:52.290176: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-06-30 12:53:52.296167: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-06-30 12:53:52.642007: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3022 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5
[6]:
from tensorflow.keras.callbacks import EarlyStopping
es = EarlyStopping(monitor='val_loss', mode='min',
                   verbose=0, patience=300, restore_best_weights=True)
## fitting params
fit_params = {'callbacks': [es],
              'epochs': 3000,
              'batch_size': 32,
              'validation_split': .2,
              'verbose': 0}
## testing params
test_params = {'split': "one-split",
               'inf_ratio': None,
               'perturb': None,
               'cv_num': 2,
               'cp': 'hommel',
               'verbose': 2}
## tuning params
tune_params = {'num_perm': 100,
               'ratio_grid': [.2, .4, .6, .8],
               'if_reverse': 1,
               'perturb_range': 2.**np.arange(-3,3,.3),
               'tune_ratio_method': 'fuse',
               'tune_pb_method': 'fuse',
               'cv_num': 2,
               'cp': 'hommel',
               'verbose': 2}
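The `perturb_range` entry above is a geometric grid: the exponents step by 0.3, so each candidate perturbation level is about 1.23 times the previous one. A quick way to inspect the grid on its own:

```python
import numpy as np

# Same expression as in tune_params: powers of two with exponents -3, -2.7, -2.4, ...
perturb_range = 2.0 ** np.arange(-3, 3, 0.3)

print(len(perturb_range))
print(np.round(perturb_range[:4], 3))   # smallest candidate levels
print(np.round(perturb_range[-1], 3))   # largest candidate level
```

The smallest level is 2**-3 = 0.125, matching the first perturbation level reported in the tuning log.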
[7]:
inf_feats = [np.arange(3), np.arange(5,11)]
cue = split_test(inf_feats=inf_feats, model_null=model_null, model_alter=model_alter, eva_metric='mse')
P_value = cue.testing(x_train, y_train, fit_params, test_params, tune_params)
INFO:tensorflow:Assets written to: ./saved/split_test/06-30_12-53/model_null_init/assets
2022-06-30 12:53:59.491087: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
INFO:tensorflow:Assets written to: ./saved/split_test/06-30_12-53/model_alter_init/assets
==================== one-split test for 0-th Hypothesis ====================
(tuneHP: ratio) Est. Type 1 error: 0.150; inf sample ratio: 0.800
(tuneHP: ratio) Est. Type 1 error: 0.270; inf sample ratio: 0.600
(tuneHP: ratio) Est. Type 1 error: 0.000; inf sample ratio: 0.400
(tuneHP: ratio) Done with inf sample ratio: 0.400
(tuneHP: pb) Est. Type 1 error: 0.000; perturbation level: 0.125
(tuneHP: pb) Done with inf pb level: 0.125
cv: 0; p_value: 0.03247; loss_null: 14.63722(51.76234); loss_alter: 18.41512(42.01610)
cv: 1; p_value: 0.01776; loss_null: 14.65350(46.57168); loss_alter: 21.16354(51.60448)
🧪 0-th Hypothesis: reject H0 with p_value: 0.049
==================== one-split test for 1-th Hypothesis ====================
(tuneHP: ratio) Est. Type 1 error: 1.000; inf sample ratio: 0.800
(tuneHP: ratio) Est. Type 1 error: 0.000; inf sample ratio: 0.600
(tuneHP: ratio) Done with inf sample ratio: 0.600
(tuneHP: pb) Est. Type 1 error: 0.700; perturbation level: 0.125
(tuneHP: pb) Est. Type 1 error: 0.730; perturbation level: 0.154
(tuneHP: pb) Est. Type 1 error: 0.700; perturbation level: 0.189
(tuneHP: pb) Est. Type 1 error: 0.700; perturbation level: 0.233
(tuneHP: pb) Est. Type 1 error: 0.610; perturbation level: 0.287
(tuneHP: pb) Est. Type 1 error: 0.580; perturbation level: 0.354
(tuneHP: pb) Est. Type 1 error: 0.470; perturbation level: 0.435
(tuneHP: pb) Est. Type 1 error: 0.460; perturbation level: 0.536
(tuneHP: pb) Est. Type 1 error: 0.390; perturbation level: 0.660
(tuneHP: pb) Est. Type 1 error: 0.290; perturbation level: 0.812
(tuneHP: pb) Est. Type 1 error: 0.180; perturbation level: 1.000
(tuneHP: pb) Est. Type 1 error: 0.190; perturbation level: 1.231
(tuneHP: pb) Est. Type 1 error: 0.170; perturbation level: 1.516
(tuneHP: pb) Est. Type 1 error: 0.070; perturbation level: 1.866
(tuneHP: pb) Est. Type 1 error: 0.070; perturbation level: 2.297
(tuneHP: pb) Est. Type 1 error: 0.030; perturbation level: 2.828
(tuneHP: pb) Done with inf pb level: 2.828
cv: 0; p_value: 0.12088; loss_null: 31.87867(120.14606); loss_alter: 34.30483(86.49743)
cv: 1; p_value: 0.83584; loss_null: 33.45124(124.96815); loss_alter: 33.72868(86.60861)
🧪 1-th Hypothesis: accept H0 with p_value: 0.363
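Each hypothesis is tested on `cv_num` = 2 splits, and the per-split p-values are then merged into one. One plausible reading of `'cp': 'hommel'` is a Hommel-type combination: sort the p-values, take the minimum of (K/k) * p_(k), and multiply by the harmonic constant 1 + 1/2 + ... + 1/K. This is an assumption about the library's internals rather than a copy of its code, and `hommel_combine` is a hypothetical helper, but applied to the rounded per-split values in the log it agrees with the reported combined p-values to within rounding.

```python
import numpy as np

def hommel_combine(p_values):
    """Hommel-type combination of K p-values (sketch; assumed form)."""
    p = np.sort(np.asarray(p_values, dtype=float))
    K = len(p)
    ranks = np.arange(1, K + 1)
    const = np.sum(1.0 / ranks)               # C_K = 1 + 1/2 + ... + 1/K
    combined = const * np.min(K / ranks * p)  # C_K * min_k (K/k) * p_(k)
    return float(min(1.0, combined))          # a p-value cannot exceed 1

# Per-split p-values copied from the log above.
print(hommel_combine([0.03247, 0.01776]))
print(hommel_combine([0.12088, 0.83584]))
```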
[9]:
P_value
[9]:
[0.048711537430423474, 0.3626513098329711]
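The returned list holds one combined p-value per hypothesis. A small sketch of turning it into decisions follows; the significance level `alpha = 0.05` is an assumption chosen to be consistent with the reject/accept messages in the log, not a value stated in this notebook.

```python
alpha = 0.05  # assumed significance level

# Combined p-values as returned above, one per hypothesis.
P_value = [0.048711537430423474, 0.3626513098329711]

decisions = ["reject H0" if p < alpha else "fail to reject H0" for p in P_value]
for k, (p, decision) in enumerate(zip(P_value, decisions)):
    print(f"{k}-th hypothesis: p = {p:.3f}, {decision}")
```

A p-value above the threshold means the data do not provide enough evidence that the tested feature group improves prediction. It does not show that the features are irrelevant.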