File:Traintest.svg

Original file ‎(SVG file, nominally 720 × 270 pixels, file size: 35 KB)

Summary

Description
English: Plots showing a training set and a test set drawn from the same statistical population. Two polynomial curves are fit to the training set: a cubic (green) and a degree-13 polynomial (orange), the latter of which overfits. Plotting both curves against the test data makes the overfitting visible.
Date
Source Own work
Author Skbkekas
Other versions

SVG development
The SVG code is valid.
 
This plot was created with Matplotlib.
Source code

Python code

import numpy as np
import matplotlib.pyplot as plt

m = 0.2  # grid spacing on the x-axis
s = 3    # standard deviation of the noise added to the true curve

def pdesign(X, d):
    """Generate a polynomial design matrix on X of order d."""
    V = X[:,np.newaxis]
    F = [V**k for k in range(d+1)]
    D = np.concatenate(F, axis=1)
    return D

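def regfit(Y, D):
    """Regress Y on D using least squares and return the fitted values."""
    # Thin SVD: with D of full column rank, U is an orthonormal basis of its column space.
    U, S, Vt = np.linalg.svd(D, full_matrices=False)
    # The fitted values are the orthogonal projection of Y onto that space: U U^T Y.
    return np.dot(U, np.dot(np.transpose(U), Y))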

X = np.arange(-2, 2, m, dtype=np.float64)

D1 = pdesign(X, 3)
D2 = pdesign(X, 13)

EY = X + X**3
Y1 = EY + np.random.normal(size=len(X))*s
Y2 = EY + np.random.normal(size=len(X))*s

Yhat1 = regfit(Y1, D1)
Yhat2 = regfit(Y1, D2)

plt.figure(figsize=(8,3))
ax1 = plt.axes([0.06,0.1,0.4,0.8])
plt.title("Training set")
plt.plot(X, Y1, 'o')
plt.plot(X, Yhat1, '-', color='green')
plt.plot(X, Yhat2, '-', color='orange')
ax1.set_ylim(-10, 10)
ax1.set_xticks([-2,-1,0,1,2])
ax2 = plt.axes([0.56,0.1,0.4,0.8])
plt.title("Test set")
plt.plot(X, Y2, 'o')
plt.plot(X, Yhat1, '-', color='green')
plt.plot(X, Yhat2, '-', color='orange')
ax2.set_xticks([-2,-1,0,1,2])
ax2.set_ylim(-10, 10)
plt.savefig("traintest.png")
plt.savefig("traintest.svg")

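# Mean squared error of each fit on the training set
print(((Yhat1-Y1)**2).mean())
print(((Yhat2-Y1)**2).mean())

# Mean squared error of each fit on the test set
print(((Yhat1-Y2)**2).mean())
print(((Yhat2-Y2)**2).mean())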
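
The four values printed at the end of the script are mean squared errors: each fit scored on the training set it was fit to, and then on the independent test set. The following is a minimal sketch of that comparison alone, without the plotting. It assumes NumPy's `Polynomial.fit` in place of the script's SVD-based `regfit`. The helper name `train_test_mse` and the seed `0` are illustrative choices, not part of the original script.

```python
import numpy as np


def train_test_mse(degree, x, y_train, y_test):
    """Fit a polynomial of the given degree to (x, y_train) by least squares
    and return (training MSE, test MSE). Hypothetical helper for illustration."""
    # Polynomial.fit rescales x to [-1, 1] internally, which helps conditioning
    # at higher degrees.
    p = np.polynomial.Polynomial.fit(x, y_train, degree)
    y_hat = p(x)
    return ((y_hat - y_train) ** 2).mean(), ((y_hat - y_test) ** 2).mean()


rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
x = np.arange(-2, 2, 0.2)       # same grid as the script above
ey = x + x**3                   # true curve, as in the script
y_train = ey + 3 * rng.normal(size=len(x))
y_test = ey + 3 * rng.normal(size=len(x))

# Score the cubic and the degree-13 fit on both data sets.
results = {d: train_test_mse(d, x, y_train, y_test) for d in (3, 13)}
for d, (mse_train, mse_test) in results.items():
    print(f"degree {d:2d}: train MSE {mse_train:.2f}, test MSE {mse_test:.2f}")
```

Because the cubic model is nested inside the degree-13 model, the degree-13 fit cannot have a larger training error. Its test error is typically larger, which is the overfitting the figure illustrates, although that is not guaranteed for every random draw.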

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution 3.0 Unported license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

Items portrayed in this file

  • copyright status: copyrighted
  • copyright license: Creative Commons Attribution 3.0 Unported
  • inception: 11 May 2009
  • source of file: original creation by uploader
  • media type: image/svg+xml

File history

  • Date/Time: 04:33, 12 May 2009 (current)
  • Dimensions: 720 × 270 (35 KB)
  • User: Skbkekas
  • Comment: {{Information |Description={{en|1=Plots showing a training set and a test set from the same statistical population. Two curves are fit to the training set, one of which is an overfit. By plotting these curves with the test data, the overfitting can be s

Global file usage

The following other wikis use this file:

  • Usage on ar.wikipedia.org
    • بيانات التدريب والتحقق والاختبار
  • Usage on ca.wikipedia.org
    • Entrenament i validació de conjunts de dades
  • Usage on fa.wikipedia.org
    • آموزش، اعتبارسنجی و مجموعه‌های آزمایشی
  • Usage on fr.wikipedia.org
    • Jeux d'entrainement, de validation et de test
  • Usage on ja.wikipedia.org
    • 利用者:紅い目の女の子/データセット (機械学習)
    • 訓練・検証・テストデータセット
  • Usage on ko.wikipedia.org
    • 사용자:Soongki/연습장/프로젝트1
    • 훈련용 검증용 테스트 데이터 세트
  • Usage on ru.wikipedia.org
    • Обучающий, проверочный и тестовый наборы данных
  • Usage on sr.wikipedia.org
    • Обука, валидација и тестови