
Data Preprocessing: Standardization

넹넹선생님 2024. 4. 9. 07:14

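- Mean of the data: $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$

- Standard deviation of the data: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2}$

- Standardization: $z_i = \frac{x_i - \mu}{\sigma}$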
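
The three quantities listed above can be sketched in plain Python. This is a minimal sketch that assumes the population form of the standard deviation (dividing by n), which is what scikit-learn's StandardScaler uses. The `standardize` helper and the sample heights are hypothetical and made up purely for illustration.

```python
import math


def standardize(values):
    """Return z-scores using the population standard deviation (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    std = math.sqrt(variance)
    return [(x - mean) / std for x in values]


# Hypothetical heights in cm (not taken from the NBA dataset)
heights_cm = [180.0, 190.0, 200.0, 210.0, 220.0]
z_scores = standardize(heights_cm)
print(z_scores)
```

After standardization the values have mean 0 and standard deviation 1, so features measured in different units (cm, kg, years) end up on a comparable scale.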

 

Practice:

from sklearn import preprocessing
import pandas as pd
import numpy as np
    
NBA_FILE_PATH = '../datasets/NBA_player_of_the_week.csv'
# Display floats with only 5 decimal places
pd.set_option('display.float_format', lambda x: '%.5f' % x)
    
nba_player_of_the_week_df = pd.read_csv(NBA_FILE_PATH)
height_weight_age_df = nba_player_of_the_week_df[['Height CM', 'Weight KG', 'Age']]

# Standardize the data (each column gets mean 0 and standard deviation 1)
scaler = preprocessing.StandardScaler()
standardized_data = scaler.fit_transform(height_weight_age_df)
    
standardized_df = pd.DataFrame(standardized_data, columns=['Height', 'Weight', 'Age'])
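
To check what StandardScaler is doing, the sketch below compares its output with a manual (x - mean) / std calculation. It runs on a small made-up DataFrame rather than the NBA CSV, so `sample_df` and its values are assumptions for illustration only. Note that pandas' `std()` defaults to `ddof=1`, so `ddof=0` is passed explicitly to match the scaler.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical sample standing in for the NBA columns; values are made up
sample_df = pd.DataFrame({
    'Height CM': [185.0, 198.0, 201.0, 211.0],
    'Weight KG': [84.0, 95.0, 102.0, 115.0],
    'Age': [22.0, 25.0, 29.0, 33.0],
})

scaler = StandardScaler()
scaled = scaler.fit_transform(sample_df)

# Manual standardization with the population standard deviation (ddof=0)
manual = (sample_df - sample_df.mean()) / sample_df.std(ddof=0)

# True if the scaler output matches the manual calculation
print(np.allclose(scaled, manual.to_numpy()))

# Column-wise mean and standard deviation of the scaled data
print(scaled.mean(axis=0))
print(scaled.std(axis=0))
```

The column means should come out at roughly 0 and the standard deviations at roughly 1, up to floating-point error.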

 
