All posts (34)
statduck
Clustering groups similar observations together. For example, we can cluster customers into types based on several variables describing their consumption. Mathematically, this means grouping similar rows of a data matrix. Clustering usually relies on prototype methods. A prototype method assigns each observation to its closest prototype (centroid, medoid, etc.), where "closest" is defined by Euclidean distance..
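The assignment step of a prototype method can be sketched as follows. This is a minimal illustration, not the post's own code: the function name `assign_to_prototypes` and the toy data are assumptions, and the prototypes are taken as given rather than learned.

```python
import math

def assign_to_prototypes(points, prototypes):
    """Return, for each point, the index of its closest prototype.

    "Closest" means smallest Euclidean distance, as in the text.
    `points` and `prototypes` are sequences of equal-length numeric tuples.
    """
    labels = []
    for p in points:
        # Euclidean distance from this observation to every prototype
        distances = [math.dist(p, c) for c in prototypes]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical toy data: four observations (rows) and two fixed prototypes
points = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (10.0, 10.0)]
prototypes = [(0.0, 0.0), (10.0, 10.0)]
labels = assign_to_prototypes(points, prototypes)
```

Each row of the data matrix receives one label, so rows with the same label form a cluster.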
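https://www.intelligencelabs.tech/ Nexon Intelligence Labs Tech Blog, the official tech blog of Nexon Intelligence Labs. I came across a site full of interesting stories. It describes many cases of how DL and ML can be applied to game analytics, and a lot of them are fascinating: natural language processing used for profanity filtering, analysis of users' facial expressions during gameplay, explainable boosting models used for analysis, Fourier-based analysis for abuse detection, and more. Both profanity filtering and abuse detection, as a useful aid for the staff who actually handle filtering and abuse..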
Kernel method A kernel method estimates a function using a kernel. It estimates the value of the function at x from the data points near x. Nearness is defined by distance, so each point's weight depends on its distance from x. KNN as a kernel method KNN can be viewed as a kernel method because it computes distances to choose the k nearest points. $$ \hat{f}(x)=\mathrm{Ave}(y_i \mid x_i \in N_k(x)) $$ ..
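The KNN estimate above, the average of the responses whose inputs fall in the neighborhood N_k(x), can be sketched for one-dimensional inputs like this. The function name `knn_regress` and the sample data are assumptions made for illustration.

```python
def knn_regress(x0, xs, ys, k):
    """Estimate f(x0) as the average of y_i over the k nearest x_i.

    Distance is |x_i - x0|. Every neighbor gets equal weight, which
    corresponds to a kernel that is constant inside N_k(x0) and zero outside.
    """
    # Sort the (x_i, y_i) pairs by distance to x0 and keep the k closest
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Hypothetical sample: the point at x = 10 is far away and is ignored for k = 3
xs = [1.0, 2.0, 3.0, 10.0]
ys = [1.0, 2.0, 3.0, 10.0]
estimate = knn_regress(2.1, xs, ys, k=3)
```

Replacing the equal weights with weights that decay with distance would give a smoother kernel estimate.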
Smoothing Splines ✏️ Smoothing Spline A smoothing spline avoids the knot selection problem completely by using a maximal set of knots. $$ RSS(f,\lambda)=\sum^N_{i=1}\{y_i-f(x_i)\}^2+\lambda\int \{f''(t)\}^2dt $$ Our goal is to find the function $f$ that minimizes this penalized RSS. The penalty term measures curvature, defined as follows: $$ r = (x,y), \|r'\|=\sqrt{x'(s)^2+y'(s)^2}=1 \\ T(s)=(x'(s),y'(s)) = \text{unit tangent vector} \\ \kappa(s)..
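To make the penalized criterion concrete, the sketch below evaluates RSS(f, λ) numerically for a candidate function. It does not fit a smoothing spline. The function name `penalized_rss`, the integration interval `[a, b]`, and the finite-difference and trapezoid approximations are all assumptions chosen for illustration.

```python
def penalized_rss(f, xs, ys, lam, a, b, n_grid=200, h=1e-3):
    """Approximate sum (y_i - f(x_i))^2 + lam * integral_a^b f''(t)^2 dt."""
    # Goodness-of-fit term: residual sum of squares at the data points
    fit = sum((y - f(x)) ** 2 for x, y in zip(xs, ys))

    # Second derivative by a central finite difference
    def second_derivative(t):
        return (f(t + h) - 2.0 * f(t) + f(t - h)) / h ** 2

    # Roughness term: trapezoid rule on an evenly spaced grid over [a, b]
    step = (b - a) / n_grid
    values = [second_derivative(a + i * step) ** 2 for i in range(n_grid + 1)]
    roughness = step * (sum(values) - 0.5 * (values[0] + values[-1]))
    return fit + lam * roughness

# A straight line has no curvature, so only the fit term remains
line_score = penalized_rss(lambda t: 2.0 * t + 1.0, [0.0, 1.0], [1.0, 3.5], 10.0, 0.0, 1.0)
# f(t) = t^2 has f'' = 2, so the roughness integral over [0, 1] is 4
curve_score = penalized_rss(lambda t: t * t, [0.0, 1.0], [0.0, 1.0], 0.5, 0.0, 1.0)
```

A larger λ makes curvature more expensive and pushes the minimizer toward a straight line, while λ = 0 leaves only the fit term.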
Basis Expansions & Regularization We cannot assume that the true function is linear. To handle nonlinearity, we can use transformations of X instead of the original X. Basis Expansions and Regularization $$ f(X)=\sum^M_{m=1}\beta_mh_m(X) $$ The model $f(X)$ is linear in the basis functions $h_m$ even though each $h_m(X)$ may be nonlinear in $X$. Forms: $h_m(X)=X_m$ (basic linear model); $h_m(X)=X_j^2$ or $h_m(X)=X_jX_k$ (polynomial model); $h_m..
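The expansion above can be sketched as a list of basis functions combined linearly. The function name `evaluate_expansion`, the particular basis, and the coefficient values are hypothetical; in practice the coefficients would be estimated from data.

```python
def evaluate_expansion(X, betas, basis):
    """Compute f(X) = sum_m beta_m * h_m(X) for one input vector X."""
    return sum(beta * h(X) for beta, h in zip(betas, basis))

# Basis functions h_m for a two-dimensional input X = (X_1, X_2)
basis = [
    lambda X: X[0],         # h(X) = X_1      (linear term)
    lambda X: X[1],         # h(X) = X_2      (linear term)
    lambda X: X[0] ** 2,    # h(X) = X_1^2    (polynomial term)
    lambda X: X[0] * X[1],  # h(X) = X_1 X_2  (interaction term)
]
betas = [1.0, 2.0, 3.0, 4.0]  # assumed coefficients, not fitted values

value = evaluate_expansion((2.0, 3.0), betas, basis)
```

Because `betas` enters only through a weighted sum, fitting it is still an ordinary linear problem on the transformed features.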
Reference: https://www.w3schools.com/sql/default.asp The statements below use MySQL syntax. Select specific columns from the table: SELECT col1, col2 FROM table_name; Select all columns from the table (* means all): SELECT * FROM table_name; Select only distinct values of the chosen columns: SELECT DISTINCT col1, col2 FROM table_name; Filter records based on a condit..
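The three SELECT forms can be tried out with Python's built-in `sqlite3` module. Note the assumption: this runs against an in-memory SQLite database rather than MySQL, which is fine here because these statements are standard SQL. The third column `col3` and the sample rows are made up for the demonstration.

```python
import sqlite3

# In-memory SQLite database standing in for a MySQL server (assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (col1 TEXT, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [("a", 1, "x"), ("a", 1, "y"), ("b", 2, "z")],
)

# Specific columns: one (col1, col2) tuple per row, duplicates included
selected = conn.execute("SELECT col1, col2 FROM table_name").fetchall()

# All columns: each row comes back with col1, col2 and col3
everything = conn.execute("SELECT * FROM table_name").fetchall()

# Distinct: duplicate (col1, col2) combinations collapse into one row
distinct = conn.execute("SELECT DISTINCT col1, col2 FROM table_name").fetchall()

conn.close()
```

`DISTINCT` applies to the combination of the listed columns, so `("a", 1)` appears once even though two rows contain it.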
The basic assumption is that a significant portion of the rows and columns of the data matrix are highly correlated. Highly correlated data can be explained well by a small number of columns, so a low-rank matrix is useful for estimating the matrix. The low-rank approach reduces dimensionality by rotating the axis system so that pairwise correlations between dimensions are removed. Low Rank Approach ✏️ $R$ has a rank $k
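The idea of rotating the axes so that correlations vanish can be illustrated in two dimensions. The sketch below is only an illustration under assumptions: the function names and sample data are made up, and the closed-form rotation angle applies to the 2D case only, whereas general matrices would need an eigendecomposition or SVD.

```python
import math

def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def decorrelate_2d(xs, ys):
    """Center the data and rotate the axes so the new coordinates are uncorrelated."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cx = [x - mx for x in xs]
    cy = [y - my for y in ys]
    var_x, var_y, cov_xy = covariance(cx, cx), covariance(cy, cy), covariance(cx, cy)
    # Angle that diagonalizes the 2x2 covariance matrix
    theta = 0.5 * math.atan2(2.0 * cov_xy, var_x - var_y)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    us = [x * cos_t + y * sin_t for x, y in zip(cx, cy)]    # major axis
    vs = [-x * sin_t + y * cos_t for x, y in zip(cx, cy)]   # minor axis
    return us, vs

# Two highly correlated columns (hypothetical data, roughly y = 2x)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
us, vs = decorrelate_2d(xs, ys)
```

After the rotation almost all of the variance lies along `us`, so keeping that single coordinate is a rank-1 summary of the two original columns.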
Reference: Charu C. Aggarwal. Recommender Systems: The Textbook. Springer Neighborhood-Based method 1. Introduction Motivation This algorithm assumes that similar users show similar rating patterns. It can be divided into two methods. User-based collaborative filtering (CF): The predicted ratings of user A are computed from the ratings of A's peer group. Item-based collaborative filtering (CF): The..
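A user-based prediction can be sketched as a similarity-weighted average over the peer group. This is a simplified illustration and not necessarily the textbook's exact formulation: cosine similarity over co-rated items, the top-k peer selection, the function names, and the ratings are all assumptions.

```python
import math

def cosine_similarity(ra, rb):
    """Cosine similarity between two users, using only items both have rated."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    dot = sum(ra[i] * rb[i] for i in common)
    norm_a = math.sqrt(sum(ra[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(rb[i] ** 2 for i in common))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_user_based(ratings, user, item, k=2):
    """Predict `user`'s rating of `item` from the k most similar users who rated it."""
    peers = [
        (cosine_similarity(ratings[user], other), other[item])
        for name, other in ratings.items()
        if name != user and item in other
    ]
    peers = sorted(peers, reverse=True)[:k]  # the peer group
    total = sum(sim for sim, _ in peers)
    if total <= 0:
        return None  # no usable peers
    return sum(sim * rating for sim, rating in peers) / total

# Hypothetical ratings: user -> {item: rating}
ratings = {
    "A": {"i1": 5, "i2": 3},
    "B": {"i1": 5, "i2": 3, "i3": 4},
    "C": {"i1": 1, "i2": 5, "i3": 2},
}
prediction = predict_user_based(ratings, "A", "i3", k=2)
```

User B rates exactly like A on the shared items, so B's rating of `i3` dominates the prediction, which lands closer to 4 than to C's rating of 2.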