A Study on the Fisher Information Matrix of Five-Number Summary in Normal Population

이광진

doi:10.37727/jkdas.2022.24.3.955

In meta-analysis, most individual studies report their results in the form {sample mean, sample standard deviation; sample size}. This information is the basis for estimating a parameter and for its standard error, confidence interval, and hypothesis tests. Some studies, however, report only the five-number summary {minimum, first quartile, median, third quartile, maximum; sample size}. To make full use of the available meta-information, the reported information must first be unified, usually into the form {sample mean, sample standard deviation; sample size}. Several previous studies have proposed methods for estimating the population mean and population standard deviation from a five-number summary under the assumption of a normal population. Among the studies comparing these methods, Lee (2022) recently proposed maximum likelihood estimation and showed its superiority, but did not examine the properties of the standard errors of the resulting estimates. In this study, in order to estimate the standard error of the maximum likelihood estimate (MLE) based on the five-number summary, we first derive the log-likelihood function, the score function, and the Fisher information matrix (FIM) algebraically under the assumption of a normal population, and then write R functions that compute them easily. Using these functions, we carry out a simulation study to characterize the estimated standard errors of the MLE. In addition, we compute the observed FIM of the five-number summary, compare it with the FIM of the sufficient statistic {sample mean, sample standard deviation; sample size}, and examine the relative efficiency.
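The kind of computation described above can be illustrated with a minimal sketch. It is not the paper's R code, and it rests on assumptions the abstract does not state: the sample size is taken to be n = 4Q + 1, so that the five-number summary coincides with the order statistics of ranks 1, Q+1, 2Q+1, 3Q+1, and n; the score and observed information are obtained by finite differences rather than by the algebraic derivation; and the function names (`loglik`, `score`, `observed_information`, `fit_mle`) and the example numbers are hypothetical. Python is used here only for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def loglik(mu, sigma, summary, n):
    """Log-likelihood of a five-number summary, up to an additive constant.

    ASSUMPTION: n = 4Q + 1, so the summary equals the order statistics of
    ranks 1, Q+1, 2Q+1, 3Q+1, n, with Q-1 unobserved points in each gap.
    """
    q = (n - 1) // 4
    z = [(x - mu) / sigma for x in summary]
    ll = sum(-math.log(sigma) - 0.5 * zi * zi for zi in z)  # five density terms
    for a, b in zip(z, z[1:]):                               # four gap terms
        ll += (q - 1) * math.log(norm_cdf(b) - norm_cdf(a))
    return ll

def score(mu, sigma, summary, n, h=1e-4):
    """Score vector by central differences (numerical, not algebraic)."""
    d_mu = (loglik(mu + h, sigma, summary, n)
            - loglik(mu - h, sigma, summary, n)) / (2 * h)
    d_sigma = (loglik(mu, sigma + h, summary, n)
               - loglik(mu, sigma - h, summary, n)) / (2 * h)
    return d_mu, d_sigma

def observed_information(mu, sigma, summary, n, h=1e-4):
    """Observed information: minus the Hessian of loglik, by central differences."""
    f = lambda m, s: loglik(m, s, summary, n)
    f0 = f(mu, sigma)
    h_mm = (f(mu + h, sigma) - 2 * f0 + f(mu - h, sigma)) / h**2
    h_ss = (f(mu, sigma + h) - 2 * f0 + f(mu, sigma - h)) / h**2
    h_ms = (f(mu + h, sigma + h) - f(mu + h, sigma - h)
            - f(mu - h, sigma + h) + f(mu - h, sigma - h)) / (4 * h**2)
    return [[-h_mm, -h_ms], [-h_ms, -h_ss]]

def fit_mle(summary, n, max_iter=50, tol=1e-8):
    """Newton-Raphson with step halving, started from the median and IQR / 1.349."""
    mu, sigma = summary[2], (summary[3] - summary[1]) / 1.349
    for _ in range(max_iter):
        g = score(mu, sigma, summary, n)
        (a, b), (c, d) = observed_information(mu, sigma, summary, n)
        det = a * d - b * c
        step_mu = (d * g[0] - b * g[1]) / det      # Newton direction: info^{-1} * score
        step_sigma = (a * g[1] - c * g[0]) / det
        t, base = 1.0, loglik(mu, sigma, summary, n)
        # Halve the step while it leaves sigma > 0 or lowers the log-likelihood.
        while t > 1e-6 and (
            sigma + t * step_sigma <= 0
            or loglik(mu + t * step_mu, sigma + t * step_sigma, summary, n) < base - 1e-12
        ):
            t /= 2.0
        mu, sigma = mu + t * step_mu, sigma + t * step_sigma
        if max(abs(t * step_mu), abs(t * step_sigma)) < tol:
            break
    return mu, sigma

summary = (-2.2, -0.65, 0.0, 0.65, 2.2)  # illustrative values, not from the paper
n = 41                                    # n = 4Q + 1 with Q = 10
mu_hat, sigma_hat = fit_mle(summary, n)
info = observed_information(mu_hat, sigma_hat, summary, n)
det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
se_mu = math.sqrt(info[1][1] / det)       # sqrt of diagonal of the inverse information
se_sigma = math.sqrt(info[0][0] / det)
# Relative efficiency against the complete-sample information diag(n, 2n) / sigma^2
eff_mu = info[0][0] / (n / sigma_hat**2)
eff_sigma = info[1][1] / (2 * n / sigma_hat**2)
print(mu_hat, sigma_hat, se_mu, se_sigma, eff_mu, eff_sigma)
```

The standard errors are read off the inverse of the observed information evaluated at the MLE, and the two ratios compare its diagonal with the information carried by the complete normal sample, which is the sense of relative efficiency used in this sketch. For sample sizes not of the form 4Q + 1, the quartiles are not single order statistics and the likelihood would need a different treatment.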
