Sample mean and sample covariance

Brian Blank (talk | contribs)
WikiBot (talk | contribs)
Latest revision as of 15:51, 20 August 2012

Overview

Sample mean and sample covariance are statistics computed from a collection of data that is treated as a random sample.

Sample mean and covariance

Given a random sample <math>\textstyle \mathbf{x}_{1},\ldots,\mathbf{x}_{N}</math> from an <math>\textstyle n</math>-dimensional random variable <math>\textstyle \mathbf{X}</math> (i.e., realizations of <math>\textstyle N</math> independent random variables with the same distribution as <math>\textstyle \mathbf{X}</math>), the sample mean is

<math> \mathbf{\bar{x}}=\frac{1}{N}\sum_{k=1}^{N}\mathbf{x}_{k}. </math>

In coordinates, writing the vectors as columns,

<math> \mathbf{x}_{k}=\left[ \begin{array} [c]{c}x_{1k}\\ \vdots\\ x_{nk}\end{array} \right] ,\quad\mathbf{\bar{x}}=\left[ \begin{array} [c]{c}\bar{x}_{1}\\ \vdots\\ \bar{x}_{n}\end{array} \right] , </math>

the entries of the sample mean are

<math> \bar{x}_{i}=\frac{1}{N}\sum_{k=1}^{N}x_{ik},\quad i=1,\ldots,n. </math>

The sample covariance of <math>\textstyle \mathbf{x}_{1},\ldots,\mathbf{x}_{N}</math> is the <math>\textstyle n</math> by <math>\textstyle n</math> matrix <math>\textstyle \mathbf{Q}=\left[ q_{ij}\right] </math> with the entries given by

<math> q_{ij}=\frac{1}{N-1}\sum_{k=1}^{N}\left( x_{ik}-\bar{x}_{i}\right) \left( x_{jk}-\bar{x}_{j}\right) . </math>
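
The two definitions above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `sample_mean` and `sample_covariance` and the example data are assumptions made here, and each sample vector is stored as a list of length n.

```python
def sample_mean(samples):
    """Entrywise mean of N vectors, each a list of length n."""
    N = len(samples)
    n = len(samples[0])
    return [sum(x[i] for x in samples) / N for i in range(n)]


def sample_covariance(samples):
    """n-by-n sample covariance matrix, using the N - 1 denominator."""
    N = len(samples)
    if N < 2:
        raise ValueError("need at least two samples")
    n = len(samples[0])
    mean = sample_mean(samples)
    return [
        [
            sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / (N - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical data: N = 3 samples of an n = 2 dimensional variable.
samples = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
mean = sample_mean(samples)      # [3.0, 6.0]
Q = sample_covariance(samples)   # [[4.0, 8.0], [8.0, 16.0]]
```

In this example the second coordinate is exactly twice the first, so the off-diagonal entry is twice the first variance and the matrix is singular.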

The sample mean and the sample covariance matrix are unbiased estimates of the mean and the covariance matrix of the random variable <math>\textstyle \mathbf{X}</math>. The sample covariance matrix has <math>\textstyle N-1</math> in the denominator rather than <math>\textstyle N</math> essentially because the mean is not known and is replaced by the sample mean <math>\textstyle\mathbf{\bar{x}}</math>. Because the sample mean is computed from the same data, the deviations from it are on average smaller than the deviations from the true mean. If the mean <math>\textstyle\mathbf{\bar{X}}</math> is known, the analogous unbiased estimate

<math> q_{ij}=\frac{1}{N}\sum_{k=1}^{N}\left( x_{ik}-\bar{X}_{i}\right) \left( x_{jk}-\bar{X}_{j}\right) </math>

with the exact mean does have <math>\textstyle N</math> in the denominator. This is an example of why, in probability and statistics, it is essential to distinguish between upper case letters (random variables) and lower case letters (realizations of the random variables).
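
The role of the denominator can be checked exactly on a small case. The sketch below assumes a hypothetical scalar variable that takes the values 0 and 1 with probability 1/2 each (true mean 1/2, true variance 1/4), enumerates every equally likely sample of size N = 3, and averages each estimate over all of them. The helper names are chosen here for illustration only.

```python
from fractions import Fraction
from itertools import product

values = [0, 1]               # assumed distribution: each value has probability 1/2
true_mean = Fraction(1, 2)
true_var = Fraction(1, 4)
N = 3


def average_over_all_samples(estimator):
    """Exact expectation of an estimator over all equally likely samples."""
    outcomes = list(product(values, repeat=N))
    return sum(estimator(xs) for xs in outcomes) / len(outcomes)


def with_sample_mean(xs, denominator):
    """Sum of squared deviations from the sample mean, over a chosen denominator."""
    xbar = Fraction(sum(xs), len(xs))
    return sum((x - xbar) ** 2 for x in xs) / denominator


def with_known_mean(xs):
    """Sum of squared deviations from the true mean, divided by N."""
    return sum((x - true_mean) ** 2 for x in xs) / len(xs)


expected_n_minus_1 = average_over_all_samples(lambda xs: with_sample_mean(xs, N - 1))
expected_n = average_over_all_samples(lambda xs: with_sample_mean(xs, N))
expected_known = average_over_all_samples(with_known_mean)
```

Under these assumptions `expected_n_minus_1` and `expected_known` both equal the true variance 1/4, while `expected_n` equals 1/6, which is (N-1)/N times the true variance.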

The maximum likelihood estimate of the covariance

<math> q_{ij}=\frac{1}{N}\sum_{k=1}^{N}\left( x_{ik}-\bar{x}_{i}\right) \left( x_{jk}-\bar{x}_{j}\right) </math>

for the Gaussian distribution case has <math>\textstyle N</math> in the denominator as well. It equals the sample covariance multiplied by <math>\textstyle (N-1)/N</math>, so the difference diminishes for large <math>\textstyle N</math>.

Weighted samples

In a weighted sample, each vector <math>\textstyle \mathbf{x}_{k}</math> is assigned a weight <math>\textstyle w_{k}\geq0</math>. Without loss of generality, assume that the weights are normalized:

<math> \sum_{k=1}^{N}w_{k}=1. </math>

(If they are not, divide the weights by their sum.) Then the weighted mean <math>\textstyle \mathbf{\bar{x}}</math> and the weighted covariance matrix <math>\textstyle \mathbf{Q}=\left[ q_{ij}\right] </math> are given by

<math> \mathbf{\bar{x}}=\sum_{k=1}^{N}w_{k}\mathbf{x}_{k} </math>

and [1]

<math> q_{ij}=\frac{\sum_{k=1}^{N}w_{k}\left( x_{ik}-\bar{x}_{i}\right) \left( x_{jk}-\bar{x}_{j}\right) }{1-\sum_{k=1}^{N}w_{k}^{2}}. </math>

If all weights are the same, <math>\textstyle w_{k}=1/N</math>, the weighted mean and covariance reduce to the sample mean and covariance above.
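
The weighted formulas can be sketched in plain Python as follows. The function name `weighted_mean_and_covariance` and the example data are assumptions made for illustration; this is not the GSL implementation cited above. The sketch normalizes the weights itself and rejects the degenerate case where a single sample carries all the weight, since the denominator is then zero.

```python
def weighted_mean_and_covariance(samples, weights):
    """Weighted mean and covariance of N vectors, each a list of length n.

    Weights must be nonnegative; they are normalized here to sum to 1.
    """
    total = sum(weights)
    if total <= 0 or any(wk < 0 for wk in weights):
        raise ValueError("weights must be nonnegative with a positive sum")
    w = [wk / total for wk in weights]
    n = len(samples[0])
    mean = [sum(wk * x[i] for wk, x in zip(w, samples)) for i in range(n)]
    denom = 1.0 - sum(wk * wk for wk in w)
    if denom <= 0:
        raise ValueError("undefined when one sample carries all the weight")
    cov = [
        [
            sum(wk * (x[i] - mean[i]) * (x[j] - mean[j])
                for wk, x in zip(w, samples)) / denom
            for j in range(n)
        ]
        for i in range(n)
    ]
    return mean, cov


# Hypothetical data with equal, unnormalized weights.
samples = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
mean, cov = weighted_mean_and_covariance(samples, [2.0, 2.0, 2.0])
```

With equal weights the result agrees, up to floating-point rounding, with the unweighted sample mean (3, 6) and the sample covariance matrix with entries 4, 8, 8, 16 for the same data.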

References

  1. Mark Galassi, Jim Davies, James Theiler, Brian Gough, Gerard Jungman, Michael Booth, and Fabrice Rossi. GNU Scientific Library - Reference manual, Version 1.9, 2007. Sec. 20.6 Weighted Samples

See also

Weighted mean