In statistics and quantitative research methodology, a data sample is a set of data collected and/or selected from a statistical population by a defined procedure.^{[1]}
Typically, the population is very large, making a census or a complete enumeration of all the values in the population impractical or impossible. The sample usually represents a subset of manageable size. Samples are collected and statistics are calculated from the samples so that one can make inferences or extrapolations from the sample to the population. This process of collecting information from a sample is referred to as sampling. The data sample may be drawn from a population without replacement, in which case it is a subset of a population; or with replacement, in which case it is a multisubset.^{[2]}
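The difference between drawing with and without replacement can be sketched in a few lines of Python. This is only an illustration: the ten-unit `population` and the fixed seed are assumptions made for the example, not part of the definition above.

```python
import random

# Hypothetical population of 10 labelled units (an assumption for illustration).
population = list(range(10))
rng = random.Random(0)  # fixed seed so the run is repeatable

# Without replacement: every unit appears at most once, so the result is a subset.
without_replacement = rng.sample(population, k=5)

# With replacement: a unit may be drawn more than once, so the result is a multiset.
with_replacement = rng.choices(population, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

`random.sample` never repeats a unit, whereas `random.choices` may return the same unit several times.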
Kinds of samples
A complete sample is a set of objects from a parent population that includes all such objects that satisfy a set of well-defined selection criteria.^{[3]} For example, a complete sample of Australian men taller than 2 m would consist of a list of every Australian male taller than 2 m. It would not include German males, tall Australian females, or people shorter than 2 m. Compiling such a complete sample therefore requires a complete list of the parent population, including data on height, gender, and nationality for each member of that parent population. In the case of human populations, such a complete list is unlikely to exist, but complete samples are often available in other disciplines, such as complete magnitude-limited samples of astronomical objects.
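As a rough sketch, a complete sample amounts to applying the selection criteria to every member of the parent population and keeping all matches. The `Person` record, the five-person `parent_population`, and the `meets_criteria` helper below are hypothetical and exist only to mirror the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    nationality: str
    sex: str
    height_m: float


# Hypothetical parent population (invented records, for illustration only).
parent_population = [
    Person("A", "Australian", "male", 2.05),
    Person("B", "Australian", "male", 1.80),
    Person("C", "German", "male", 2.10),
    Person("D", "Australian", "female", 2.02),
    Person("E", "Australian", "male", 2.01),
]


def meets_criteria(p: Person) -> bool:
    """Selection criteria: Australian, male, taller than 2 m."""
    return p.nationality == "Australian" and p.sex == "male" and p.height_m > 2.0


# A complete sample keeps every member that satisfies the criteria.
complete_sample = [p for p in parent_population if meets_criteria(p)]
print([p.name for p in complete_sample])  # → ['A', 'E']
```

The filter can only be applied because every record carries the height, sex, and nationality fields, which is the requirement stated above.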
An unbiased (representative) sample is a set of objects chosen from a complete sample using a selection process that does not depend on the properties of the objects.^{[4]} For example, an unbiased sample of Australian men taller than 2 m might consist of a randomly sampled subset of 1% of Australian males taller than 2 m. A sample chosen from the electoral register, however, might be biased since, for example, males aged under 18 will not be on the electoral register. In an astronomical context, an unbiased sample might consist of that fraction of a complete sample for which data are available, provided the data availability is not biased by individual source properties.
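The contrast between the two selection processes can be sketched as follows. The 1,000 synthetic records, the age range, and the seed are assumptions for illustration; the only point is that the first selection ignores the records' properties while the second depends on age.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical complete sample: 1,000 records with an id and an age.
complete_sample = [{"id": i, "age": rng.randint(15, 80)} for i in range(1000)]
one_percent = len(complete_sample) // 100

# Unbiased: the selection does not look at any property of the records.
unbiased = rng.sample(complete_sample, k=one_percent)

# Biased: selecting from an "electoral register" excludes everyone under 18,
# so the selection depends on a property (age) of the records.
register = [p for p in complete_sample if p["age"] >= 18]
biased = rng.sample(register, k=one_percent)

print("unbiased ages:", sorted(p["age"] for p in unbiased))
print("biased ages:  ", sorted(p["age"] for p in biased))
```

No record under 18 can ever appear in `biased`, however many times the draw is repeated, which is what makes that selection unrepresentative of the complete sample.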
The best way to avoid a biased or unrepresentative sample is to select a random sample, also known as a probability sample. A random sample is defined as a sample where each individual member of the population has a known, nonzero chance of being selected as part of the sample.^{[5]} Types of random samples include simple random samples, systematic samples, stratified random samples, and cluster random samples.
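The four designs named above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a survey-grade implementation: the 100-unit `population`, the two strata, the cluster size, and all function names are hypothetical, and the stratified version uses proportional allocation, which is only one of several possible allocation rules.

```python
import random
from collections import defaultdict

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: 100 units, 60 in stratum "A" and 40 in stratum "B".
population = [{"id": i, "stratum": "A" if i < 60 else "B"} for i in range(100)]
n = 10


def simple_random_sample(pop, n, rng):
    """Every subset of size n is equally likely."""
    return rng.sample(pop, n)


def systematic_sample(pop, n, rng):
    """Random start, then every step-th unit (assumes len(pop) is a multiple of n)."""
    step = len(pop) // n
    start = rng.randrange(step)
    return pop[start::step][:n]


def stratified_sample(pop, n, rng, key):
    """Simple random sample within each stratum, sized proportionally."""
    strata = defaultdict(list)
    for unit in pop:
        strata[key(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(pop))
        sample.extend(rng.sample(members, k))
    return sample


def cluster_sample(pop, cluster_size, n_clusters, rng):
    """Split into consecutive clusters, pick whole clusters at random."""
    clusters = [pop[i:i + cluster_size] for i in range(0, len(pop), cluster_size)]
    chosen = rng.sample(clusters, n_clusters)
    return [unit for cluster in chosen for unit in cluster]


srs = simple_random_sample(population, n, rng)
systematic = systematic_sample(population, n, rng)
stratified = stratified_sample(population, n, rng, key=lambda u: u["stratum"])
clustered = cluster_sample(population, cluster_size=10, n_clusters=2, rng=rng)

for label, s in [("simple", srs), ("systematic", systematic),
                 ("stratified", stratified), ("cluster", clustered)]:
    print(label, sorted(u["id"] for u in s))
```

In each design of this sketch, every unit has a known, nonzero selection probability (1/10 for the first three, 2/10 for the cluster design), which is the defining property of a probability sample.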
A sample that is not random is called a nonrandom sample or a nonprobability sample.^{[6]} Some examples of nonrandom samples are convenience samples, judgment samples, purposive samples, quota samples, snowball samples, and quadrature nodes in quasi-Monte Carlo methods.
Statistical samples have many uses and arise in a wide range of situations.
Mathematical description of random sample
In mathematical terms, given a random variable X with distribution F, a random sample of length n (where n may be any positive integer 1, 2, 3, ...) is a set of n independent, identically distributed (iid) random variables X_1, ..., X_n, each with distribution F.^{[7]}
A sample concretely represents n experiments in which the same quantity is measured. For example, if X represents the height of an individual and n individuals are measured, X_i will be the height of the ith individual. Note that a sample of random variables (i.e. a set of measurable functions) must not be confused with the realizations of these variables (which are the values that these random variables take, formally called random variates). In other words, X_i is a function representing the measurement at the ith experiment and x_i=X_i(\omega) is the value actually obtained when making the measurement.
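The distinction between the random variables X_i and their realizations x_i can be made concrete with a small sketch. It assumes, purely for illustration, that an outcome omega is a list of n heights drawn independently from a normal distribution with mean 170 cm and standard deviation 10 cm, and that X_i simply reads off the i-th coordinate of omega; none of these choices comes from the text above.

```python
import random


def make_sample(n):
    """Return the functions X_1, ..., X_n; each maps an outcome omega to a number."""
    def make_X(i):
        # X_i is a function: it reads the i-th measurement out of the outcome.
        return lambda omega: omega[i]
    return [make_X(i) for i in range(n)]


n = 5
X = make_sample(n)  # the sample: a list of functions, no data yet

rng = random.Random(7)  # fixed seed so the run is repeatable
# One outcome omega: the heights of n individuals, drawn iid from the assumed F.
omega = [rng.gauss(170, 10) for _ in range(n)]

# The realizations x_i = X_i(omega): plain numbers obtained from one experiment.
x = [X_i(omega) for X_i in X]
print(x)
```

Here `X` exists before any measurement is made, while `x` only exists once a particular outcome `omega` has been observed.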
The concept of a sample thus includes the process of how the data are obtained (that is, the random variables). This is necessary so that mathematical statements can be made about the sample and statistics computed from it, such as the sample mean and covariance.
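For reference, the two statistics mentioned above can be written out as follows. The second quantity Y_i, measured in the same n experiments, and the n - 1 normalization (which assumes n >= 2) are assumptions introduced here for illustration; other normalizations are also in use.

```latex
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i ,
\qquad
\bar{x} = \bar{X}(\omega) = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
S_{XY} = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)
```

The sample mean \bar{X} is itself a random variable, being a function of X_1, ..., X_n, and \bar{x} is its realization for a particular outcome \omega.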
Notes

^ Peck, Roxy; Chris Olsen; Jay L. Devore (2008). Introduction to Statistics and Data Analysis (3 ed.). Cengage Learning.

^ Borzyszkowski, Andrzej M.; Sokołowski, Stefan, eds. (1993), Mathematical Foundations of Computer Science 1993. 18th International Symposium, MFCS'93 Gdańsk, Poland, August 30–September 3, 1993 Proceedings, Lecture Notes in Computer Science 711, pp. 281–290,

^ Pratt, J. W., Raiffa, H. and Schlaifer, R. (1995). Introduction to Statistical Decision Theory. MIT Press, Cambridge, MA. MR1326829

^ Lomax, R. G. and Hahs-Vaughan, Debbie L. An Introduction to Statistical Concepts (3rd ed.).

^

^

^ Samuel S. Wilks, Mathematical Statistics, John Wiley, 1962, Section 8.1
External links

Statistical Terms Made Simple