Definitions of Statistics
Statistics has been defined by various statisticians.
‘Statistics is the science of counting’ - A. L. Bowley
‘Statistics is the science which deals with the collection, presentation, analysis and interpretation of numerical data’ - Croxton and Cowden
Wallis and Roberts define statistics as follows: “Statistics is a body of methods for making decisions in the face of uncertainty”
Ya-Lun Chou slightly modifies Wallis and Roberts' definition and arrives at the following: “Statistics is a method of decision making in the face of uncertainty on the basis of numerical data and calculated risk.”
It may be seen that most of the above definitions of statistics are restricted to numerical measurements of facts and figures of a state. But modern thinkers like Secrist define statistics as follows:
‘By statistics we mean the aggregate of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other’.
Among them, the definition by Croxton and Cowden is considered the most preferable one because of its comprehensiveness. It is clear from this definition that statistics has the following characteristics.
Statistics deals with aggregates of facts and figures. A single number cannot be called statistics. For example, the weight of one person, say 65 kg, is not statistics, but the weights of a class of 60 persons are statistics, since they can be studied together and meaningful comparisons can be made among them. This recalls the well-known quote attributed to Joseph Stalin, “One death is a tragedy; a million is a statistic.” Further, the purpose for which the data are collected must be made clear; otherwise the whole exercise will be futile. The data must also be collected in a systematic way and not haphazardly.
Statistical data should be affected by various factors acting at the same time. This helps the statistician to identify the factors that influence the data. For example, the sales of commodities in the market are affected by causes such as supply, demand, import quality, etc. Similarly, as mentioned earlier, if a million deaths occur, policy makers will act immediately to find the causes of these deaths so that such events do not recur.
Statistical facts and figures are collected in numerical form so that meaningful inferences can be drawn. For instance, the service provided by a telephone company may be classified as poor, average, good, very good and excellent. These categories are qualitative in nature and cannot by themselves be called statistics. They should be expressed numerically, such as 0 to denote poor, 1 for average, 2 for good, 3 for very good and 4 for excellent. The coded data can then be regarded as statistics and are suitable for analysis. Other qualitative characteristics, such as honesty, beauty, intelligence or defectiveness, which cannot be measured directly, cannot be called statistics as they stand. They must be suitably expressed as numbers before they can be called statistics.
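The coding of qualitative ratings into numbers can be illustrated with a short Python sketch. The consecutive 0 to 4 scale, the name RATING_CODES, the function encode_ratings and the sample responses are all assumptions made for illustration; they do not come from any actual survey.

```python
from collections import Counter
from statistics import median

# Hypothetical coding scheme: consecutive integers preserve the order
# of the categories from poor (lowest) to excellent (highest).
RATING_CODES = {
    "poor": 0,
    "average": 1,
    "good": 2,
    "very good": 3,
    "excellent": 4,
}

def encode_ratings(ratings):
    """Convert qualitative ratings into their numerical codes."""
    return [RATING_CODES[r.strip().lower()] for r in ratings]

# Illustrative (made-up) responses from seven customers.
responses = ["good", "poor", "excellent", "good", "very good", "average", "good"]

codes = encode_ratings(responses)
counts = Counter(codes)        # how many customers gave each code
median_code = median(codes)    # middle rating on the coded scale

print(codes)
print(dict(counts))
print(median_code)
```

Because the codes only record the order of the categories, summaries based on counts and the median are safer than the arithmetic mean, which would treat the gaps between categories as equal.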
Numerical data are collected by counting, measuring or estimating. For example, to find the number of patients admitted to a hospital, data are collected by actual counting; to assess obesity among patients, data are collected by actual measurement of height and weight. In a large-scale study such as crop estimation, data are obtained by estimation using sampling techniques, because actual counting may not be possible. Even where it is possible, complete measurement involves more time and cost. The estimated figures may not be perfectly accurate and precise. However, a certain degree of accuracy has to be maintained for a meaningful analysis.
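Estimation by sampling can be sketched in Python as follows. This is a minimal illustration that assumes simple random sampling of plots without replacement; the function name estimate_total, the number of plots and the yields are hypothetical, and real crop surveys typically use more elaborate designs.

```python
from math import sqrt
from statistics import mean, stdev

def estimate_total(sample_yields, n_plots):
    """Estimate the total yield of n_plots plots from a simple random sample.

    Returns the estimated total and its standard error, using the
    finite population correction for sampling without replacement.
    """
    n = len(sample_yields)
    sample_mean = mean(sample_yields)
    total = n_plots * sample_mean
    fpc = sqrt(1 - n / n_plots)  # finite population correction
    std_error = n_plots * stdev(sample_yields) / sqrt(n) * fpc
    return total, std_error

# Hypothetical yields (in tonnes) measured on 4 plots sampled from 100.
sample = [2.0, 2.5, 3.0, 2.5]
total, std_error = estimate_total(sample, n_plots=100)

print(f"Estimated total yield: {total:.1f} tonnes")
print(f"Standard error: {std_error:.1f} tonnes")
```

The standard error gives a rough indication of the accuracy of the estimate: measuring only 4 plots instead of 100 saves time and cost, at the price of some uncertainty in the estimated total.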
One of the main reasons for collecting statistical data is to make comparisons. In order to make meaningful and valid comparisons, the data should relate to the same characteristic as far as possible. For instance, we can compare the monthly savings of male employees with those of female employees in a company. It is meaningless to compare the heights of 20-year-old boys with the heights of 20-year-old trees in a forest.
Having looked into the various definitions given by different authors to the term statistics in different contexts, it would be appropriate to define statistics in two senses:
“Statistics in the sense of data are numerical statements of facts capable of analysis and interpretation”.
“Statistics in the sense of science is the study of principles and methods used in the collection, presentation, analysis and interpretation of numerical data in any sphere of enquiry”.