1
Ahmadini AAH, Danish F, Jan R, Rather AA, Raghav YS, Ali I. Unlocking the secrets of apple harvests: Advanced stratification techniques in the Himalayan region. Heliyon 2024; 10:e31693. [PMID: 38845877 PMCID: PMC11153167 DOI: 10.1016/j.heliyon.2024.e31693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 05/20/2024] [Accepted: 05/20/2024] [Indexed: 06/09/2024]
Abstract
This study focuses on standardizing sampling techniques and comparing various methods of sample allocation to effectively estimate apple area and production in the Himalayan region of India. We investigate different stratification tools. In formulating a sampling plan using information gathered from select orchardists in the locale during the 2016-17 period, it becomes essential to explore diverse methodologies to define the most suitable stratum boundaries, ascertain the requisite number of strata, and identify the optimal sample size. The stratification process, underpinned by the "Area under Apple" variable, which demonstrates a pronounced association with apple production, assumes a central role in this endeavor. Several methods are utilized to construct strata, such as equalizing strata totals, cumulative equalization, equalization of ½{r(x) + f(x)} and equalization of f(x). We assess their efficiencies in estimating total apple production in the study district. The combination of the "Cum f(x)" method with Neyman allocation demonstrates the lowest variance and the highest efficiency within a range of 2-4 strata, coupled with an increase in sample size from 10 to 40. Consequently, it can be inferred that the "Cum f(x)" method, particularly with L > 2, is preferable for estimating apple production in the Himalayan region of India.
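As an illustration of the stratum construction and allocation steps named in the abstract above, the following is a minimal Python sketch, not the authors' implementation. It assumes that the "Cum f(x)" rule refers to the classical Dalenius-Hodges cumulative square-root-of-frequency rule, which the abstract does not confirm. The function names, the bin count, and the synthetic gamma-distributed "area" values are hypothetical.

```python
import numpy as np

def cum_sqrt_f_boundaries(x, n_strata, n_bins=50):
    """Cut points that split the cumulative sqrt(frequency) into equal parts.

    Assumption: this is the textbook Dalenius-Hodges rule, used here only
    as a stand-in for the "Cum f(x)" method mentioned in the abstract.
    """
    counts, edges = np.histogram(x, bins=n_bins)
    cum = np.cumsum(np.sqrt(counts))
    # Equally spaced targets on the cumulative sqrt(f) scale.
    targets = cum[-1] * np.arange(1, n_strata) / n_strata
    idx = np.searchsorted(cum, targets)
    # Use the upper edge of the bin in which each target is reached.
    return edges[idx + 1]

def neyman_allocation(x, boundaries, n):
    """Neyman allocation: n_h proportional to N_h * S_h (fractional sizes)."""
    strata = np.digitize(x, boundaries)
    n_strata = len(boundaries) + 1
    N_h = np.array([np.sum(strata == h) for h in range(n_strata)])
    S_h = np.array([
        x[strata == h].std(ddof=1) if N_h[h] > 1 else 0.0
        for h in range(n_strata)
    ])
    weights = N_h * S_h
    return n * weights / weights.sum()

# Synthetic stratification variable (hypothetical, not the survey data).
rng = np.random.default_rng(0)
area = rng.gamma(shape=2.0, scale=1.5, size=1000)
cuts = cum_sqrt_f_boundaries(area, n_strata=3)
alloc = neyman_allocation(area, cuts, n=40)
print("boundaries:", cuts)
print("allocation:", alloc.round(1))
```

The allocation is returned as fractional sample sizes; rounding to integers and enforcing a minimum of two units per stratum are left out of this sketch.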
Affiliation(s)
- Abdullah Ali H. Ahmadini
  - Department of Mathematics, College of Science, Jazan University, P.O. Box 114, Jazan 45142, Kingdom of Saudi Arabia
- Faizan Danish
  - Department of Mathematics, School of Advanced Sciences, Vellore Institute of Technology-Andhra Pradesh (VIT-AP) University, Inavolu, Beside AP Secretariat, Amaravati AP-522237, India
- Rafia Jan
  - Department of Statistics, Govt. Degree College Bejbehara, Anantnag, J&K, India
- Aafaq A. Rather
  - Symbiosis Statistical Institute, Symbiosis International (Deemed University), Pune, 411004, India
- Yashpal Singh Raghav
  - Department of Mathematics, College of Science, Jazan University, P.O. Box 114, Jazan 45142, Kingdom of Saudi Arabia
- Irfan Ali
  - Department of Statistics and Operations Research, Aligarh Muslim University, Aligarh, 202002, India
2
Reddy KG, Khan M. Constructing efficient strata boundaries in stratified sampling using survey cost. Heliyon 2023; 9:e21407. [PMID: 37964820 PMCID: PMC10641212 DOI: 10.1016/j.heliyon.2023.e21407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 10/03/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
For maximum precision in population parameter estimation under the Stratified sampling design, the optimum strata boundaries (OSB) could be constructed based on a continuous study variable rather than a set of categorical variables. If constructed optimally, the OSB results in homogenous units within each stratum leading to optimal stratum sample sizes (OSS) as well. The OSB and OSS may not remain optimum if the problem is considered in terms of a fixed total sample size, especially when a survey design involves a fixed budget. This article suggests a methodology for computing the OSB and OSS when the per unit stratum measurement costs for the survey or its probability density function are known. To plan for such a stratified survey, we demonstrate a design-based stratification empirically by using Wave 18 of the HILDA Survey general release dataset where we estimate the mean level of Gamma-distributed annual total disposable income in Australia, which could potentially be an important variable for policy decision-making. We also provide numerical illustrations for hypothetical study variables that follow exponential and right-triangular distributions respectively. The findings indicate that the suggested method is satisfactory in the sense that it is either more efficient or relatively comparable with other methods aimed at improving the accuracy of population parameter estimates. The proposed technique has been implemented in the updated stratifyR package.
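To make the role of per-unit stratum costs concrete, here is a minimal Python sketch of the textbook cost-optimal allocation for a linear cost function, in which the stratum sample size is proportional to N_h S_h / sqrt(c_h). This is an assumption-laden illustration only: it is not the method proposed in the article and not the stratifyR interface, and the function name and the example stratum sizes, standard deviations, and costs are hypothetical.

```python
import numpy as np

def cost_optimal_allocation(N_h, S_h, c_h, budget, fixed_cost=0.0):
    """Stratum sample sizes minimizing variance for a fixed total budget.

    Assumes the linear cost model: budget = fixed_cost + sum(c_h * n_h).
    Returns fractional sample sizes; rounding is left to the caller.
    """
    N_h, S_h, c_h = (np.asarray(a, dtype=float) for a in (N_h, S_h, c_h))
    variable_budget = budget - fixed_cost
    numerator = N_h * S_h / np.sqrt(c_h)
    denominator = np.sum(N_h * S_h * np.sqrt(c_h))
    return variable_budget * numerator / denominator

# Hypothetical strata: sizes, standard deviations, per-unit costs.
n_h = cost_optimal_allocation(
    N_h=[500, 300, 200],
    S_h=[2.0, 5.0, 10.0],
    c_h=[1.0, 4.0, 9.0],
    budget=1000.0,
)
print("allocation:", n_h.round(1))
print("total cost:", float(np.dot(n_h, [1.0, 4.0, 9.0])))
```

When all per-unit costs are equal, this reduces to Neyman allocation, which is why a boundary set that is optimal under a fixed total sample size need not remain optimal under a fixed budget.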
Affiliation(s)
- Karuna G. Reddy
  - School of Information Technology, Engineering, Mathematics and Physics, The University of the South Pacific, Suva, Fiji
- M.G.M. Khan
  - School of Information Technology, Engineering, Mathematics and Physics, The University of the South Pacific, Suva, Fiji
3
Optimum stratum boundaries and sample sizes for Covid-19 data in Egypt. PLoS One 2022; 17:e0271220. [PMID: 35900989 PMCID: PMC9333231 DOI: 10.1371/journal.pone.0271220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2021] [Accepted: 06/24/2022] [Indexed: 11/19/2022]
Abstract
Stratified random sampling is an effective sampling technique for estimating population characteristics. The determination of strata boundaries and the allocation of sample size to the strata are two of the most critical factors in maximizing the precision of the estimates. Most surveys are conducted under severe budget constraints, and a specific time is required to finish the survey, so cost and time are two important objectives in most surveys. The study suggests a mathematical goal programming model for determining optimum stratum boundaries for an exponential study variable under a multiple-objective model when cost and time are under consideration. Compared to other techniques, goal programming has many advantages in resource planning: it determines the resources required to satisfy the desired goals, assesses the effectiveness of the available resources, and provides the best solutions under different amounts of resources. In addition, the paper uses data on Covid-19 to evaluate the performance of the suggested model for the exponential distribution. The study divides the number of new cases into small, medium and high numbers. It also compares the results with the findings in the reports of the World Health Organization. The suggested mathematical goal programming model reveals that Egypt was exposed to three waves of infection during the interval 5/3/2020 to 12/8/2021. These results match the actual Covid-19 waves in Egypt.
4
Amorim G, Tao R, Lotspeich S, Shaw PA, Lumley T, Shepherd BE. Two-Phase Sampling Designs for Data Validation in Settings with Covariate Measurement Error and Continuous Outcome. JOURNAL OF THE ROYAL STATISTICAL SOCIETY. SERIES A, (STATISTICS IN SOCIETY) 2021; 184:1368-1389. [PMID: 34975235 PMCID: PMC8715909 DOI: 10.1111/rssa.12689] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Measurement errors are present in many data collection procedures and can harm analyses by biasing estimates. To correct for measurement error, researchers often validate a subsample of records and then incorporate the information learned from this validation sample into estimation. In practice, the validation sample is often selected using simple random sampling (SRS). However, SRS leads to inefficient estimates because it ignores information on the error-prone variables, which can be highly correlated to the unknown truth. Applying and extending ideas from the two-phase sampling literature, we propose optimal and nearly-optimal designs for selecting the validation sample in the classical measurement-error framework. We target designs to improve the efficiency of model-based and design-based estimators, and show how the resulting designs compare to each other. Our results suggest that sampling schemes that extract more information from the error-prone data are substantially more efficient than SRS, for both design- and model-based estimators. The optimal procedure, however, depends on the analysis method, and can differ substantially. This is supported by theory and simulations. We illustrate the various designs using data from an HIV cohort study.
Affiliation(s)
- Gustavo Amorim
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
- Ran Tao
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
- Sarah Lotspeich
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
- Pamela A. Shaw
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, PA, USA
- Thomas Lumley
  - Department of Statistics, University of Auckland, Auckland, New Zealand
- Bryan E. Shepherd
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
5
Wang J, Gao X, Chen J, Dai X, Tian S, Chen Y. An evaluation of observer monitoring program designs for Chinese tuna longline fisheries in the Pacific Ocean using computer simulations. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2021; 28:12628-12639. [PMID: 33085010 DOI: 10.1007/s11356-020-11266-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2020] [Accepted: 10/14/2020] [Indexed: 06/11/2023]
Abstract
This paper uses computer simulation to evaluate the performance of different observer coverage rates and 9 possible sampling designs for estimating the total catch of target and non-target species in Chinese tuna longline fisheries in the Pacific Ocean. The stratified random sampling designs include different stratification schemes (based on target species or fishing areas) with different strategies for allocating observers. The observer data from 103 vessels between 2010 and 2019 were assumed to be the "true" sampling population. We concluded that the accuracy of catch estimates had a significant positive relationship with species detectability and observer coverage rate. On average, the accuracy improved by 50% when the coverage rate increased from 5 to 20%. Current simple random sampling in Chinese tuna longline fisheries is less efficient for monitoring many species. Stratified sampling designs based on the target species tended to yield the most accurate estimates of the total catch. Allocating the observers based on the scale of the fleets in different strata seemed to be less efficient. The proportion of observers between different fleets should be adjusted according to different monitoring objectives. In general, a large proportion of observers is recommended to be allocated onboard vessels targeting bigeye tuna (Thunnus obesus). This study has the potential to contribute significantly to future designs of observer monitoring programs in the Chinese tuna longline fishery and many other fisheries.
Affiliation(s)
- Jiaqi Wang
  - College of Marine Sciences, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
  - School of Marine Sciences, University of Maine, Orono, ME, 04469, USA
- Xiaodi Gao
  - College of Marine Sciences, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
- Jessica Chen
  - School of Marine Sciences, University of Maine, Orono, ME, 04469, USA
- Xiaojie Dai
  - College of Marine Sciences, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
  - National Distant-Water Fisheries Engineering Research Center, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
  - Key Laboratory of Sustainable Exploitation of Oceanic Fisheries Resources, Ministry of Education, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
- Siquan Tian
  - College of Marine Sciences, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
  - National Distant-Water Fisheries Engineering Research Center, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
  - Key Laboratory of Sustainable Exploitation of Oceanic Fisheries Resources, Ministry of Education, Shanghai Ocean University, 999 Huchenghuan Road, Shanghai, 201306, China
- Yong Chen
  - School of Marine Sciences, University of Maine, Orono, ME, 04469, USA
6
Reddy KG, Khan MGM. stratifyR: An R Package for optimal stratification and sample allocation for univariate populations. AUST NZ J STAT 2020. [DOI: 10.1111/anzs.12301] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Affiliation(s)
- K. G. Reddy
  - Centre for Social Research & Methods, Australian National University, Canberra, ACT 2601, Australia
- M. G. M. Khan
  - School of Computing, Information & Mathematical Sciences, The University of the South Pacific, Suva, Fiji
7
Comparison of sampling effort allocation strategies in a stratified random survey with multiple objectives. AQUACULTURE AND FISHERIES 2020. [DOI: 10.1016/j.aaf.2020.02.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]