Bootstrapping in Statistics – JACK TRAINER (2024)

By Jack Trainer

Introduction

It is usually impossible to measure every member of a population in order to work out its properties. Instead, we take a random sample from the population and use statistical inference to estimate those properties. Note: “population” here doesn’t necessarily mean a group of people; it can mean a group of objects, amounts, events and so on.

Suppose we have a large collection of coloured balls and we want to infer a statistic, such as the mean diameter of the balls, using a sample of nine balls. If we took the mean of this sample and then took a new sample of nine balls, we would expect the two means to differ. If we took many new samples, we would begin to build up a distribution of sample means, known as the sampling distribution. The sampling distribution matters because it shows how much the sample mean varies from one sample to the next, and therefore how typical or unusual any one sample mean is for that population.
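The idea of a sampling distribution can be sketched with a short simulation. This is only an illustration: the population of ball diameters below is invented (uniform between 5 and 15), and the function name `sample_means` is a hypothetical helper, not something from the original example.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples from the population and return their means."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical population of 10,000 ball diameters, invented for illustration.
population_rng = random.Random(42)
population = [population_rng.uniform(5.0, 15.0) for _ in range(10_000)]

# Take 2,000 samples of nine balls each and record the mean of every sample.
means = sample_means(population, sample_size=9, n_samples=2_000)

print(f"population mean:          {statistics.mean(population):.2f}")
print(f"mean of the sample means: {statistics.mean(means):.2f}")
print(f"spread of sample means:   {statistics.stdev(means):.2f}")
```

The list `means` is an approximation to the sampling distribution of the mean for samples of nine balls. In practice we cannot do this, because we only ever have one sample, which is the problem the rest of the article addresses.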

In reality we only take a single sample, but we can think of its mean as one of the many sample means we could have obtained. In many cases the sampling distribution of the mean is expected to be approximately normal, with mean equal to the population mean. This is justified by the Central Limit Theorem, provided the sample is reasonably large. In this case, standard theoretical results let us calculate a quantity known as the standard error, which indicates how far the sample mean typically falls from the true population mean.
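For the mean, the usual theoretical estimate of the standard error is the sample standard deviation divided by the square root of the sample size, s / sqrt(n). A minimal sketch, assuming a hypothetical sample of nine diameters (the values are invented for illustration and are not the data behind the article's figures):

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample of nine ball diameters, invented for illustration.
diameters = [6.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 11.5, 13.0]

print(f"sample mean:    {statistics.mean(diameters):.2f}")
print(f"standard error: {standard_error_of_mean(diameters):.2f}")
```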

But what can we do when the sampling distribution is not expected to look like a normal distribution, as can be the case for other statistics of interest such as the median, or when we simply can’t assume that it will be normal? In these cases we can use a technique known as bootstrapping: we resample the sampled data many times to generate an approximate sampling distribution for a given statistic, from which we can calculate a standard error for that statistic.

Bootstrapping

Let’s go back to the example of the coloured balls to demonstrate how bootstrapping can be used to generate a sampling distribution for the median diameter of the balls. Imagine we have the original sampled dataset as a physical set of balls in a glass jar. To generate one bootstrap dataset, imagine grabbing a ball out of the jar at random, noting its diameter and then putting it back. This process is repeated until you have noted down a set of diameters that is the same size as the original sample. This set may contain repeated observations of the same ball, because each ball was placed back into the jar between selections. This method of sampling is called sampling with replacement.
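Drawing one bootstrap dataset can be sketched in a few lines. The nine diameters are hypothetical values invented for illustration, and `bootstrap_sample` is an assumed helper name. Python's `random.choices` samples with replacement, which matches putting each ball back in the jar.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) observations from data, with replacement."""
    return rng.choices(data, k=len(data))

# Hypothetical sample of nine ball diameters, invented for illustration.
diameters = [6.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 11.5, 13.0]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
resample = bootstrap_sample(diameters, rng)

# Some diameters will usually appear more than once and others not at all.
print(sorted(resample))
```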

[Figure: an example dataset generated by sampling with replacement]

An example dataset generated by sampling with replacement is shown above. The next step is to calculate the statistic of interest for this dataset. In this case, the median of this sample is 9. Now comes the key part of bootstrapping: repeating the resampling and calculating the statistic for “B” datasets. Here B is a placeholder for a large number. The choice of B is up to you, but typically it is 10,000 or more. This may seem like a very large number, but a computer can perform all the sampling and calculation in seconds. The B medians form the bootstrap distribution of the median, and their standard deviation, known as the bootstrap standard error, gives us an estimate of the standard error of the median. Carrying out the bootstrapping procedure for the set of balls gives a bootstrap standard error of the median of 1.86.
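The whole procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the code behind the article's result: the diameters are hypothetical, `bootstrap_standard_error` is an assumed helper name, and the number it prints will not match the 1.86 quoted above because the original data are not given.

```python
import random
import statistics

def bootstrap_standard_error(data, statistic, n_resamples=10_000, seed=0):
    """Standard deviation of the statistic over n_resamples bootstrap datasets."""
    rng = random.Random(seed)
    estimates = [
        # Resample with replacement, same size as the original sample.
        statistic(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    ]
    return statistics.stdev(estimates)

# Hypothetical sample of nine ball diameters, invented for illustration.
diameters = [6.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 11.5, 13.0]

se_median = bootstrap_standard_error(diameters, statistics.median)
print(f"bootstrap standard error of the median: {se_median:.2f}")
```

Passing the statistic in as a function means the same sketch works for other statistics, for example `statistics.mean` in place of `statistics.median`.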

Conclusion

Bootstrapping is an incredibly intuitive and powerful tool in statistics, but it is important to note that it does not generate new data out of the blue. The central assumption of bootstrapping is that the sampled data you work with is representative of the population as a whole. When this is true, we can resample the sampled data to get an idea of the range of different samples that could be obtained from the population, and so build an approximate sampling distribution.

Keep in mind that bootstrapping is not just useful for calculating standard errors; it can also be used to construct confidence intervals and perform hypothesis tests. So be sure to keep bootstrapping in mind when you are faced with data that doesn’t appear to be workable with traditional techniques.
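As one example of a confidence interval, a simple percentile interval takes the middle 95% of the bootstrap estimates. The sketch below is only an illustration of that one approach (other bootstrap intervals exist), with hypothetical data, an assumed helper name and a simple index-based choice of percentiles.

```python
import random
import statistics

def bootstrap_percentile_interval(data, statistic, level=0.95,
                                  n_resamples=10_000, seed=0):
    """Percentile interval: cut off (1 - level) / 2 of the estimates in each tail."""
    rng = random.Random(seed)
    estimates = sorted(
        statistic(rng.choices(data, k=len(data)))
        for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * n_resamples)]
    upper = estimates[int((1.0 - alpha) * n_resamples) - 1]
    return lower, upper

# Hypothetical sample of nine ball diameters, invented for illustration.
diameters = [6.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 11.5, 13.0]

lower, upper = bootstrap_percentile_interval(diameters, statistics.median)
print(f"95% percentile interval for the median: [{lower}, {upper}]")
```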

Relevant Links

Tutorial on bootstrap hypothesis testing in R

Tutorial on bootstrap confidence intervals in R

Original bootstrap paper
