Introduction to Bootstrap plot - GeeksforGeeks (2024)

Last Updated : 24 Jan, 2023

Before getting into Bootstrap plot, let us first understand what Bootstrapping (or Bootstrap sampling) is all about.

Bootstrap Sampling: It is a method in which we repeatedly draw samples, with replacement, from a data set in order to estimate a population parameter such as the mean or the standard deviation.
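The definition above can be sketched in a few lines of NumPy. This is only an illustration: the data are made up, and the helper name `bootstrap_statistic` is our own, not a library function.

```python
import numpy as np

# Hypothetical observed sample; the values are made up for illustration.
rng = np.random.default_rng(0)
sample = rng.normal(loc=70.0, scale=10.0, size=50)

def bootstrap_statistic(data, statistic, n_resamples, rng):
    """Evaluate `statistic` on `n_resamples` resamples of `data`."""
    n = len(data)
    estimates = np.empty(n_resamples)
    for i in range(n_resamples):
        # Draw n observations from the data WITH replacement.
        resample = rng.choice(data, size=n, replace=True)
        estimates[i] = statistic(resample)
    return estimates

boot_means = bootstrap_statistic(sample, np.mean, 1000, rng)
print("sample mean:", sample.mean())
print("mean of bootstrap means:", boot_means.mean())
```

The spread of `boot_means` is what a bootstrap plot visualizes.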

A bootstrap plot is a graphical representation of the distribution of a statistic calculated from a sample of data. It is often used to visualize the variability and uncertainty of a statistic, such as the mean or standard deviation, by showing the distribution of the statistic over many bootstrapped samples of the data.

When the bootstrap plot is drawn as a histogram, the x-axis represents the values of the statistic and the y-axis represents the frequency of those values. The height of each bar is the number of bootstrapped samples whose statistic fell in that range of values, so the shape of the histogram shows the distribution of the statistic over the bootstrapped samples.

The bootstrap plot is a powerful tool for understanding the uncertainty in a statistic, especially when the underlying distribution of the data is unknown or complex. It can also be used to generate confidence intervals for a statistic and to compare the distributions of different statistics.

Note that the bootstrap is a resampling technique that estimates the uncertainty of a statistic from a single sample, without assuming a particular parametric form for the underlying distribution of the data. It can be used to estimate standard errors and confidence intervals, and to perform hypothesis tests.
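As a rough sketch of how a standard error and a confidence interval can be read off the bootstrapped values, the example below uses the percentile method on the median. The data, the 2000 resamples, and the 95% level are assumptions chosen for illustration, and the percentile interval is only one of several bootstrap interval constructions.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.exponential(scale=2.0, size=200)  # made-up skewed sample

# Index matrix: each row selects one resample of the same size as the data.
idx = rng.integers(0, len(data), size=(2000, len(data)))
boot_medians = np.median(data[idx], axis=1)

# Standard error: spread of the statistic across resamples.
std_error = boot_medians.std(ddof=1)
# Percentile interval: middle 95% of the bootstrapped medians.
ci_low, ci_high = np.percentile(boot_medians, [2.5, 97.5])

print(f"median = {np.median(data):.3f}, SE = {std_error:.3f}")
print(f"95% percentile interval = ({ci_low:.3f}, {ci_high:.3f})")
```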

Bootstrap plot: It is a graphical method used to assess the uncertainty of any desired statistic of a population. It serves as a visual alternative to an analytically derived confidence interval, and the resampled values behind the plot can also be used to calculate the statistic's uncertainty numerically.

Structure

  • x-axis: Subsample number.
  • y-axis: Computed value of the desired statistic for a given subsample.
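A plot with this structure can be sketched directly with Matplotlib. The data, the number of subsamples (200), and the subsample size (50) are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)
data = rng.uniform(size=500)  # made-up data

n_subsamples, subsample_size = 200, 50
# One mean per subsample, each drawn with replacement from the data.
means = np.array([
    rng.choice(data, size=subsample_size, replace=True).mean()
    for _ in range(n_subsamples)
])

fig, ax = plt.subplots()
ax.plot(np.arange(1, n_subsamples + 1), means)
ax.set_xlabel("Subsample number")
ax.set_ylabel("Subsample mean")
```

To view the figure, save it with `fig.savefig(...)`, or remove the `matplotlib.use("Agg")` line and call `plt.show()`.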

Need for a Bootstrap plot:

Commonly, we can calculate the uncertainty of a statistic of a population mathematically, using confidence intervals. However, in many cases, the uncertainty formula that is derived is mathematically intractable. In such cases, we use the Bootstrap plot.

Suppose we have 5000 people in a park, and we need to find the average weight of the whole population. It is not feasible to measure the weight of each individual and then take the average. This is where bootstrap sampling comes into the picture.

Instead, we pick a group of 5 people at random from the population and compute the group's mean weight. We repeat this process, say, 8-10 times and average the group means. This gives a reasonable estimate of the average weight of the population with far less effort.
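The park example can be simulated as follows. The weights are generated from a normal distribution purely for illustration (in practice they would be unknown), and we use 10 groups of 5 people.

```python
import numpy as np

rng = np.random.default_rng(3)
# Simulated weights (kg) for the 5000 people; an assumption for this sketch.
population = rng.normal(loc=70.0, scale=12.0, size=5000)

# Mean weight of 10 randomly chosen groups of 5 people each.
group_means = np.array([
    rng.choice(population, size=5, replace=True).mean()
    for _ in range(10)
])
estimate = group_means.mean()

print("group means:", np.round(group_means, 1))
print("estimate:", round(estimate, 1))
print("true population mean:", round(population.mean(), 1))
```

With only 50 measurements in total the estimate is still noisy, so taking more or larger groups narrows the gap to the true mean.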

Intuition:

Let us consider an example to understand how the Bootstrap plot makes it easier to obtain critical information from a large population. Say we have a data set of 3000 randomly generated uniform numbers. We take a sub-sample of 30 numbers and find its mean. We do this again for another random sub-sample, and so on.

We then draw a bootstrap plot of these sub-sample means, and just by looking at it we can give a good estimate of the mean of all 3000 numbers. A bootstrap plot also reveals other useful information, such as:

  • which sub-sample had the lowest variance, or
  • which sub-sample creates the narrowest confidence interval, etc.
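The same experiment can be sketched numerically. The number of sub-samples (100) is an assumption, since the text does not fix it.

```python
import numpy as np

rng = np.random.default_rng(4)
numbers = rng.uniform(size=3000)

n_subsamples = 100  # assumed; not specified in the text
# Each row is one sub-sample of 30 numbers drawn with replacement.
subsamples = rng.choice(numbers, size=(n_subsamples, 30), replace=True)
sub_means = subsamples.mean(axis=1)
sub_vars = subsamples.var(axis=1, ddof=1)

print("mean of all 3000 numbers:", numbers.mean())
print("average of sub-sample means:", sub_means.mean())
print("sub-sample with lowest variance:", int(sub_vars.argmin()))
```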

Implementation:

Python

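import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# 500 random numbers drawn uniformly from [0, 1)
s = pd.Series(np.random.uniform(size=500))

# Plot the bootstrapped mean, median, and mid-range of the series
pd.plotting.bootstrap_plot(s)
plt.show()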

Output

Fig 1: Bootstrap plot generated by the code above.
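The call above relies on the defaults of `pandas.plotting.bootstrap_plot`. As a small variation, the sketch below passes the `size` (observations per subsample) and `samples` (number of subsamples) arguments explicitly. The values shown match the pandas defaults, and the seeded data are our own choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
s = pd.Series(rng.uniform(size=500))

# size: observations per subsample; samples: number of subsamples.
fig = pd.plotting.bootstrap_plot(s, size=50, samples=500)

# The returned Matplotlib figure holds one line plot and one histogram
# for each of the three statistics (mean, median, mid-range).
print(len(fig.axes))
```

The top row of the figure plots each statistic against the subsample number, and the bottom row shows the corresponding histograms.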

Limitations

  1. The bootstrap plot gives an estimate of the required information about the population, not the exact values.
  2. It is highly dependent on the given dataset. It gives poor results when many of the subsamples contain the same repeated observations.
  3. The bootstrap plot becomes ineffective when the information we want depends heavily on the tail values. [As shown in Fig 1]
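The third limitation can be illustrated with the sample maximum, a statistic that depends entirely on the tail. This is a sketch on made-up uniform data: no resample can ever contain a value larger than the largest observation, so the bootstrapped maxima pile up on a single value.

```python
import numpy as np

rng = np.random.default_rng(6)
data = rng.uniform(size=500)  # made-up data

# Each row of idx selects one resample of the same size as the data.
idx = rng.integers(0, len(data), size=(2000, len(data)))
boot_max = data[idx].max(axis=1)

# Fraction of resamples whose maximum is exactly the sample maximum.
share_at_sample_max = np.mean(boot_max == data.max())

print("sample maximum:", data.max())
print("largest bootstrapped maximum:", boot_max.max())
print("share of resamples equal to the sample maximum:", share_at_sample_max)
```

A resample of size n contains the largest observation with probability 1 - (1 - 1/n)^n, which is roughly 0.63 for large n, so the bootstrap distribution of the maximum says little about the true upper tail.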

Advantages of bootstrap:

  • It is a non-parametric method, which means it does not require any assumptions about the underlying distribution of the data.
  • It can be used to estimate standard errors and confidence intervals for a wide range of statistics.
  • It can be used to estimate the uncertainty of a statistic even when the sample size is small.
  • It can be used to perform hypothesis tests and compare the distributions of different statistics.
  • It is widely used in many fields such as statistics, finance, and machine learning.
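As an example of the hypothesis-testing use mentioned above, here is a minimal sketch of a two-sample bootstrap test for a difference in means. The two groups are made up, and the approach (shifting both groups to a common mean to impose the null hypothesis, then resampling each group) is one common recipe rather than the only way to build such a test.

```python
import numpy as np

rng = np.random.default_rng(7)
a = rng.normal(loc=0.0, scale=1.0, size=40)  # made-up group A
b = rng.normal(loc=1.5, scale=1.0, size=40)  # made-up group B, shifted by 1.5

observed = b.mean() - a.mean()

# Impose the null hypothesis of equal means by centring both groups
# on the pooled mean.
pooled_mean = np.concatenate([a, b]).mean()
a0 = a - a.mean() + pooled_mean
b0 = b - b.mean() + pooled_mean

n_resamples = 5000
diffs = np.empty(n_resamples)
for i in range(n_resamples):
    # Resample each group with replacement under the null hypothesis.
    diffs[i] = (rng.choice(b0, size=len(b0), replace=True).mean()
                - rng.choice(a0, size=len(a0), replace=True).mean())

# Two-sided p-value: how often a null difference is at least as extreme.
p_value = np.mean(np.abs(diffs) >= abs(observed))
print("observed difference:", observed)
print("bootstrap p-value:", p_value)
```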

Disadvantages of bootstrap:

  • It can be computationally intensive, especially when working with large datasets.
  • It may not be appropriate for all types of data, such as highly skewed or heavy-tailed distributions.
  • It may not be appropriate for estimating the uncertainty of statistics that have very large variances.
  • It may not be appropriate for estimating the uncertainty of statistics that are not smooth functions of the data (for example, the sample maximum) or whose variance differs widely between samples.
  • It may not always be a good substitute for other statistical methods, such as asymptotic methods, when large sample sizes are available.


prakharr0y

Article information

Author: Domingo Moore
