Confusion over df.sample()

Mark17 · Jan-15-2021, 07:02 PM

Hi all,

Here's some code:

import matplotlib.pyplot as plt
import pandas as pd
import sys

count_a = 0
count_b = 0

die = pd.Series([2, 4, 6, 8, 10, 12])
group = ['samp_a', 'samp_b']
while count_a < 1000 and count_b < 1000:
    samp_a = die.sample(10, replace=True)
    # samp_b = die.sample(1)
    print(samp_a)
    print('samp_a type is {}'.format(type(samp_a)))
    # print(samp_b)
    sys.exit()

Here's some sample output:

0     2
2     6
5    12
3     8
3     8
5    12
2     6
4    10
0     2
4    10
dtype: int64
samp_a type is <class 'pandas.core.series.Series'>

With die only including 2, 4, 6, 8, 10, and 12, how am I getting odd numbers printed in the output?

Also, why am I getting samples of _two_ numbers each? How do I change this?

Mark17 · Jan-15-2021, 07:06 PM

Never mind! It's printing the rows of a Series, so each line shows the index label (0 to 5 here) followed by the sampled value. I didn't realize that.

**buran** · Jan-15-2021, 07:12 PM

You draw only samp_a and then exit the script with sys.exit(), so the loop body runs just once.

The first column is the index of the die Series, not a sampled value.
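
To make the index-versus-value point concrete, here is a minimal sketch built on the same die Series. The random_state argument and the names values and renumbered are not in the original post; they are added only to make the illustration repeatable and easier to read.

```python
import pandas as pd

die = pd.Series([2, 4, 6, 8, 10, 12])

# random_state is an assumption added here so the draw is repeatable
samp_a = die.sample(10, replace=True, random_state=0)

# sample() keeps the original index labels (0-5), so print(samp_a)
# shows two columns: index label on the left, sampled value on the right.
print(samp_a)

# Just the sampled values, with no index attached
values = samp_a.to_numpy()
print(values)

# Same values, but with a fresh 0..9 index instead of the original labels
renumbered = samp_a.reset_index(drop=True)
print(renumbered)

# Print the Series without the index column at all
print(samp_a.to_string(index=False))
```

If only the numbers matter, to_numpy() or to_string(index=False) is probably the simplest route; reset_index(drop=True) is handy when you still want a Series but don't care which row of die each draw came from.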
