Statistics notes

09/07/2023





HYPOTHESIS
- Statement
- Tested in investigation

STATISTICAL ENQUIRY CYCLE
1- Planning = hypothesis, variables, recording data
2- Collecting = data source, methods, constraints
3- Processing / presenting = diagrams, calculations, tech?
4- Interpreting = conclusion
5- Communicating / evaluating = target audience, evaluate

Constraints: time limits, cost, ethical issues, confidentiality, convenience

USE OF TECHNOLOGY
1- Clean data (identify missing values, remove symbols, same format, remove outliers)
2- Order / sort data (categorise, remove extraneous columns / data)
3- Automate (calcs - avg., diagrams - visual representation)
Pros: saves time, reduces error, easier to correct mistakes, more visually appealing

VARIABLES
- Variable = anything that can change / be measured
- Explanatory (independent): variable that is changed & you have control of (x-axis)
- Response (dependent): variable measured & you don't have control of (y-axis)
- Extraneous (extra): any other variable - must be kept constant
- Univariate data = 1 variable
- Bivariate data = 2 variables
- Multivariate data = 3+ variables

DATA SOURCES
Primary data: data collected yourself (raw / unprocessed) (e.g. questionnaires, interviews, experiments, observations)
- Pros: gather data directly related to hypothesis = reliability
- Cons: time-consuming, expensive, impractical
Secondary data: already collected by someone else (processed) (e.g. newspaper, magazine, website, database, historical records, ONS)
- Pros: easy to find, quick, cheap, database often bigger
- Cons: source may be unreliable, may have no access, may not link directly to hypothesis

DATA TYPES
- Raw = unprocessed data
- Qualitative = non-numerical
- Quantitative = numerical
  > discrete = particular values (e.g. no. of ...)
  > continuous = any value on a scale (e.g. time, length, weight, height)
- Categorical scale -> sorted into categories (e.g. c1 = blue, c2 = brown) [qualitative]
- Ordinal / rank scale -> ranked / ordered data (e.g. marks in exam) [quantitative]

SIMULATION
Simulation = a way to model random real ... could happen...
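The first cleaning steps listed under use of technology (identify missing values, remove symbols, put values in the same format, then sort) could be sketched roughly as below. This is only an illustration: the function name `clean_values` and the sample readings are made up, and removing outliers is left out.

```python
import re

def clean_values(raw_values):
    """Return (cleaned numbers sorted ascending, count of missing entries)."""
    cleaned = []
    missing = 0
    for raw in raw_values:
        # Remove symbols and units: keep only digits, decimal point and minus sign.
        stripped = re.sub(r"[^0-9.\-]", "", raw)
        try:
            cleaned.append(float(stripped))  # same format for every value
        except ValueError:
            missing += 1  # blank or non-numeric entry is counted as missing
    return sorted(cleaned), missing

# Hypothetical raw readings, as they might be typed into a spreadsheet.
raw = ["12.5cm", " 13 ", "", "n/a", "14.0 cm"]
values, missing = clean_values(raw)
print(values, missing)
```

Counting the missing entries rather than silently dropping them keeps a record of how much data was lost during cleaning.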


Alternative transcript:

SIMULATION
- Models random real-life events w/ probabilities (done w/ random nos) to predict what could happen
- PROS: easier, cheaper, quicker, practical
- Results may not ...
Steps:
1- Choose random no. generator (dice, calc., computer)
2- Assign nos / outcomes to data (proportions should match probabilities)
3- Generate random no. / outcome
4- Match random no. / outcome to data
- Repeat simulation multiple times for variance indication (reliability)
EXAM Q:
1- Plan simulation based on a ...
2- Comment on suitability
- Use nos generated between range of assigned nos

RELIABILITY & VALIDITY
- Reliability = the extent to which repeated measurements give similar results -> consistency (larger samples = more reliable data)
- Validity = the extent to which an investigation measures accurately -> accuracy

CONTROL GROUPS
Control group = used to ensure the treatment is actually the cause
Steps:
1- Use random selection to select groups
2- Experimental gr. is treated
3- Control gr. is NOT treated
4- All extraneous variables kept constant (so only difference is explanatory variable)
5- Compare results from both groups
- Matched pairs = sample members are paired (both members in pair as similar as possible). One member from each pair is randomly selected & assigned to control or experimental gr.
  PROS: reduces variability, increases reliability, more valid
- Before and after = tests same members before & after

TABULATIONS
- Frequency table: data, tally, frequency. PRO: shows actual data values -> exact
- Total frequency = Σf
- Grouped frequency tables have class intervals (CI). PROS: easier to read, spot patterns & compare classes
- Class limits = UB & LB of class
- Class width (CW) -> continuous: UB - LB of gr.; discrete: LB of next gr. - LB of gr.
- No classes should overlap
- CI -> continuous: inequalities w/ no gaps; discrete: hyphens w/ gaps
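The simulation steps listed earlier in this transcript could be sketched as follows. The outcomes, their probabilities (0.5, 0.3, 0.2) and the function name `simulate` are hypothetical, chosen only to show how assigned numbers should match the proportions.

```python
import random
from collections import Counter

def simulate(assignments, trials, seed=None):
    """assignments maps each outcome to the range of numbers assigned to it."""
    rng = random.Random(seed)  # step 1: choose a random number generator
    low = min(r.start for r in assignments.values())
    high = max(r.stop for r in assignments.values()) - 1
    counts = Counter()
    for _ in range(trials):
        number = rng.randint(low, high)  # step 3: generate a random number
        for outcome, numbers in assignments.items():
            if number in numbers:  # step 4: match the number to its outcome
                counts[outcome] += 1
                break
    return counts

# Step 2: assign numbers so proportions match the (hypothetical) probabilities:
# 0.5 -> five of the digits 0-9, 0.3 -> three digits, 0.2 -> two digits.
assignments = {"bus": range(0, 5), "walk": range(5, 8), "car": range(8, 10)}

# Repeat the simulation to get an indication of variance (reliability).
for run in range(3):
    print(simulate(assignments, trials=100, seed=run))
```

Only numbers inside the assigned range are generated, which is the same check the notes ask for when commenting on suitability.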
Grouped frequency tables:
- Rounding continuous data can distort it
- Used when there's a lot of data; PRO: spot overall distribution
- CONS: lose accuracy of values, so calcs (e.g. avg.) will be only estimates; diagrams can be misleading
- Use smaller CI for bunched data & larger CI for spread data
Two-way tables: summarise bivariate data
Databases: real-life data using spreadsheet (for large amts of data); secondary data, easily accessible (online)
Interpreting:
- Identify values / categories
- Describe general trends
- Calc. totals / differences / percents
- Explain inconsistencies (e.g. percents don't add to 100% = rounding errors)

COMPARATIVE PIE CHARTS
- Pie chart = univariate data, proportion comparisons, visually appealing, divide data into categories (usually qualitative)
- Comparative pie charts are used to compare 2 data sets of different populations when total frequencies are different (drawing pie charts of same size can be misleading)
- Proportional: the same area is used for 1 unit of data, so area of pie chart = total frequency
- r2 = r1 x √(F2 ÷ F1), where F = total frequency
- Comparison: larger area = larger frequency (even if proportions are same)

SCATTER DIAGRAMS
- Used for bivariate data
- Variables: explanatory (independent / changing) = x-axis; response (dependent / measuring) = y-axis
- Correlation: positive = as x ↑, y ↑; negative = as x ↑, y ↓; no / zero = no relationship
- Linear correlation = points close in straight line
- Non-linear correlation = points close in a curve
- Equation of LOBF: aka regression line, to show relationship between 2 variables
Causation:
- Causation: when 1 variable CAUSES a change in another
- Correlation doesn't imply causation; correlation only shows a link between the variables
- There may be multiple factors that cause association
LOBF:
- Straight line, passes through most points, ignores outliers, has roughly same no. of points on either side
- The closer pts are to LOBF, the stronger the correlation
- Shouldn't extend beyond range of points
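For comparative pie charts, area is proportional to total frequency, so the radius of the second chart scales with the square root of the ratio of the totals. A small sketch of that calculation, with made-up totals and a hypothetical function name `comparative_radius`:

```python
import math

def comparative_radius(r1, total1, total2):
    """Radius of the second pie chart so that area is proportional to total frequency."""
    return r1 * math.sqrt(total2 / total1)

# Hypothetical data sets: 100 pupils drawn with radius 3 cm, second set has 400 pupils.
r2 = comparative_radius(3, 100, 400)
print(r2)

# Check: the ratio of the areas equals the ratio of the total frequencies.
area_ratio = (math.pi * r2 ** 2) / (math.pi * 3 ** 2)
print(area_ratio)
```

Four times the frequency gives twice the radius, not four times, because it is the area that represents the total.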
- Draw LOBF through mean point (x̄, ȳ)
Predicting values (using LOBF):
- Interpolation: within range of data given -> reliable
- Extrapolation: outside range of data given (extend LOBF) -> unreliable; trends may change
- Don't use if correlation is weak
Calculating regression line: y = ax + b
- a = gradient -> rate of increase of response variable in relation to explanatory variable
- b = y-intercept -> value of response variable when explanatory variable is zero
Drawing regression line:
- Through mean point (mean of x values, mean of y values)
- y-intercept
- Use gradient (plot co-ordinates)
- Calc. gradient (triangle: Δy ÷ Δx)
- Calc. y-intercept (extend LOBF to y-axis or use b = y1 - a x1)
SRCC vs PMCC:
- Between -1 & 1
- Closer to -1 = strong negative; closer to 1 = strong positive; closer to 0 = weak; zero = no correlation
- SRCC -> linear & non-linear (curved)
- PMCC -> linear only, not for ranked data

BOX PLOTS
- Show Q1, Q2 (median), Q3, IQR & lowest / highest values
- Show shape of distribution / spread / skew
- Each quartile = 25% (median = 50%, IQR = middle 50%)
- Whiskers => outside box
- Comparison: median -> more / less on avg.; spread -> more / less varied, consistent; skewness -> no skew, positive or negative
Skewness:
- Positive = mode < median < mean (bunched to left, greater spread above median)
- Negative = mean < median < mode (bunched to right, greater spread below median)
- Symmetrical (no skew) = mean = median = mode

HISTOGRAMS
- Continuous data from a grouped frequency table
- Used for lots of data; shows shape of distribution
- fd reflects concentration of data
- No gaps between bars
- Equal cw: x-axis = data; y-axis = f
- Unequal cw: x-axis = data; y-axis = fd
- fd = f ÷ cw
- Used for: identifying CI, estimating f above / below a certain value, commenting on distribution
Estimating frequencies:
1- Find bars that cover the range of data in q
2- Calculate frequency: f = fd x cw
3- Add frequencies
Distributions:
- To compare histograms, both must have same CI
- Describe distribution: symmetrical (mode = mean = med.), positive skew, negative skew
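The histogram rules fd = f ÷ cw and f = fd x cw can be turned into a short estimate of the frequency in a range. This is a sketch only: the class data and the function names are invented, and it assumes values are spread evenly within each class.

```python
def frequency_density(frequency, lower, upper):
    """fd = f / cw, where cw = upper bound - lower bound of the class."""
    return frequency / (upper - lower)

def estimate_frequency(classes, start, end):
    """Estimate the frequency between start and end.

    classes is a list of (lower bound, upper bound, frequency) tuples.
    """
    total = 0.0
    for lower, upper, frequency in classes:
        # Step 1: find how much of this bar lies inside the range.
        overlap = min(upper, end) - max(lower, start)
        if overlap > 0:
            # Step 2: f = fd x cw, using only the overlapping width.
            total += frequency_density(frequency, lower, upper) * overlap
    return total  # Step 3: the frequencies added together

# Hypothetical grouped data with unequal class widths.
classes = [(0, 10, 20), (10, 30, 40)]
print(estimate_frequency(classes, 5, 20))
```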
Distributions (cont.):
- Uniform: med. = mean (no mode)
- Bimodal: 2 modes
- Comparing histograms: same CI & fd scales, so need to have equal cw

MEASURES OF AVG. (avg. = measure of central tendency)
Mode -> value w/ highest frequency
- Pros: easy to use, always a data val., unaffected by extreme vals
- Cons: may not be a mode
- Modal class -> class w/ highest frequency
Median -> middle value
- Discrete: (n+1)/2 th value; continuous: n/2 th value
- Pros: unaffected by extreme vals, best for skewed data
- Cons: not always representative, may not be a data val.
(Arithmetic) mean -> Σx ÷ n; from a frequency table: Σfx ÷ Σf (x = mp of CI for grouped data)
- Pros: uses all data, most representative
- Cons: may not be a data val.
Weighted mean -> Σ(w x x) ÷ Σw; used for data w/ diff. vals / weights
Geometric mean -> n-th root of (v1 x v2 x ... x vn)
Linear transformation -> ... value from all, calc. mean, then reverse

CHANGES TO DATA
- Mode -> add / remove data => changes mode / bimodal (only if it changes which values appear most)
- Median -> add greater / remove smaller value => median ↑; add smaller / remove greater value => median ↓; add / remove one greater AND one smaller value => no change
- Mean -> add greater / remove smaller value => mean ↑; add smaller / remove greater value => mean ↓

MEASURES OF SPREAD (dispersion)
- Range -> spread; largest - smallest value
- IQR -> UQ - LQ; middle 50% of data
  LQ = 25% of data; UQ = 75% of data
  discrete: LQ = 1/4 (n+1)th value; UQ = 3/4 (n+1)th value
  continuous / grouped: LQ = 1/4 n th value; UQ = 3/4 n th value
- IPR -> interpercentile range; difference between 2 percentiles
  percentiles => divide data into 100 parts
  IPR from table -> use linear interpolation
  IPR = larger percentile - smaller percentile
- IDR -> interdecile range; difference between 2 deciles (usually 1st & 9th = 10th & 90th percentiles)
  deciles => divide data into 10 parts

COMPARING DATA SETS
- mean + SD or range
- median + IQR or IPR / IDR
- mode + range
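The weighted mean and geometric mean listed under the averages can be checked with a few lines of code. The marks, weights and growth multipliers here are hypothetical, and the function names are chosen just for this sketch.

```python
import math

def weighted_mean(values, weights):
    """Sum of (weight x value) divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

def geometric_mean(values):
    """n-th root of the product of n values."""
    return math.prod(values) ** (1 / len(values))

# Hypothetical exam: coursework mark 70 (weight 1), written paper mark 80 (weight 3).
print(weighted_mean([70, 80], [1, 3]))

# Hypothetical yearly multipliers for a price: doubles, then goes up 8 times.
print(geometric_mean([2, 8]))
```

The weighted mean is pulled towards the value with the larger weight, which is why it suits data where the items do not matter equally.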
- IDR = 9th decile - 1st decile
- SD -> measure of how far values are from mean (spread)
  smaller SD = data close to mean
  larger SD = data far from mean
  use mp for grouped data

OUTLIERS
- Data points that don't fit general pattern
- Show possible errors (e.g. in collecting / measuring data) or they could be valid values
- Outliers can distort data
- Method 1: outliers are < LQ - 1.5 x IQR or > UQ + 1.5 x IQR; mark outliers w/ a cross & draw whiskers to values that aren't outliers
- Method 2: outliers = outside μ ± 3σ
- Using both methods may give different values for outliers from the same data

WEIGHTED & INDEX NOS.
Index numbers - compare price changes over time
SIMPLE INDEX NUMBER
- Compares price change of item w/ base yr price (base yr = 100)
- Index no. = (price ÷ base yr price) x 100
- > 100 = value has increased
- < 100 = value has decreased
WEIGHTED INDEX NUMBER
- Takes into account proportions (reflects importance of different items)
- Weighted index no. = Σ(index no. x weight) ÷ Σ weights

QUALITY ASSURANCE
- Taking regular samples to make sure all products are of the same quality & standard
1- Regular samples taken
2- Sample mean, median, range calculated
3- Plot samples on control charts
Control chart:
- Target line = the mean (μ)
- Warning limits -> ±2 SD (95%) (only 5% should fall outside)
- Action limits -> ±3 SD (99.7%) (almost all should fall within)
- Between warning & action limits -> take another sample (if sample also outside WL, stop production)
- Above / below action limits -> stop production
- Mean of population should = sample mean
- SD of population should be > sample SD

CAPTURE-RECAPTURE
- Used to estimate large / moving populations
Assumptions:
- Population hasn't changed (no births / deaths)
- Marks / tags not lost
- Sample size is large enough (& representative)
- Population thoroughly mixed (prob. of being caught is equally likely for each)
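The two outlier rules from the notes (1.5 x IQR beyond the quartiles, and more than 3 SDs from the mean) can be compared on one data set. The sketch below uses Python's `statistics.quantiles` with `method="exclusive"`, which places quartiles at (n+1)-based positions like the discrete rule in the notes; using the population SD (`pstdev`) and the sample values are assumptions of this example.

```python
import statistics

def iqr_outliers(data):
    """Method 1: values below LQ - 1.5 x IQR or above UQ + 1.5 x IQR."""
    lq, _, uq = statistics.quantiles(data, n=4, method="exclusive")
    iqr = uq - lq
    low, high = lq - 1.5 * iqr, uq + 1.5 * iqr
    return [x for x in data if x < low or x > high]

def sd_outliers(data):
    """Method 2: values more than 3 SDs from the mean (population SD assumed)."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return [x for x in data if abs(x - mean) > 3 * sd]

# Hypothetical measurements with one suspiciously large value.
data = [10, 10, 11, 11, 11, 12, 12, 12, 13, 13, 20]
print(iqr_outliers(data))
print(sd_outliers(data))
```

On this data the two methods disagree, which matches the warning in the notes that they may give different outliers for the same data.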
PROBABILITY
- P(event) = no. of event outcomes ÷ total no. of outcomes
- Probabilities add up to 1
Relative frequency:
- More trials = more accurate
- Expected frequency = P(event) x no. of trials
- Helps spot bias
Sample space diagrams:
- Sample space = list of all possible outcomes (in a table)
Conditional probability:
- P(B|A) = P(A ∩ B) ÷ P(A)
Addition law:
- Mutually exclusive events (or): P(A ∪ B) = P(A) + P(B)
- Non-mutually exclusive events: P(A ∪ B) + P(A ∩ B) = P(A) + P(B)
- Exhaustive events: at least one must happen (as it includes all possible events)
Independent events:
- When events have no effect on each other
- P(A ∩ B) = P(A) x P(B)
- P(A) = P(A|B) -> as events don't depend on each other
Risk:
- Absolute risk: probability
- Relative risk: how many times more likely it is to happen to one group than another = P(happening in group) ÷ P(happening in another group) (always a number, not ...)

NORMAL DISTRIBUTION
Conditions:
- Bell-shaped curve
- Continuous data
- Symmetrical distribution: mean = median = mode
- Larger SD = lower curve, smaller SD = higher curve
Notation: X ~ N(μ, σ²), where μ = mean, σ = SD, σ² = variance
- μ = line of symmetry
- 3 SDs either side of mean
- 1 SD from mean (μ ± σ) = 68%
- 2 SDs from mean (μ ± 2σ) = 95%
- 3 SDs from mean (μ ± 3σ) = 99.7%
- No. of SDs from mean = (value - mean) ÷ SD
Sketching:
1- Work out values from μ up to 3 SDs either side
2- Draw x-axis
3- Draw bell-shaped curve
4- If drawing onto another curve: smaller SD = higher peak, larger SD = lower peak
e.g. mean = 1000, SD = [...]; calculate probability of data being between [...] & [...]: (value - mean) ÷ SD gives -2 and 2, so data lies within 2 SDs of mean, for which probability is 95%.
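The "no. of SDs from mean" calculation and the 68 / 95 / 99.7% rule can be checked with Python's `statistics.NormalDist`. The SD of 15 and the bounds 970 and 1030 below are assumed values for illustration only, since the figures in the worked example are not legible in the transcript.

```python
from statistics import NormalDist

def sds_from_mean(value, mean, sd):
    """No. of SDs from the mean = (value - mean) / SD."""
    return (value - mean) / sd

def probability_between(low, high, mean, sd):
    """Probability that a normally distributed value lies between low and high."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(high) - dist.cdf(low)

# Assumed values for illustration (not taken from the notes): mean 1000, SD 15.
mean, sd = 1000, 15
print(sds_from_mean(970, mean, sd), sds_from_mean(1030, mean, sd))

# Both bounds are 2 SDs from the mean, so this should be close to 95%.
print(probability_between(970, 1030, mean, sd))
```

The 95% in the notes is a rounded figure, so the computed probability will be slightly different from exactly 0.95.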