
PsychAss: Psychometric Properties & Principles [30%]
Quiz by Gerard Dimaano
During a government scholarship screening, students take a standardized math and logic exam in a gymnasium. The proctors follow a strict manual and are interchangeable. The goal is to rank applicants by numerical scores. What best describes this situation?
A clinical psychologist reviews a client’s case history, interviews their family, administers cognitive tests, and observes their behavior before concluding the client has PTSD. What process is most clearly being used?
While assessing a child with suspected ADHD, a school psychologist collects data at home and school, focusing on how inattention appears as it happens in each setting. What type of assessment is this?
A BPO company wants to promote employees who can handle pressure. It administers a cognitive speed test scored by software and ranks employees by their raw results. What process is this?
Which of the following best distinguishes psychological assessment from testing?
You’re asked to evaluate the cognitive functioning of a stroke patient remotely using a tool administered through a secure video platform. What type of approach is this?
A psychometrician is designing a test for anxiety. She identifies content areas, drafts multiple-choice items, and defines how each response will be scored. What is she defining?
In a dynamic assessment setup, a child is first tested for reading comprehension, then given a brief intervention, and finally tested again. Which principle is being demonstrated?
A test for promotion includes a cut-off score. Applicants who score below 75 fail. What element of the test does the 75 represent?
Celine is asked to administer a standardized test to 50 students, using an answer key and score sheet. She doesn’t interpret results. Which best describes her role?
A public high school uses a standardized exam to assess how well students have mastered Grade 10 science topics. The test contains fact-based and conceptual questions based on the curriculum. What type of test is this?
A college entrance exam measures how well students are likely to succeed in first-year college, focusing on reasoning and informal learning ability. What type of test is this?
An HR manager gives a test where applicants must answer as many questions as possible in 5 minutes. The questions are easy, but the time limit is strict. What kind of test is this?
A guidance counselor conducts a session where a client is free to talk about feelings with minimal interruption. The counselor avoids leading or judging. What interview style is being used?
In a test for bank tellers, applicants are asked to pretend a customer is angry about a transaction error. They must respond as if they were in that situation. What tool is this?
A forensic psychologist wants to review the background of a suspect. She uses school records, arrest reports, and psychiatric files. What data source is she relying on?
An artist is applying to a top design school. Instead of taking a written test, they submit a folder containing their best illustrations and photography. What assessment tool is used?
Which of the following is a test without right or wrong answers that measures typical ways of behaving, feeling, and thinking?
What kind of test is used to assess an individual’s general potential to reason, solve problems, and adapt—especially in unfamiliar situations?
A clinical psychologist administers a personality tool where ambiguous inkblot images are presented, and the client must describe what they see. This response is interpreted for unconscious themes. What is this?
A psychologist begins an assessment by identifying why the client was referred—whether it’s for school placement, therapy, or disability benefits. What is this step in the assessment process?
A researcher uses a computer algorithm that generates assessment results based on statistical rules and probabilities. What approach to interpretation is being used?
In a hiring assessment, a test predicts which candidates are likely to succeed based on patterns of past employee success. What concept is this related to?
While administering a test, the assessor notices the client continuously checks their phone despite instructions. This behavior isn't scored but may be significant. What is being observed?
Which assumption supports the idea that 'intelligence,' although not directly seen, can still be evaluated through observable behaviors and measurements?
An assessment involves gathering data through interviews, observation, and formal tests to build a psychological profile that includes a patient’s developmental history. What level of interpretation is this?
A test developer argues that a personality trait like introversion is relatively stable and can be used to explain behavioral patterns across situations. What assumption is being supported?
A large-scale civil service exam includes no interpretation of inner traits. The focus is only on scores and how they compare to others. What interpretive level is applied here?
An assessment uses multiple subtests to evaluate reasoning, memory, and verbal skills for a single purpose—determining learning disability eligibility. What is this called?
A guidance counselor understands that some results in a student's test may be influenced by test anxiety or poor sleep, and adjusts interpretation accordingly. Which assumption is being honored?
A psychologist administers a personality test to a group and repeats it two weeks later. The scores are almost identical. Which type of reliability is being demonstrated?
A student scores high on a logic test in a quiet room but scores much lower when retested during a thunderstorm. What source of error does this best illustrate?
Which of the following statements correctly reflects Classical Test Theory (CTT)?
When a test taker remembers specific items from a previous test administration, leading to improved scores, which effect is at play?
A test is made more reliable by increasing the number of well-written items. What principle supports this?
A psychological test consistently overestimates a test taker’s score by 5 points. What type of error is this?
Which of the following inflates reliability coefficients due to memory effects when test-retest intervals are too short?
What does the reliability coefficient represent?
Which of the following situations likely reflects a low reliability in a test?
In a newly developed test, some items are vague and unrelated to the construct being measured. What source of error variance does this indicate?
A researcher administers two versions of a math test (Form A and Form B) to a group of students one week apart and finds a strong correlation between scores. Which form of reliability is demonstrated?
Which situation BEST illustrates a threat to reliability due to item sampling error?
In a speed test where all items have equal difficulty, which internal consistency method is MOST appropriate?
Which is the MOST suitable method to assess the internal consistency of a test with Likert-type scales?
To avoid carryover effects when using two forms of a test, a teacher gives Form A to one group first and Form B to another group first. What technique is used?
A test given only once shows high correlation between odd-numbered and even-numbered item sets. Which reliability method is being used?
What is the main statistical tool used to estimate reliability coefficients in test-retest, split-half, and alternate-forms designs?
A psychologist uses Fleiss’ Kappa to determine the agreement among three interviewers classifying clients by diagnosis. What type of reliability is assessed?
Which of the following causes an inflated reliability coefficient in test-retest designs with a very short interval?
A test appears unreliable because the raters’ scores vary wildly despite the same observed behavior. What type of error is present?
An intelligence test was administered twice, six months apart. The correlation between the scores is used to assess which of the following?
Which BEST illustrates the practice effect in a test-retest design?
A test is said to have systematic error if:
Which of the following measures internal consistency by focusing on the average distance between item scores?
A psychometrician computes the variance of the differences between scores on the odd- and even-numbered halves, then divides it by the variance of the total scores. What reliability formula is this?
A test has low reliability. The developer plans to lengthen the test to meet the desired reliability level. Which formula helps estimate how many more items are needed?
An examinee scores higher than their true ability due to luck, such as guessing correctly. What kind of measurement error occurred?
Which BEST differentiates homogeneity from heterogeneity in psychological tests?
When do you use KR-20 instead of KR-21?
Which reliability estimate is MOST appropriate for evaluating inter-rater consistency with ordinal or nominal data from multiple observers?
A psychologist is evaluating a test for entry-level police applicants. The test seems to have a high degree of internal consistency and the questions cover all the required knowledge domains. However, applicants report that the test doesn't feel like it's relevant to the actual job. Which type of validity is not being satisfied here?
A test that includes unrelated content—like questions about weather patterns in a test of reading comprehension—suffers from:
What is most likely being assessed when a panel of experts rates test items based on how well they match learning objectives?
What type of validity evidence is shown when a new anxiety test positively correlates with the Beck Anxiety Inventory, a widely validated measure?
A researcher uses the method of contrasted groups to validate a depression inventory. Which result would BEST support construct validity?
In a validation study, a test designed to predict future academic success is found to explain additional variance in GPA beyond that explained by high school grades. This refers to:
A psychologist measures test-takers’ anxiety levels and compares them with GPA data collected six months later. Which type of validity is being evaluated?
Which validity threat occurs when a test does not cover important aspects of the construct it aims to assess?
A study uses random assignment to increase the control over variables in the experiment. What is being increased?
In a multitrait-multimethod matrix, high correlation between the same trait measured in different ways indicates:
A researcher uses statistical analysis to determine if certain items on a psychological inventory group together under common traits such as 'impulsivity' or 'risk-taking.' This method is best described as:
In confirmatory factor analysis (CFA), what is being tested?
Which of the following would most likely be evidence of bias in a psychological test?
In an employment rating scale, an evaluator consistently gives mid-point scores to all applicants, even if their performances vary widely. This rating error is known as:
A psychologist validates a test with a college sample, then attempts to re-validate it using working adults. The drop in the validity coefficient is known as:
Which of the following must be present before you can make a valid interpretation from a test?
A rater consistently gives very high scores regardless of an applicant’s performance. This is most specifically referred to as:
Which of the following would be least helpful in minimizing bias in test development?
What is the primary difference between co-validation and co-norming?
A test developer notices that test scores tend to correlate highly with one specific unrelated personality trait. What does this likely indicate?
An HR manager is evaluating whether adding a psychological test to the hiring process improves the quality of hires. She uses tables that estimate how the test will improve correct decisions over chance, based on base rate and selection ratio. Which method is she most likely using?
In applying the Angoff Method, which of the following limitations must be considered when setting fixed cut scores?
A company wants to reduce costs by using only one test instead of several, yet still select applicants with acceptable proficiency levels in both intelligence and personality. Which model best suits this?
A psychologist sets a cut score based on data gathered from two groups—one known to possess a clinical trait and one that does not. This method is referred to as:
A company has 20 applicants and 5 positions. What is the selection ratio?
Which statement about utility gain is MOST accurate?
An organization applies a multiple hurdle model in recruitment. Which of the following is true of this approach?
Which of the following is an appropriate use of expectancy tables in utility analysis?
What is a major limitation of Taylor-Russell Tables?
In a study, a researcher compares job performance of applicants who were hired (selected group) versus those who were not (unselected group). The difference in mean performance is used to assess test utility. This is an example of:
A test item in a national licensure exam yielded a discrimination index of 0.45. Which of the following is the most appropriate interpretation?
A test developer obtained an interrater reliability coefficient of 0.00 from two scorers. What does this imply about the scores they produced?
An item on a multiple-choice exam was correctly answered by 90% of students. What is the difficulty index (p-value), and what does this imply about the item?
A researcher reports a test's reliability coefficient as 0.63 in a clinical setting. What conclusion is most appropriate?
Which of the following item discrimination indices would suggest the best-performing item in a norm-referenced test?
What does a p-value of 0.045 indicate in hypothesis testing of test item performance?
A test has a validity coefficient of 0.80 and reliability coefficient of 0.90. Which statement is most accurate?
The inter-item reliability index of a newly constructed scale is 0.85. What can you infer from this result?
Which of the following would most likely increase the reliability coefficient of a psychological test?
In a test validation study, the validity coefficient dropped from 0.78 to 0.51 when applied to a new group. What concept best explains this?
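Several items above involve simple computations: the selection ratio, the item difficulty index, and the effect of test length on reliability. The Python sketch below works through that arithmetic as a study aid. It is not part of the original quiz. The function names are hypothetical, and the lengthening functions assume the standard Spearman-Brown prophecy formula with added items that are parallel to the existing ones.

```python
def selection_ratio(openings: int, applicants: int) -> float:
    """Proportion of applicants who can be hired: openings / applicants."""
    return openings / applicants


def item_difficulty(num_correct: int, num_examinees: int) -> float:
    """Item difficulty index p: proportion of examinees answering correctly.

    Higher p means an easier item.
    """
    return num_correct / num_examinees


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing test length by length_factor.

    Assumes the Spearman-Brown prophecy formula:
        r_new = n * r / (1 + (n - 1) * r)
    where n is the ratio of new length to old length.
    """
    n, r = length_factor, reliability
    return n * r / (1 + (n - 1) * r)


def length_factor_needed(current: float, desired: float) -> float:
    """Spearman-Brown solved for n: how many times longer the test must be."""
    return desired * (1 - current) / (current * (1 - desired))


if __name__ == "__main__":
    # 5 positions for 20 applicants
    print(selection_ratio(5, 20))
    # 90 of 100 students answered the item correctly
    print(item_difficulty(90, 100))
    # Doubling a test whose current reliability is 0.50
    print(round(spearman_brown(0.50, 2), 2))
    # Lengthening needed to raise reliability from 0.60 to 0.80
    print(round(length_factor_needed(0.60, 0.80), 2))
```

The length factor is a multiplier on the current number of items, so the number of items to add is the current item count times (factor - 1).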