In response to our commentary on the OPAT (Occupational Physical Assessment Test), we’ve received a lot of interesting comments, including some that wanted more technical detail than we provided.
The purpose of this post is to provide you with the primary documents that help you understand this test. Now, these documents exist on two planes: one is the bare words on the page, which are written to give an impression of scientific objectivity and leadership impartiality; the other is the subtext, for these documents are all written with a view to supporting the political Party Line on combat fitness, a Party Line that we think everyone understands.
Instruction on how to administer the OPAT is here [.pdf]. It does not answer the question some had about what intervals are used on the beep test, as the beep test audio is a computer file provided separately. Here is another set of instructions [.pdf] with illustrations and some differences: for example, the first forbids use of d-handles on a trap bar, while the second shows a soldier using the d-handles.
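Since the instructions don’t give the intervals, here is a sketch of the commonly published 20 m multi-stage fitness test progression: level 1 at roughly 8.5 km/h, with each level 0.5 km/h faster. Whether the OPAT’s audio file uses exactly these speeds is an assumption on our part, not something the documents confirm:

```python
# Commonly published multi-stage ("beep") fitness test progression.
# ASSUMPTION: level 1 at 8.5 km/h, +0.5 km/h per level; the actual
# OPAT audio file may differ, as its intervals are not in the PDFs.

SHUTTLE_M = 20.0          # one shuttle length (the "66 feet" of press accounts)

def shuttle_seconds(level, start_kmh=8.5, step_kmh=0.5):
    """Seconds allowed per 20 m shuttle at a given level."""
    speed_ms = (start_kmh + step_kmh * (level - 1)) * 1000 / 3600
    return SHUTTLE_M / speed_ms

for level in (1, 5, 10):
    print(f"level {level:2d}: {shuttle_seconds(level):.2f} s per shuttle")
    # level  1: 8.47 s; level  5: 6.86 s; level 10: 5.54 s
```

In other words, a runner gets about 8.5 seconds per length at the start, tightening to about 5.5 seconds by level 10; the level where he can no longer make the line before the beep is his score.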
This next document suffers from the antimnemonic and counterinformational name USARIEM Technical Report T16-2 [.pdf]. The subtitle, however, does express its intent: Development of the Occupational Physical Assessment Test (OPAT) for Combat Arms Soldiers. This document purports to explain how the OPAT was developed, although at the time it was written (October 2015) the test was intended only to classify soldiers for Combat Arms specialties, specifically Combat Engineers (12B), Field Artillery (13B, 13F), Infantry (11B, 11C), and Armor (19D, 19K). If you are Army, you will note that these are all enlisted MOSes, and so have no bearing on the You Go Girl! careerist officer contingent behind all this social engineering, but that’s neither here nor there. In most of those fields and in most activity domains (although not, perhaps, endurance, which this test can’t measure), the physical demands on the enlisted soldier are greater than those on his officers.
Setting up a bifurcation in the physical duties of the infantry leader and his subordinates leads one toward the South American model, where privates dig the officer’s foxhole and carry the officer’s rucksack. This division of labor is why the world quakes before the military prowess of Bolivia, for instance. So the Army is likely to pay at least lip service to the idea of making officers and enlisted soldiers meet similar standards.
That the Social Justice Warriors are behind this document is very clear:
A number of studies have shown, however, that [the Army Physical Fitness Test] score is not highly correlated with the performance of the physically demanding tasks performed by Soldiers. Furthermore, the APFT score includes adjustments for age and sex, not only biasing for/against certain groups, but making it potentially legally indefensible if used as a screening tool for entrance into certain MOSs. Since it is not practical to test Soldiers performance of physically demanding tasks prior to entering an MOS, criterion-based physical performance tests (i.e., tests that are predictive of soldiering task performance) are essential if the Army wishes to establish valid standards to select Soldiers for an MOS.
Translated into plain English, that means: the APFT doesn’t measure anything except the ability to do push-ups, sit-ups, and a 2-mile run; a test score can’t be used as an MOS cutoff anyway if it’s age-normed and sex-normed; and the Army can’t measure, say, the ability of a would-be cannon cocker to pick up cannon shells, so it needs some test that will predict whether Soldier X has this capability before letting him strike for a cannon-cocker job. (They just assume they can’t measure something like shell-handling capability… and then, later in the paper, they use tests like that to “validate” their proposed OPAT!) So they conclude that some stylized, abstract, and formalized test will measure Soldier X’s readiness to pick up artillery shells better than pointing him at an artillery shell and saying, “Mongo, lift!”
In depth job analysis revealed that five of the seven MOSs (11B, 11C, 12B, 13F, 19D) had similar critical physically demanding tasks, while two MOSs (13B and 19K) had additional or different tasks with heavy physical demands. In order to reduce costs, simplify and streamline testing, additional analyses were run to determine if a common battery of physical performance tests could be used for all seven MOSs without a large loss in the predictive capability.
That’s the source of the next assumption: they selected 23 physically demanding tasks in all, and if someone could do those, then he could probably do everything the MOS required.
Remember why they’re doing this (emphasis ours):
[T]hree courses of action for gender neutral Occupational Physical Assessment Tests (OPATs) were developed for seven combat MOSs.
They did not follow through and see whether their selectees could then actually perform the combat arms job, and this was done entirely by lab boffins without any visible input from people who have actually done combat arms jobs, let alone done them in combat, of which there hasn’t exactly been a shortage for the last decade and a half. But they did compare how performance on three preselected batteries of tests related to performance of the 23 tasks that they decided, based on zero experience, were the edge conditions of combat arms service.
When the tests were chosen, the standards were initially set by the proponencies (i.e., the schools that train soldiers in those specialties). However, the standards were then reset, lower, by the Natick boffins, based on the performance of actual soldiers. If, say, 90% of the soldiers they tested in MOS 11B could not complete some 11B task, the standard was reset at whatever level it took to get a 90% pass rate. Thus, the claim that this test is based on the needs of the MOS is only true if you accept that the boffins are the best judge of what the MOS needs, and that the performance of a set of soldiers should take primacy over the tasks they need to do. This is one of many examples where the development of the test was biased towards the command’s desired lower standards.
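The resetting procedure amounts to sliding the cutoff down to a percentile of observed soldier performance. A minimal sketch of that logic, with invented numbers and a hypothetical `reset_standard` helper (higher score = better):

```python
# Sketch of the standard-resetting logic described above: given observed
# soldier scores on a task and a target pass rate, the cutoff moves to
# whatever score the target fraction of soldiers can meet.
# All numbers and names here are invented for illustration.

def reset_standard(scores, target_pass_rate=0.90):
    """Return the highest cutoff that at least target_pass_rate of scores meet."""
    ranked = sorted(scores, reverse=True)        # best performers first
    k = int(len(ranked) * target_pass_rate)      # how many must pass
    return ranked[k - 1]                         # the k-th best score becomes the bar

school_standard = 70                             # hypothetical proponent standard
observed = [55, 62, 64, 66, 68, 71, 73, 75, 78, 90]
print(reset_standard(observed, 0.90))            # 62: nine of ten meet it, so the bar drops from 70
```

Note the asymmetry the post describes: the procedure only moves the bar to fit the tested population; a population that outperformed the school standard would not move it up.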
(It’s not unusual for a soldier who’s no good at one physical activity to be accepted by his peers on the basis of other performance strengths, a classic example being the strongman who’s a poor runner. But by lowering the standards in this manner, the boffins ensured that the test is pegged to everybody’s weakest event, so that, for instance, a weak man who’s also a poor runner will be accepted, because the criterion is set by the weak in every event.)
While the boffins lowered standards, they did not raise any if the soldiers of the MOS outperformed the school standard. This is another example of the bias towards lower standards. There are many more such examples.
The three test batteries were:
1. medicine ball put, squat lift, beep test, standing long jump, arm ergometer
2. medicine ball put, squat lift, beep test, standing long jump
3. standing long jump, 1-minute push-ups, 1-minute sit-ups, 300 m sprint, Illinois agility test
The most predictive of performance on the 23 tasks was Test 1, which edged out Test 2 on what was apparently the sole criterion of evaluation, the R² measure of predictive value (R² has its own issues); Test 3 was significantly less predictive of performance on the 23 tasks (R² as low as 0.58). Tests 1 and 2 both reached R² values of roughly 0.80 to 0.90, according to the paper, and Test 2 was cheaper to administer, so the Army chose Test 2.
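For readers who want the statistic spelled out: R² is the fraction of variance in task performance explained by a linear fit on the battery score, so 0.58 leaves a lot unexplained while 0.80 to 0.90 leaves much less. A toy computation with invented numbers, not the study’s data:

```python
# Toy illustration of R² for a simple linear fit: the fraction of variance
# in task performance explained by a predictor battery score.
# Data invented for illustration; not from USARIEM T16-2.

def r_squared(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)    # for simple linear regression, R² = r²

battery = [50, 60, 70, 80, 90]       # hypothetical battery scores
task    = [48, 63, 68, 82, 88]       # hypothetical task performance
print(round(r_squared(battery, task), 3))    # 0.975
```

An R² of 0.975 would mean the battery explains almost all the spread in task scores; at 0.58, more than 40% of the spread is something the battery doesn’t capture.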
They then went to validate the test against the criteria developed, originally, by the proponencies, with the standards lowered by the boffins as described above. However, they decided that some of the tasks were just too much trouble:
Some of the tasks were not collected due to either a large skill component (as the hand grenade throw) or the duplication of the physical demands with another task (multiple foot marches).
You might be excused for suspecting that this was just one more in the many examples of bias towards lower fitness standards that permeates this entire project.
They compared their approach to the pre-recruiting tests used by many foreign nations. They are dismissive about some foreign tests:
Predictor tests range from those highly faithful to the original task, such as the weight load march and jerry can carry of the Australians….
They certainly didn’t like that antipodean idea. They never considered anything like it.
That’s how the OPAT took shape, and now we see the result.
Even as the Army prepares to send the physically feeble to Combat Arms, recruiters have a feebler and feebler cohort of young civilians to choose from, according to the very same “beep test” used in the OPAT:
America’s kids ranked 47 out of 50 countries measuring aerobic fitness — a key factor for overall health — in a study published in the British Journal of Sports Medicine. By comparison, Tanzania, Iceland, Estonia, Norway and Japan raced away with the top five slots. The least fit country: Mexico.
Gee, might some degree of the American decline be explained by the gradual replacement of Americans by those unfit Mexicans? (You’d suck at aerobic activity, too, if you had to breathe Mexico City air. The city’s in a bowl; it’s like LA without movie-star sightings and other vestiges of the dying SoCal culture.)
Research teams from the Children’s Hospital of Eastern Ontario and the University of North Dakota analyzed data on more than 1.1 million kids aged 9 to 17. Subjects were evaluated using a multi-stage fitness test also known as the “beep” test. How it works: You run back and forth between two points 66 feet apart to synchronized beeps. The point where you can’t reach the line before the beep, that’s your level.
…and that luckless recruiter just knows that the few superior examples are probably going to choose the less-Social Justice Warrior infected Marine Corps, or the “if you’re not going to be in a fighting service, you might as well take it easy” Air Force.