1. Introduction
The human factors of weather forecasting have become a concern in recent decades, spurred by rapid advances in workstation and display technology (e.g., Doswell 2004; Hoffman 1991). There have also been more studies of the value added by the human forecaster relative to advances in computer modeling capabilities, reflected in efforts to improve methods of evaluating forecast skill (e.g., Bosart 2003; Roebber 1998; Stephenson 2000). Also closely related to human factors considerations, recent decades have seen a burgeoning of interest among educators, psychologists, sociologists, and others in the nature of expertise (e.g., Ericsson et al. 2006; Feltovich et al. 1997), including an interest in expertise at weather forecasting (Roth 2004; Scott et al. 2005).
One of the generally accepted operational definitions of expertise is consistently superior performance relative to individuals with less experience. Beyond performance measurement, research in the emerging field of expertise studies has revealed the key features of expert learning and cognition that distinguish novices, apprentices, journeymen, and experts in terms of the ways that their practice is an “art” or “craft” (Glaser 1987; Ericsson et al. 2006; Hoffman 1998). Key features include logical reasoning on the basis of systematically derived techniques, the formation of conceptual or mental models, the ability of experts to perceive meaningful patterns that nonexperts cannot perceive, the ability to cope with rare or difficult weather situations, the ability to quickly recognize situations and adapt on the fly, and the role of motivated, deliberate practice. One of the most salient defining features of expertise, in all studied domains, is the extent of experts' knowledge of concepts and the organization of that knowledge. It is this aspect of expertise that is the focus of the work reported in this note.1
As in other “complex sociotechnical contexts” (Hoffman and Woods 2000; Vicente 1999), weather forecasting on the part of professionals in organizations such as the National Oceanic and Atmospheric Administration (NOAA), the military, and private sector forecasting services involves collaboration, and a reliance on information technologies to display and analyze multiple data types. Practitioners attempt to predict dynamic, complex events, often under conditions of time pressure, data overload, high stakes, and high risk. The field of practice is always a “moving target” insofar as changes in technology and understanding mandate continual learning on the part of practitioners (Ballas 2006).
In many such work contexts, knowledge and skill have become widely recognized as an increasingly important asset (Klein 1992). Importance comes from the fact that expertise is a “must” for proficient performance, especially for coping with rare, difficult, and ill-defined problems. Increasing importance comes from recognition that many of the most knowledgeable personnel are nearing retirement. Numerous examples can be given of private sector and government organizations that have discovered—either the hard way or too late—that knowledge was a corporate asset (Brooking 1999; Hoffman and Hanes 2003; McGraw and Seale 1988; O'Dell and Grayson 1998).
The System To Organize Representations in Meteorology-Local Knowledge (STORM-LK) was created to illustrate a method for eliciting and representing knowledge about weather in the Gulf Coast region.
2. The research participants
Participants in the project included 3 civilian forecasters, 17 aerographers (including both petty officers and enlisted personnel) who had qualified as forecasters, and 2 aerographers who had qualified as observers. The participants all worked at the Naval Training Meteorology and Oceanography Command Facility (NAVTRAMETOCFAC) at the U.S. Naval Air Station in Pensacola, Florida (NASP).
A first step in the research was to attempt to evaluate the experience, knowledge, and forecasting ability of each of the participants. One purpose of this activity was to identify forecasters for participation in the knowledge elicitation procedure. Proficiency levels of the participants were estimated by career interviews, an evaluation of forecaster performance, and a study of reasoning styles.
a. Career interviews
In the career interviews, participants were guided through a thorough discussion of their education, training, and professional forecasting experiences. Each career was reviewed in successive waves, going into greater detail with each wave. We discussed mentoring opportunities as well as collateral duty assignments (which would mean time not spent at forecasting-related tasks). The final result included an analysis of breadth and depth of experience (e.g., number of duty assignments in different climates, experience on ship as well as on land, opportunities to be mentored, etc.).
Our evaluation of experience included an attempt to estimate the amount of time each participant had spent at forecasting and forecasting-related activities. From the literature on the psychology of expertise comes the rule of thumb that it takes a minimum of 10 000–14 000 h (or about 10 yr) of practice and experience to achieve expertise (Chase and Simon 1973; Ericsson et al. 2006). Practical difficulties always arise in any attempt to estimate how much practice an individual has had, for any domain of professional activity. But for some domains it is possible to generate reasonable estimates. Our effort was facilitated by the fact that METOC forecasters and aerographers work in duty shifts, and on each shift have to conduct a specified set of forecasting tasks.
For each participant we generated a listing of training and professional experiences, and for each experience we developed estimates of the months, days, and hours dedicated to forecasting-related activities. In some cases, averages were entered into the calculations (e.g., mostly 8-h watches, but some 12-h watches when at sea, for, say, an average of 10 h per watch while at sea). An example set of excerpts from the career interviews is presented in Table 1.
We found that the civilian forecasters, whom we assumed from the outset would be experts, had between 31 000 and 55 000 h of forecasting experience, and individuals whom we tentatively identified as journeymen (primarily naval officers) had between 11 000 and 29 000 h. Understanding that these are rough estimates, we might apply what is referred to as a Gilbreth correction factor, named after industrialist and efficiency expert Frank Gilbreth (1911). In his time-and-motion studies of industrial work (e.g., manufacturing assembly jobs), he observed that workers often spend only about half of their time actually conducting their job tasks. For the present data, this highly conservative adjustment would still place the civilian forecasters, and a few of the naval officers, well above the 10 000-h benchmark.
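To make the arithmetic concrete, the following minimal sketch (in Python) tallies hours from a few hypothetical career entries and applies the Gilbreth correction; all of the entry values are invented for illustration and are not drawn from our participants' data.

```python
# A minimal sketch (not our actual data) of the hours-estimation arithmetic.
# Each hypothetical career entry: (months on assignment, watches per month,
# hours per watch, fraction of watch time spent on forecasting-related tasks).
career_entries = [
    (24, 20, 8, 1.0),   # e.g., two years of 8-h watches ashore
    (12, 22, 10, 0.8),  # e.g., a year at sea, averaging 10-h watches
]

raw_hours = sum(months * watches * hours * fraction
                for months, watches, hours, fraction in career_entries)

GILBRETH_CORRECTION = 0.5    # assume only half of time at work is on-task
corrected_hours = raw_hours * GILBRETH_CORRECTION

EXPERTISE_BENCHMARK_H = 10_000   # the 10 000-h rule of thumb
print(f"raw: {raw_hours:.0f} h, corrected: {corrected_hours:.0f} h, "
      f"meets benchmark: {corrected_hours >= EXPERTISE_BENCHMARK_H}")
```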
One would expect that individuals with less experience (e.g., only a few entries in their career interview table) would be less likely to have achieved expertise, whereas those with more row entries (e.g., 15 or more) would be older individuals having much more experience (and more opportunities to have had different kinds of experiences) and, hence, would be more likely to have achieved expertise. However, experience per se is not sufficient for the achievement of expertise in any domain, including forecasting (Pliske et al. 2004). To demonstrate how one might converge on a determination, we also looked at performance data that were available to us.
b. Performance evaluation
Forecasting performance was evaluated using the U.S. Navy system, which calculates the percentage of forecasts and forecast amendments (for a set of parameters including wind speed, wind direction, visibility, precipitation, etc.) that verified according to each parameter's correctness range. Terminal Aerodrome Forecast (TAF) data were available for 8 of our 22 participants, spanning 1995–99. (Some of the participants were qualified forecasters but were not producing forecasts in this time period because they had other duty assignments.) For each participant, we created a spreadsheet listing the number of forecasts and forecast amendments issued in each month. Next, the monthly percentage of all verified forecasts and amendments was calculated for each forecaster. We then calculated each forecaster's average monthly percentage, along with the range and standard deviation of the monthly percentages. Last, we calculated separate averages for the four participants who had been tentatively designated as experts in the career interviews and for the four who had been tentatively designated as journeymen.
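The bookkeeping reduces to a few lines of computation. The sketch below illustrates it with hypothetical monthly counts; the numbers are placeholders, not our participants' verification data.

```python
import statistics

# Hypothetical monthly records for one forecaster: (n_issued, n_verified),
# where n_issued counts forecasts plus amendments issued that month.
monthly = [(120, 101), (98, 84), (110, 95), (105, 86)]

# Monthly percentage of forecasts and amendments that verified.
pcts = [100.0 * verified / issued for issued, verified in monthly]

print(f"average monthly pct: {statistics.mean(pcts):.1f}")
print(f"std dev: {statistics.stdev(pcts):.1f}")
print(f"range: {min(pcts):.1f}-{max(pcts):.1f}")
```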
Performance percentages averaged 85 (range of 82.4–87.1) for the experts and 81 (range of 78.1–83.8) for the journeymen. Although the sample size meant that any statistical test we might bring to bear would have relatively low power, the difference was in the expected direction. A more important caution is the fact that the forecast verification data that were available to us reflect forecaster performance only in a limited sense. There is a built-in amendment procedure that would necessarily increase the percentages of forecasts that verify. Furthermore, the verification data do not gauge the value added by the human forecaster in the way that skill scores can.
However, the lowest forecast verification scores were those of individuals having less local experience, as determined in the career interviews. Furthermore, we also looked at thunderstorm season forecasting (May–June), which was generally regarded as being difficult and requiring skill on the part of the human forecaster, over and above that involved in forecasting on the basis of persistence or on the basis of computer model outputs. The averages were 83.4 (range of 73.64–98.48) for the individuals whom we had tentatively designated as experts versus 80.3 (range of 52.27–83.93) for the individuals whom we had tentatively designated as journeymen.
c. Reasoning styles
The third means of converging on a proficiency scale involved the use of the cognitive modeling procedure (CMP; R. R. Hoffman et al. 2000, unpublished manuscript). In the first step of this two-step procedure, the participants describe their reasoning by creating a flow diagram–like model of their strategies and procedures. In step 2, conducted a week or more later, the forecaster is observed from the moment they arrive for their period of watch. It is possible to behaviorally validate some of the claims expressed in the reasoning diagram. For instance, a forecaster might have said in step 1 that they observed the sky as they walked across the parking lot, that the first thing they do on arrival at the facility is inspect the satellite loop to get the “big picture,” or that after examining satellite and radar data they inspect the model of the day. Other claims in the reasoning model could not be behaviorally validated (e.g., hypothesis testing based on the forecaster's mental model of the weather situation), but these could be explored through the judicious use of probe questions (e.g., What are you doing/thinking now?). The CMP results in validated high-level models of reasoning, generated with less time and effort than the procedure cognitive scientists most commonly use to study problem solving, the “think aloud” method (Ericsson and Simon 1984; Hoffman et al. 2000, 2002).
The CMP results with the METOC participants indicated that the individuals whom we had tentatively identified as apprentices and journeymen were more likely to assert that they produce forecasts by relying on persistence and the “model of the day.” The individuals whom we had tentatively identified as experts were more likely to describe how they engage in reflective thinking, approaching forecasting as a form of exploratory hypothesis testing. For instance, they were more likely to describe how they take model biases and seasonal factors into account, and were more likely to vary the order in which the outputs of various computer models are examined as a function of the season and the weather situation of the day.
We concluded that all of those individuals whom we had tentatively identified as experts and as journeymen were proficient, at least in the sense in which the term “journeyman” was used in the traditional craft guilds; that is, they could produce competent forecasts without supervision. We also concluded with some confidence that the civilian forecasters whom we had tentatively identified as experts would qualify as experts, as this term is operationally defined in the literature of expertise studies.
3. Procedure
a. Preparation
Prior to conducting the knowledge elicitation procedures, the researchers had to familiarize themselves with the facility, organization, and forecasting procedural guides, even though the researchers were already conversant in the domain of weather forecasting (Hoffman 1991; Hoffman and Conway 1990; Hoffman et al. 1993). This familiarization process included the following steps:
In the work space analysis step, the researchers and a qualified aerographer visited each workstation/work area and discussed the activities conducted, the resources needed, communication and collaboration patterns, and so on.
During the observations of weather briefings step, we witnessed and audio recorded a total of nine briefings that included briefings to pilots, briefings to pilot trainers, and briefings internal to the METOC forecasting staff. The transcripts were analyzed for propositional content concerning weather concepts and phenomena.
For the documentation analysis step, two of the researchers and each of three aerographers (who qualified as forecast duty officers or subregional forecast officers) reviewed the facility's standard operations procedures (SOP) documents. The facility had 58 SOP documents, and the interviews went into the greatest detail concerning those that were most pertinent to forecasting: for example, the procedure for selecting and displaying products on the Satellite, Alphanumeric, NEXRAD and Difax workstation display; the procedure for adding Significant Meteorological Information [from National Weather Service (NWS) products] to the home page weather chart; and the procedure for issuing forecasts for Pascagoula Bay. For each SOP, the participants indicated which designated officer conducted each procedure, when, and why, and for each procedure the aspects that made it easy and the aspects that made it difficult.
Having identified the individuals designated as proficient (i.e., experts and journeymen), and having familiarized themselves with the organization and its standard operating procedures, the researchers were prepared to conduct the knowledge elicitation procedure, which relied on a form of diagramming.
b. About concept mapping
The task of creating a diagram can facilitate the learning of concepts and concept relations, and diagrams created by learners can be used to evaluate their knowledge (Novak 1998). In the field of human factors engineering, diagramming has proven useful as a procedure whereby domain practitioners describe their knowledge and reasoning (Cooke and McDonald 1987; Gordon et al. 1993).
The literature on diagrammatic reasoning (research in education, cognitive science, computer science, and geography) includes reports on studies of how people understand a great many types of diagrams, including topographic maps, matrices, schematic diagrams of machines, and semantic networks. [A comprehensive review appears in Vekiri (2002); see also Ausubel (1960), Glasgow et al. (1995), and Mandl and Levin (1989).] Good diagrams are effective because they “externalize” cognition, and they guide, constrain, and facilitate cognition by supporting inference making. They have mnemonic value, and they reduce cognitive demands by enabling information integration at a glance (as opposed to overloading working memory), shifting some of the burden of text processing onto the visual perception system. Diagrams that work well are ones that rely on proximity: the spatial organization or connection of information units induces people to see the units as related, and makes them likely to attempt to draw inferences about the relation.
These are all features of a type of diagram called a concept map (Novak 1998). Concept maps are meaningful diagrams that include concepts (enclosed in boxes) and relationships between concepts (indicated by labeled connections between the concepts); a pair of linked concepts, together with the linking phrase, expresses a proposition. Concept mapping has its foundations in the theory of meaningful learning (Ausubel et al. 1978) and a background of decades of research and application, primarily in education (Novak 1998). Concept maps are being used by groups as disparate as schoolchildren throughout South America, astrobiologists at the National Aeronautics and Space Administration, curriculum designers in the U.S. Navy, university professors preparing distance-learning-based courses, trainers in the electric power utility industry, and businesses where focus groups create concept maps for brainstorming (Briggs et al. 2004; Cañas et al. 2003; Gaines and Shaw 1995; Hanes and Gross 2002). Concept maps have been used as knowledge representations in cognitive science (Dorsey et al. 1999; Gordon et al. 1993). Concept maps made by domain experts can be used to show agreements and disagreements (see Gordon 2000). Furthermore, concept maps have been used as the basis for the explanation component of knowledge-based systems and performance support systems (Cañas et al. 2003; Coffey et al. 2003; Dodson 1989; Ford et al. 1996; McNeese et al. 1993; Sutcliffe 1985).
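As a concrete illustration of this structure, a concept map can be represented as a set of concept–link–concept triples, each expressing one proposition. The following is a minimal sketch with hypothetical Gulf Coast content; it is not an excerpt from STORM-LK.

```python
from typing import NamedTuple

class Proposition(NamedTuple):
    """One proposition: two concepts joined by a labeled linking phrase."""
    concept: str
    linking_phrase: str
    related_concept: str

# Hypothetical content, for illustration only.
cmap = [
    Proposition("Sea breeze", "can trigger", "Thunderstorms"),
    Proposition("Thunderstorms", "are most frequent in", "Summer regime"),
]

# The boxes of the diagram are the union of all concepts; the labeled
# connections between boxes carry the linking phrases.
concepts = {p.concept for p in cmap} | {p.related_concept for p in cmap}
for p in cmap:
    print(f"{p.concept} --[{p.linking_phrase}]--> {p.related_concept}")
```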
Reviews of the literature on concept mapping, discussion of methods for making concept maps, and discussion of the differences between concept maps and other types of meaningful diagrams can be found in Cañas et al. (2003, 2004), Coffey and Hoffman (2003), and Crandall et al. (2006). Although concept mapping has the variety of applications we have detailed here, it is by no means a tool for all purposes. Its primary strength lies in the creation and representation of knowledge about domain concepts. That is the purpose to which it was applied in the present project.
c. Knowledge elicitation procedure
The knowledge elicitation interviews were supported by the use of CmapTools, a software suite created at the Florida Institute for Human and Machine Cognition (free download available online at http://cmap.ihmc.us/download/). CmapTools has a simple interface that guides the user in the creation of concept maps using simple point-and-click and drag-and-drop operations.2
In the knowledge elicitation interviews, one researcher stood at the projection screen and served as the facilitator while another worked at a laptop computer, creating the concept map, which was projected onto the screen. Referring to the projected concept map as it developed, the facilitator helped the forecaster build up a representation of domain knowledge of weather concepts by suggesting alternative phrasings for concepts and propositions. The facilitator avoided imposing ideas or word choices.
Participants in the concept mapping sessions were the eight individuals (out of our pool of 22) who were qualified to produce forecasts. We did not engage the less experienced aerographers or any of the observers in the concept mapping sessions. The group of eight included four individuals who had been designated as experts on the basis of the career interview and analysis of performance data—two of the three civilian forecasters and the command master chief petty officer. The group of eight also included chief petty officers and petty officers who had been designated as journeymen on the basis of the career interviews. As a group, the eight had been on station at NASP for a range of 1–6 yr, averaging about 3 yr. Three had been on station for less than 3 yr.
A majority of the concept maps that were eventually included in the final STORM-LK knowledge model had been created and/or refined by four of the participants, three of whom had been designated as journeymen and one of whom had been designated as an expert. One was a chief petty officer, one was the command master chief petty officer, one was the command duty officer, and one was the commanding officer. Although the individual designated as an expert (the command master chief) had only 1 yr on station at NASP, he had over 14 yr of experience forecasting tropical weather. Most of these concept maps were drafted in elicitation sessions that involved two participants, one designated as an expert and one designated as a journeyman. On the basis of the career interview we had designated the commanding officer as a journeyman. He had authored the Local Forecasting Handbook used by the U.S. Navy in the Pacific, and his main role was to assist in the process of finalizing and approving the concept maps.
The initial concept mapping sessions aimed at breadth rather than depth, resulting in concept maps on a total of 154 topics in weather forecasting. For example, we created a number of concept maps about the Next-Generation Doppler Radar (NEXRAD) and about numerical models. Either of those could have served as the topic for a knowledge model. Instead, we focused on capturing expertise at forecasting weather phenomena that are important in the Gulf Coast region, including regional seasonal tendencies, fog, turbulence, tornadoes, thunderstorms, and hurricanes.
4. The resulting knowledge model
It took about 1.75 h to create, refine, and validate each of the two-dozen concept maps that eventually formed the core of the STORM-LK knowledge model. The concept maps contained an average of 46 propositions, which experience has shown is an appropriate level of detail for individual concept maps as viewed on the typical computer monitor. An example is presented in Fig. 1. This is the “top map,” the concept map that one sees when entering STORM-LK. Nodes refer to the main concepts that are elaborated upon in other concept maps in the knowledge model.
The CmapTools software indicates hyperlinks by the small icons underneath concept nodes. Below some of the concept nodes in Fig. 1 are small icons that look like concept maps. Clicking on such an icon takes one to the concept map indicated by the node. From the top map one can navigate among all of the other concept maps in the knowledge model, by clicking on the resource icons. For example, clicking on the icon under the Thunderstorms node in the Fig. 1 concept map takes the user to the concept map that appears in Fig. 2. From the top node in every concept map one can return to the top map. Any concept in a given concept map that also appears in another concept map is hyperlinked to that concept map.
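One consequence of this hub-and-spoke hyperlink topology (every concept map links back to the top map, and the top map links out to every other map) is that any map is reachable from any other in at most two clicks, a property noted again in section 6. The sketch below checks that property with a breadth-first search; the map names are hypothetical stand-ins for the STORM-LK topics.

```python
from collections import deque

# Hypothetical map names standing in for the STORM-LK topics.
maps = ["Top", "Thunderstorms", "Fog", "Hurricanes", "Turbulence"]

# Every non-top map links back to the top map; the top map links to all others.
links = {m: {"Top"} for m in maps if m != "Top"}
links["Top"] = set(maps) - {"Top"}

def clicks(src: str, dst: str) -> int:
    """Fewest hyperlink clicks from map src to map dst (breadth-first search)."""
    frontier, seen = deque([(src, 0)]), {src}
    while frontier:
        node, depth = frontier.popleft()
        if node == dst:
            return depth
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return -1  # unreachable

# Any map reaches any other in at most two clicks (via the top map).
assert max(clicks(a, b) for a in maps for b in maps) == 2
```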
Not only can such hyperlinks be used to stitch a group of concept maps together into a navigable knowledge model, but they can also be used to link in digital resources such as text documents, images, video clips, and URLs. For example, nodes in the STORM-LK concept map about “Forecasting Tools” link to NOAA Web sites and products, NEXRAD Web sites, NWS Web sites, and the Aviation Digital Data Display service. Resources added into the STORM-LK knowledge model included diagrams and text pieces taken from the Local Forecasting Handbook (LFH). Indeed, STORM-LK includes all of the information from the NASP LFH. Finally, we also resourced STORM-LK with extended case studies in which the experts described previously encountered difficult examples of each of the most important regional weather phenomena (e.g., fog, thunderstorms, tornadoes, hurricanes).
Figure 3 is a screen shot showing some of the resources—satellite imagery, computer model forecasts, and digital video in which the domain expert provides explanatory statements for concepts. All of the concept maps in STORM-LK can be browsed online (http://www.ihmc.us/research/projects/StormLK/).
5. Evaluation
a. Validation of the knowledge model
To demonstrate one possible procedure for validating the content of the knowledge model, we solicited the assistance of a retired U.S. Navy chief petty officer who had served at the NAVTRAMETOCFAC at NASP for about 4 yr and had considerable breadth of experience. He was a junior expert on our proficiency scale in terms of hours of experience (about 30 000). He went over each of the concept maps, commenting on each proposition and suggesting changes. Validation took about 7 min per concept map, on average.
The results conformed to our experience with validation procedures: about 10% of the propositions were modified. (Because concept maps include cross links and branchings, a change in any one concept node or any one linking relation can entail changes in more than one proposition.) Some of the changes involved important subtleties, for example, “X causes Y” versus “X facilitates Y.” Some of the changes seemed like wordsmithing. For example, the proposition “Dryline which acts like a frontal slope” was changed to “Dryline acts like a frontal slope” (although the change in the scope of the qualification might be regarded as both subtle and important). The main point is that we found little in the way of outright disagreement (i.e., statements to the effect that “This proposition is wrong”).
An analysis of the correlations between the numbers of concepts (and numbers of propositions) in the concept maps and the numbers of changes made in concepts, linking relations, and propositions showed there was no apparent bias to make more changes in larger concept maps or fewer changes in smaller concept maps. Just because a concept map was simple, that did not mean it was immune from refinement, and conversely, just because a concept map was complex that did not mean that it was inherently more likely to need refinement.
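The bias check amounts to correlating concept map size with the number of revisions. Here is a minimal sketch using invented counts rather than our actual tallies (note that statistics.correlation requires Python 3.10 or later).

```python
import statistics

# Invented counts, for illustration: the size of each concept map and the
# number of changes the evaluator made to it.
n_propositions = [46, 30, 58, 41, 52, 35]
n_changes = [4, 4, 5, 4, 4, 5]

# Pearson's r (Python 3.10+); a small |r| suggests the evaluator was not
# simply making more changes to the larger concept maps.
r = statistics.correlation(n_propositions, n_changes)
print(f"r = {r:.2f}")
```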
The judgments of a single evaluator should not be considered as necessarily final or definitive, of course, but a validation procedure such as the one outlined here should certainly be conducted for all knowledge models.
b. Efficiency of the procedure
The STORM-LK project was methodological in focus, with the primary aim of demonstrating a procedure for efficient capture and representation of knowledge of weather concepts. Our evaluation of efficiency relied on the notion of “informative propositions per total task minute.” This depends on an anchor for informativeness.
In the “first generation” of expert systems, computer scientists found that it took a great deal of time to interview experts to create a knowledge base of domain concepts and their definitions (Cullen and Bryman 1988). This stemmed largely from their sole reliance on an analysis of documents and unstructured interviews. Hoffman (1987) compared a number of knowledge elicitation methods in terms of their yield of propositions that were “informative” in the sense that they were not already in a knowledge base derived from a preliminary analysis of documents (texts, procedural guides, etc.). Efficiency was calculated relative to “total task minute,” which included the time taken to prepare to run the procedure, the time taken by the procedure, and the time taken to generate a final representation. The determination was that knowledge elicitation procedures, especially unstructured interviews, often yield quite a bit less than one informative proposition per total task minute (Hoffman et al. 1995).
For the present study, we regarded a proposition as informative if it was useful in the concept maps about the main Gulf Coast weather phenomena. To gauge efficiency, we compared the concept mapping procedure with a number of other procedures that are now widely used in human factors engineering for knowledge elicitation and work domain analysis. With our NAVTRAMETOCFAC participants we conducted a number of “critical decision method” procedures (Hoffman et al. 1998), in which practitioners retrospect about previously encountered difficult cases. We also conducted “think aloud” problem-solving procedures, the “knowledge audit,” the “recent case walkthrough,” and work space and work patterns analyses. [For details on these methods, see Crandall et al. (2006) and Hoffman et al. (2002).]
The concept mapping procedure yielded about two informative propositions per total task minute. If one takes into account the fact that for the concept mapping procedure there is essentially no preparation time, and that the result of a well-conducted session is close to being the final product, it can safely be concluded that concept mapping is at least as efficient at generating models of conceptual knowledge as any other method of knowledge elicitation that is commonly used today in human factors engineering.3
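The efficiency metric itself is simple division. The sketch below illustrates it with hypothetical session numbers chosen only to echo the contrast described above (well under one proposition per total task minute for an unstructured interview versus about two for concept mapping); none of these figures are measurements from the project.

```python
def props_per_total_task_minute(informative_props: int, prep_min: float,
                                procedure_min: float, represent_min: float) -> float:
    """Informative propositions yielded per total task minute."""
    return informative_props / (prep_min + procedure_min + represent_min)

# Hypothetical unstructured interview: 30 informative propositions from a
# 60-min interview, plus 20 min of preparation and 45 min of transcription
# and representation work afterward.
print(props_per_total_task_minute(30, 20.0, 60.0, 45.0))   # 0.24

# Hypothetical concept mapping session: the concept map built during the
# session is itself the representation, so prep and post-session time ~ 0.
print(props_per_total_task_minute(90, 0.0, 45.0, 0.0))     # 2.0
```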
c. User reactions
The statement of work for the STORM-LK project did not extend to any empirical or experimental effort to evaluate any of the particular potential applications of the knowledge model—the primary purpose of the STORM-LK project was to demonstrate the procedure. Nevertheless, we took the initiative to make STORM-LK available to the three aerographers who arrived at the NAVTRAMETOCFAC shortly after the knowledge model was completed. They were invited to peruse the concept maps and the appended resources, and to render their judgments regarding STORM-LK's potential as a learning aid as they prepared for their qualification tests. They were asked not to discuss their reactions with one another. Over the subsequent 4 months we conducted 11 interviews with the aerographers.
As we expected, initial responses indicated skepticism and surprise (e.g., “Never seen anything like it.”). One of the aerographers had more forecasting experience, and had previously qualified at three other duty stations using the LFHs. This individual felt more comfortable using that traditional document with its standardized format and organization:
I do not like the format. . . . I want a straightforward table of contents. I do not like to read text off a computer screen. I always print out the stuff I think will be useful from the LFH. STORM makes it harder than the LFH to do this.

Even this aerographer, however, acknowledged that “the CD has some fantastic information. It is helpful to someone who is trying to qualify.”
The two (younger) aerographers who were of the “Web generation” found the concept maps interesting, almost like a video game, and regarded them as an invitation to follow the links and look at the resources. They found the digital videos with the experts' mini-tutorials to be especially memorable, which one of them demonstrated by recounting the expert's discussion of the key considerations in thunderstorm forecasting.
Subsequent to their having qualified, one of the aerographers reported that he relied heavily on the knowledge model. He was quite emphatic that exploration of the knowledge model helped in his learning effort. His reports also suggested that the model encouraged active learning:
The links to Internet sources were useful. Once I got used to the way the program was set up, I used it. . . . It discussed a lot of the local weather effects like topography and was linked in more than one way. For example we talk about ‘Interstate-10’ storms. They were discussed in a couple of different ways—in the material on convective weather and in the summer regime. It was discussed in more than one context. I did make heavy use of it. I think the others will make heavy use of it. Right now they probably are like I was at first—the initial ‘blinders’ or their expected paradigm. I'm through that now, yes. Oh yes. I had quite a few blinders. I wasn't expecting the CD to live up to my expectations, but now I'm beyond those blinders.
These subjective responses are suggestive of the need to adapt knowledge representations as appropriate to learners of differing degrees and kinds of experience.
The STORM-LK prototype demonstrates the feasibility of using the concept mapping approach to efficiently capture practitioner domain knowledge in models containing dozens of concept maps and hundreds of propositions. The model could easily have reached scores of concept maps, thousands of propositions, and hundreds of multimedia resources had we included other topics, such as model biases, ensembles, radar, etc.
6. Potential applications
Once captured, the knowledge must be put to some use. A likely application of knowledge models such as STORM-LK would be to knowledge management. The NAVTRAMETOCFAC was recently downgraded to a detachment, and the civilian forecasters took the opportunity to retire. Were it not for the knowledge we had gathered in our study, all of their experience and strategies would have been effectively lost to the METOC community—and that includes descriptions of domain knowledge and reasoning that were not discussed in the LFH.
Knowledge models such as STORM-LK might best be used as adjuncts to LFHs. An LFH in concept map form can serve as a living electronic document that can be continually refined and updated (Coffey and Hoffman 2003). Through CmapTools, one can always create new concept maps, move or change links or nodes, and so on.
Knowledge models such as STORM-LK might be a useful adjunct to existing efforts to support distance learning (as in Mostek et al. 2004). One of the strengths of concept maps is that they provide rich information at a glance, without some of the constraints imposed by the traditional linear prose format. Understanding local regimes as described in a LFH can require page turning, index searches, and so on. Work with a concept map knowledge model also contrasts with common experience at using typical Web pages, which encourage deepening at the expense of broadening (i.e., the “back” button is used most often, especially when one loses track of the browsing path). With the concept map interface, “getting lost in cyberspace” becomes less of an issue. Through the hyperlinking mechanism in STORM-LK, one can effectively get from anywhere in the knowledge model to anywhere else, in two clicks at most. While it is possible to design good Web pages using methods other than concept maps, it is also possible to make good Web pages using concept maps (Hoffman et al. 2005). (Examples can be found online at http://cmap.ihmc.us/Index.html.)
Another potential application of concept mapping involves its use in mentoring and student evaluation. The literature on educational applications shows that the concept maps made by students can be used to find gaps, weaknesses, and misconceptions in student knowledge (see Novak 1998). One of the procedures we conducted in the STORM-LK project was sociometry, a method that has been used to determine proficiency scales (see Stein 1997). In this procedure, participants within an organization are asked to rate each other's skill levels and discuss each other's reasoning strategies and procedures. We discovered in the sociometric interviews with the METOC forecasters and aerographers that personnel in this organization did not share their reasoning strategies or knowledge, with the exception of periodic training briefings and some on-the-job training. It may be useful to explore the possibility that the creation of concept maps might be a valuable component of mentoring and, perhaps, even a task to be included as a part of qualification examinations.
7. Methodological limitations
All of the procedures we conducted were limited in a number of respects, some of which were practical and situational, some of which were methodological. All of the procedures we conducted were intended as demonstrations of possibilities.
a. Proficiency scaling
With regard to proficiency scaling, for example, one might attempt to map skill scores onto operational definitions of levels of proficiency from the literature on expertise studies. We believe that at least two of our participants possessed deep conceptual understanding, and that a number of our participants were able to recognize anomalous situations and understood the limitations of numerical guidance. On the other hand, we also had some participants who seemed to fall into the category that Pliske et al. (2004) referred to as “disengaged proceduralists.” They would issue competent forecasts and demonstrate a knowledge of meteorology, but showed no signs of motivation to push themselves to acquire greater skill. There remains a need to improve the accuracy and efficiency of methods for identifying such individuals.
b. Model validation
Our attempt to verify the propositions contained in the concept maps is illustrative of how knowledge representations of this sort have been validated through peer consensus, but there are other ways of approaching the matter. Even though both the general utility and reliability of the concept mapping methodology are substantiated in the literature, there is still a need to tailor the procedure and the representations to the unique features of the domain of forecasting.
c. Eliciting and representing procedures
There are significant challenges and issues involved in eliciting and representing forecaster reasoning and procedures, in contrast with the domain content knowledge that was our focus. The CMP procedures we conducted resulted in many interesting reports that are suggestive of the ways in which forecasting is an art. We had participants who showed signs of being reflective and deliberative. One told of how he would use the feel of the air on his razor-irritated skin to sense the atmosphere. Another told of how he would go up to the airfield observation deck to feel the iron railings for cold clamminess, to help assure him of the correctness of his forecast of fog formation. In one of the case studies that we detailed using the critical decision method, the forecaster recounted how she predicted when cloud cover would lift to a level that would permit training flights: She would look out a window toward the downtown and watch to see how the ceiling lifted over time by counting the number of floors visible on a tall building. Such a heuristic would not be an acceptable addendum to a formal standard operating procedure document, but is indicative of the craft of forecasting.
Upon first sight, many people react to concept maps by saying that they are process or flow diagrams. As we have pointed out, the strength of concept mapping lies in the expression of knowledge of domain concepts. But since one of the things domain practitioners know about is their procedures, these are “fair game” for concept mapping, and many concept maps do depict processes and procedures. Although the methods we employed in our research did result in descriptions of reasoning strategies and heuristic rules (some of which were included in the knowledge model and its resources), it remains to be seen how concept mapping can be adapted to the purpose of capturing and representing forecasting processes and procedures.
d. Educational interventions
An outstanding need is for studies that use concept map knowledge models and concept mapping exercises in an educational intervention, comparing learning and performance against those of meteorology students who are given traditional instruction, with appropriate control for experimental demand characteristics (i.e., performance might improve because of the special treatment and not because of the nature of that treatment). Such studies in other domains have shown significant and lasting gains for concept mappers (see Novak 1998). Related to this, there is an outstanding need for long-term longitudinal studies of the development of forecasting expertise following education. Perhaps the most significant open question is whether knowledge representation activities of this kind will ultimately help accelerate the achievement of expertise, that is, contribute to improved forecaster performance.
Beyond concept mapping per se, further attempts to use methods of cognitive task analysis in the study of forecasting have the potential to refine the methodology of cognitive task analysis and enrich our understanding of what it means for a forecaster to be a motivated, adaptive expert.
Acknowledgments
The research was made possible through the support of the National Technology Alliance. The authors acknowledge the enthusiastic assistance provided by the administrative staff, and the dedicated participation of the operational personnel, of the Naval Training Oceanography and Meteorology Facility, Pensacola Naval Air Station. The researchers would especially like to thank Capt. Daniel J. Soper, AGCS Jerome J. McNulty, and AGC Jeffery S. Fulson. The authors also thank the consultants to the project: Joseph Novak (Florida Institute for Human and Machine Cognition), Kim Vicente (University of Toronto), Jim Richmond (U.S. Navy, Ret.), and William Clancey (Florida Institute for Human and Machine Cognition). The authors thank Mary Jo Carnot for her help in data collection and analysis, and all of the technical support provided by Alberto Cañas, Alan Ordway, and Jeff Yerkes at the Florida Institute for Human and Machine Cognition. Finally, the authors thank the three anonymous reviewers of this submission for their detailed comments and suggestions.
REFERENCES
Ausubel, D. P., 1960: The use of advance organizers in the learning and retention of meaningful verbal material. J. Educ. Psychol., 51, 267–272.
Ausubel, D. P., Novak J. D., and Hanesian H., 1978: Educational Psychology: A Cognitive View. 2d ed. Holt, Rinehart and Winston, 733 pp.
Ballas, J., 2006: Human centered computing for tactical weather forecasting: An example of the “moving target rule.” Expertise out of Context, R. Hoffman, Ed., Erlbaum, in press.
Bosart, L. F., 2003: Whither the weather analysis and forecasting process? Wea. Forecasting, 18, 520–529.
Briggs, G., Shamma D. A., Cañas A. J., Carff R., Scargle J., and Novak J. D., 2004: Concept Maps applied to Mars exploration public outreach. Concept Maps: Theory, Methodology, Technology: Proc. First Int. Conf. on Concept Mapping, Pamplona, Spain, Universidad Pública de Navarra, 109–116. [Available online at http://cmc.ihmc.us/CMC2004Programa.html.]
Brooking, A., 1999: Corporate Memory: Strategies for Knowledge Management. International Thomson Business Press, 181 pp.
Cañas, A. J., Coffey J. W., Carnot M. J., Feltovich P., Hoffman R., Feltovich J., and Novak J. D., 2003: A summary of literature pertaining to the use of concept mapping techniques and technologies for education and performance support. Report prepared for the Chief of Naval Education and Training, Florida Institute for Human and Machine Cognition, Pensacola, FL, 108 pp. [Available online at http://www.ihmc.us/users/acanas/Publications/ConceptMapLitReview/IHMC%20Literature%20Review%20on%20Concept%20Mapping.pdf.]
Cañas, A. J., and Coauthors, 2004: CmapTools: A knowledge modeling and sharing environment. Concept Maps: Theory, Methodology, Technology: Proc. First Int. Conf. on Concept Mapping, Pamplona, Spain, Universidad Pública de Navarra, 125–134. [Available online at http://cmc.ihmc.us/CMC2004Programa.html.]
Chase, W. G., and Simon H. A., 1973: Perception in chess. Cognit. Psychol., 5, 55–81.
Coffey, J. W., and Hoffman R. R., 2003: Knowledge modeling for the preservation of institutional memory. J. Knowl. Manage., 7, 38–52.
Coffey, J. W., Cañas A. J., Reichherzer T., Hill G., Suri N., Carff R., Mitrovich T., and Eberle D., 2003: Knowledge modeling and the creation of El-Tech: Performance support and training system for electronic technicians. Expert Syst. Appl., 25, 483–492.
Cooke, N. M., and McDonald J. E., 1987: The application of psychological scaling techniques to knowledge elicitation for knowledge-based systems. Int. J. Man Mach. Stud., 26, 533–550.
Crandall, B., Klein G. , and Hoffman R. R. , 2006: Working Minds: A Practitioner's Guide to Cognitive Task Analysis. MIT Press, in press.
Cullen, J., and Bryman A., 1988: The knowledge acquisition bottleneck: Time for a reassessment? Expert Syst., 5, 216–225.
Dodson, D. C., 1989: Interaction with knowledge systems through connection diagrams: Please adjust your diagrams. Research and Development in Expert Systems V, B. Kelly and A. L. Rector, Eds., Cambridge University Press, 35–46.
Dorsey, D. W., Campbell G. E. , Foster L. L. , and Miles D. E. , 1999: Assessing knowledge structures: Relations with experience and posttraining performance. Human Performance, 12 , 31–57.
Doswell, C. A., III, 2004: Weather forecasting by humans—Heuristics and decision making. Wea. Forecasting, 19, 1115–1126.
Ericsson, K. A., and Simon H. , 1984: Protocol Analysis: Verbal Reports as Data. The MIT Press, 443 pp.
Ericsson, K. A., Charness N. , Feltovich P. , and Hoffman R. R. , 2006: The Cambridge Handbook of Expertise and Expert Performance. Cambridge University Press, in press.
Feltovich, P. J., Ford K. M., and Hoffman R. R., 1997: Expertise in Context. The MIT Press, 590 pp.
Ford, K. M., Coffey J. W., Cañas A. J., Andrews E. J., and Turner C. W., 1996: Diagnosis and explanation by a nuclear cardiology expert system. Int. J. Expert Syst., 9, 499–506.
Gaines, B., and Shaw M., 1995: Concept Maps as hypermedia components. Int. J. Hum. Comput. Stud., 43, 323–361.
Gilbreth, F. B., 1911: Motion Study. Van Nostrand, 116 pp.
Glaser, R., 1987: Thoughts on expertise. Cognitive Functioning and Social Structure over the Life Course, C. Schooler and W. Schaie, Eds., Ablex, 81–94.
Glasgow, J., Narayanan N. H. , and Chandrasekaran B. , 1995: Diagrammatic Reasoning: Cognitive and Computational Perspectives. The MIT Press, 780 pp.
Gordon, J. L., 2000: Creating knowledge maps by exploiting dependent relationships. Knowl.-Based Syst., 13, 71–79.
Gordon, S. E., Schmierer K. A. , and Gill R. T. , 1993: Conceptual graph analysis: Knowledge acquisition for instructional systems design. Hum. Factors, 35 , 459–481.
Hanes, L. F., and Gross M. M. , 2002: Capturing valuable undocumented knowledge: Lessons learned at electric utility sites. Seventh Conf. on Human Factors and Power Plants, Scottsdale, AZ, IEEE, 6–25.
Hoffman, R. R., 1987: The problem of extracting the knowledge of experts from the perspective of experimental psychology. AI Mag., 8 (2), 53–67.
Hoffman, R. R., 1991: Human factors psychology in the support of forecasting: The design of advanced meteorological workstations. Wea. Forecasting, 6 , 98–110.
Hoffman, R. R., 1998: How can expertise be defined?: Implications of research from cognitive psychology. Exploring Expertise, R. Williams, W. Faulkner, and J. Fleck, Eds., Macmillan, 81–100.
Hoffman, R. R., and Conway J. A., 1990: Psychological factors in remote sensing: A review of recent research. Geocarto Int., 4, 3–22.
Hoffman, R. R., and Woods D. D., 2000: Studying cognitive systems in context. Hum. Factors, 42, 1–7.
Hoffman, R. R., and Hanes L. F., 2003: The boiled frog problem. IEEE Intell. Syst., 18 (4), 68–71.
Hoffman, R. R., Detweiler M. A. , Lipton K. , and Conway J. A. , 1993: Considerations in the use of color in meteorological displays. Wea. Forecasting, 8 , 505–518.
Hoffman, R. R., Shadbolt N. , Burton A. M. , and Klein G. , 1995: Eliciting knowledge from experts: A methodological analysis. Organ. Behav. Hum. Decis. Processes, 62 , 129–158.
Hoffman, R. R., Crandall B. , and Shadbolt N. , 1998: A case study in cognitive task analysis methodology: The critical decision method for the elicitation of expert knowledge. Hum. Factors, 40 , 254–276.
Hoffman, R. R., Coffey J. W. , and Ford K. M. , 2000: A case study in the research paradigm of Human-Centered Computing: Local expertise in weather forecasting. Rep. to the National Technology Alliance, Florida Institute for Human and Machine Cognition, Pensacola, FL, 827 pp.
Hoffman, R. R., Coffey J. W. , Carnot M. J. , and Novak J. D. , 2002: An empirical comparison of methods for eliciting and modeling expert knowledge. Proc. 46th Meeting of the Human Factors and Ergonomics Society, Baltimore, MD, Human Factors and Ergonomics Society, 482–486.
Hoffman, R. R., Coffey J. W. , Novak J. D. , and Cañas A. J. , 2005: Applications of concept maps to web design and web work. Handbook of Human Factors in Web Design, R. W. Proctor and K.-P. L. Vu, Eds., Erlbaum, 157–175.
Klein, G., 1992: Using knowledge engineering to preserve corporate memory. The Psychology of Expertise: Cognitive Research and Empirical AI, R. R. Hoffman, Ed., Erlbaum, 170–190.
Mandl, H., and Levin J. R. , 1989: Knowledge Acquisition from Text and Pictures. Elsevier, 329 pp.
McGraw, K., and Seale M. R., 1988: Knowledge elicitation with multiple experts: Considerations and techniques. Artificial Intell. Rev., 2, 31–44.
McNeese, M. D., Zaff B. S. , Brown C. E. , and Citera M. , 1993: Understanding the context of multidisciplinary design: Establishing ecological validity in the study of design problem solving. Proc. 37th Annual Meeting, Seattle, WA, Human Factors and Ergonomics Society, 1082–1086.
Mostek, A., and Coauthors, 2004: VISIT: Bringing training to Weather Service forecasters using a new distance-learning tool. Bull. Amer. Meteor. Soc., 85, 823–829.
Novak, J. D., 1998: Learning, Creating, and Using Knowledge. Erlbaum, 264 pp.
O'Dell, C., and Grayson C. J. , 1998: If We Only Knew What We Know: The Transfer of Internal Knowledge and Best Practice. The Free Press, 238 pp.
Pliske, R., Crandall B. , and Klein G. , 2004: Competence in weather forecasting. Psychological Investigations of Competent Decision Making, K. Smith, J. Shanteau, and P. Johnson, Eds., Cambridge University Press, 40–70.
Roebber, P. J., 1998: The regime dependence of degree day forecast technique, skill, and value. Wea. Forecasting, 13 , 783–794.
Roth, E. M., 2004: Applying cognitive engineering to meteorological forecasting: From analysis of expertise to human-centered design. Proc. 48th Annual Meeting, New Orleans, LA, Human Factors and Ergonomics Society, 305–310.
Scott, R., Roth E. M., Deutsch S. E., Malchiodi E., Kazmierczak T. E., Eggleston R. G., Kuper S. R., and Whittaker R. D., 2005: Work-centered support systems: A human-centered approach to intelligent system design. IEEE Intell. Syst., 20 (2), 73–81.
Stein, E. W., 1997: A look at expertise from a social perspective. Expertise in Context, P. J. Feltovich, K. M. Ford, and R. R. Hoffman, Eds., The MIT Press, 181–194.
Stephenson, D. B., 2000: Use of the “odds ratio” for diagnosing forecast skill. Wea. Forecasting, 15 , 221–232.
Sutcliffe, A. G., 1985: Use of conceptual maps as human-computer interfaces. People and Computers: Designing the Interface, P. Johnson and S. Cook, Eds., Cambridge University Press, 117–127.
Vekiri, I., 2002: What is the value of graphical displays in learning? Educ. Psychol. Rev., 14, 261–298.
Vicente, K., 1999: Cognitive Work Analysis: Toward Safe, Productive, and Healthy Computer-Based Work. Erlbaum, 392 pp.
Table 1. Example excerpts from the career interviews. Legend: “dash-1” is the DD175-1 form used for pilot weather briefings; PO, petty officer; OJT, on-the-job training; NAS, naval air station. Certain date, location, and ship information has been replaced by Xs to ensure anonymity and confidentiality.
1. Discussions of what is meant by “knowledge,” including distinctions among types of knowledge (e.g., procedural, declarative, tacit, etc.), can be found in Ericsson et al. (2006). Many complex issues are raised by the distinctions. Operational and conceptual definitions of expertise (and other levels of proficiency) can be found in Hoffman (1998).
2. The CmapTools suite has a number of functionalities that contribute to its ease of use (Cañas et al. 2004). A styles palette supports the customizing of fonts, lines, colors, shapes, and so on. An autolayout function allows the mapper to generate alternative arrangements of the concepts and links in a concept map. CmapTools also has a number of functionalities that contribute to its applications. An event recorder allows instructors to see how learners develop their concept maps over time. A search function looks at concept maps on a (public or private) server and finds other concept maps that contain propositions related to the concept map on which one is working. In a synchronous distance collaboration mode, people can create concept maps together over the World Wide Web. Discussion threads can be formed, as can a “knowledge soup” in which people can share and discuss their propositions. The concepts and propositions in concept maps can be used for narrowing searches on the Internet. A capability to tap into resources on the World Wide Web to assist in the creation of bilingual concept maps will soon be available. Also under development is a merger of CmapTools with Web-based ontologies and common logic, for applications in the Semantic Web. Readers are invited to download CmapTools and explore these capabilities.
3. The process of adding resources to the concept maps represents effort beyond the creation of the knowledge model itself. In the case of STORM-LK, resources were collected opportunistically throughout the entire project. For instance, in the work space analysis we learned that one of the civilian forecasters kept a file of imagery and other data on rare/difficult cases he had encountered. Whenever a potentially useful resource was referred to, it was captured digitally and notations were made in a resource management spreadsheet, indicating appropriate places to link each resource into the knowledge model once the concept maps were finalized. We also digitally captured figures and tables from the LFH. For details on resource management for projects of this kind, see Coffey and Hoffman (2003).